zdaxie / PixPro

Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning, CVPR 2021
https://arxiv.org/abs/2011.10043
MIT License

What is max_bin_diag? #1

Closed WHlTE-N0lSE closed 3 years ago

WHlTE-N0lSE commented 3 years ago

Thanks for open-sourcing this. What does max_bin_diag do?

zdaxie commented 3 years ago

Thanks for your interest in our work. Regarding max_bin_diag, I believe you are referring to the variable here.

The two feature maps are computed from two different views, and the regions of the original image covered by these views differ in size. We therefore first map the coordinates of each feature map back to the original image coordinates, and then normalize the distance between feature-map bins by the diagonal length of a bin to decide which pairs are positive and which are negative. Because the two views give two different bin diagonals, the implementation uses the larger of the two, which is the max_bin_diag you mentioned. For details on how positive and negative samples are determined, please refer to the complete function and to Sec. 3.1 of the paper, subsection "Pixel Contrast".
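
To make the role of max_bin_diag concrete, here is a minimal pure-Python sketch of the idea for a single pair of views. It is not the repository's code: the helper names bin_centers and positive_mask, the (x0, y0, x1, y1) box layout, and the pos_ratio default of 0.7 are assumptions chosen for illustration.

```python
import math

def bin_centers(box, h, w):
    """Map the centers of an h x w feature-map grid back to image coordinates.

    box = (x0, y0, x1, y1) is the region of the original image covered by
    this view (an assumed layout). Returns the centers in row-major order
    and the diagonal length of one bin.
    """
    x0, y0, x1, y1 = box
    bin_w = (x1 - x0) / w
    bin_h = (y1 - y0) / h
    centers = [(x0 + (j + 0.5) * bin_w, y0 + (i + 0.5) * bin_h)
               for i in range(h) for j in range(w)]
    return centers, math.hypot(bin_w, bin_h)

def positive_mask(box_q, box_k, h, w, pos_ratio=0.7):
    """mask[i][j] is True when bin i of view q and bin j of view k are positives.

    pos_ratio is an illustrative threshold on the normalized distance.
    """
    centers_q, diag_q = bin_centers(box_q, h, w)
    centers_k, diag_k = bin_centers(box_k, h, w)
    # The two views have different bin sizes, so normalize by the larger diagonal.
    max_bin_diag = max(diag_q, diag_k)
    return [[math.hypot(qx - kx, qy - ky) / max_bin_diag < pos_ratio
             for (kx, ky) in centers_k]
            for (qx, qy) in centers_q]

# Two views of the same image: q covers a 4x4 region, k covers an 8x8 region.
mask = positive_mask((0, 0, 4, 4), (0, 0, 8, 8), h=2, w=2)
```

Normalizing by the larger diagonal expresses the threshold in units of the coarser view's bins, so a pair is judged at the resolution of whichever feature map is less precise.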