lucidrains / pixel-level-contrastive-learning

Implementation of Pixel-level Contrastive Learning, proposed in the paper "Propagate Yourself", in Pytorch
MIT License
252 stars 29 forks source link

The distance_matrix and similarity_matrix may not match #5

Closed lygsbw closed 3 years ago

lygsbw commented 3 years ago

Hi,

Thanks for your implementation.

As shown in the figure, we think the distance_matrix and similarity_matrix may not match each other in row and column.

12461608260848_ pic_hd

An example:

When we calculated the distance_matrix and similarity_matrix using random input and made a print as follow:

print(pdist(proj_coors_one_expanded[0].unsqueeze(0), proj_coors_two_expanded[0].unsqueeze(0))) print(pdist(proj_coors_one_expanded[1].unsqueeze(0), proj_coors_two_expanded[0].unsqueeze(0))) print(distance_matrix)

print(F.cosine_similarity(proj_pixel_one[..., 0, None],proj_pixel_two[..., None, 0])) print(F.cosine_similarity(proj_pixel_one[..., 1, None],proj_pixel_two[..., None, 0])) print(similarity_one_two)

We got:

tensor([17.1172]) tensor([85.7030]) tensor([[ 17.1172, 85.7030, 69.0290, 108.7060], [ 71.0634, 24.0416, 97.5807, 71.0634], [ 84.0238, 118.7939, 2.8284, 84.0238], [108.7060, 85.7030, 69.0290, 17.1172]])

tensor([[1.]]) tensor([[-1.0000]]) tensor([[[ 1.0000, 1.0000, 1.0000, 1.0000], [-1.0000, -1.0000, -1.0000, -1.0000], [ 1.0000, 1.0000, 1.0000, 1.0000], [ 1.0000, 1.0000, 1.0000, 1.0000]]])

Is there a problem?

lucidrains commented 3 years ago

@lygsbw Hi! Why would the distance and similarity matrix need to match each other? One is for determining which 'pixels' match between the two augmented cutouts, the other is for measuring the pixel similarity for calculating the loss

haohang96 commented 3 years ago

@lygsbw Hi! Why would the distance and similarity matrix need to match each other? One is for determining which 'pixels' match between the two augmented cutouts, the other is for measuring the pixel similarity for calculating the loss

I think that because we compute the loss_pix_one_two with similarity_one_two.masked_select(positive_mask_one_two[None, ...]), and the positive_mask_one_two is determined by distance_matrix.

lucidrains commented 3 years ago

@haohang96 yup, it's because of the pixel contrastive loss, we have the positive pixel pairs in the numerator

lucidrains commented 3 years ago

the two matrices are independent of each other, one is for determining the distance of the relative coordinates of each cutout, to then select the positive pixel pairs of the corresponding similarity matrix

maybe i misunderstood something

lucidrains commented 3 years ago

if it's an issue with language barrier, perhaps you could submit a PR to show your thoughts of how it should be in code?

haohang96 commented 3 years ago

the two matrices are independent of each other, one is for determining the distance of the relative coordinates of each cutout, to then select the positive pixel pairs of the corresponding similarity matrix

maybe i misunderstood something

I think the two matrix positive_mask_one_two and similarity_one_two are not independent. Because we need to compute loss_pix_one_two by similarity_one_two.masked_select(positive_mask_one_two). So the matrix format of positive_mask_one_two and similarity_one_two shall match each other. In addition, postive_mask_one_two is computed by positive_mask_one_two = distance_matrix < distance_thres. So the matrix format between distance_matrix and similarity shall match each other too.

I think positive_mask_one_two = distance_matrix < self.distance_thres shall be positive_mask_one_two = (distance_matrix < self.distance_thres).t()

lucidrains commented 3 years ago

the two matrices are independent of each other, one is for determining the distance of the relative coordinates of each cutout, to then select the positive pixel pairs of the corresponding similarity matrix maybe i misunderstood something

I think the two matrix positive_mask_one_two and similarity_one_two are not independent. Because we need to compute loss_pix_one_two by similarity_one_two.masked_select(positive_mask_one_two). So the matrix format of positive_mask_one_two and similarity_one_two shall match each other. In addition, postive_mask_one_two is computed by positive_mask_one_two = distance_matrix < distance_thres. So the matrix format between distance_matrix and similarity shall match each other too.

I think positive_mask_one_two = distance_matrix < self.distance_thres shall be positive_mask_one_two = (distance_matrix < self.distance_thres).t()

oh yes, I understand you now! your last line of code made everything click

I have put in a fix in 0.0.11https://github.com/lucidrains/pixel-level-contrastive-learning/commit/9ccf7e640d4ca418d2b63a67d07741f4bbb318f4

haohang96 commented 3 years ago

the two matrices are independent of each other, one is for determining the distance of the relative coordinates of each cutout, to then select the positive pixel pairs of the corresponding similarity matrix maybe i misunderstood something

I think the two matrix positive_mask_one_two and similarity_one_two are not independent. Because we need to compute loss_pix_one_two by similarity_one_two.masked_select(positive_mask_one_two). So the matrix format of positive_mask_one_two and similarity_one_two shall match each other. In addition, postive_mask_one_two is computed by positive_mask_one_two = distance_matrix < distance_thres. So the matrix format between distance_matrix and similarity shall match each other too. I think positive_mask_one_two = distance_matrix < self.distance_thres shall be positive_mask_one_two = (distance_matrix < self.distance_thres).t()

oh yes, I understand you now! your last line of code made everything click

I have put in a fix in 0.0.119ccf7e6

I am sorry that my English is too poor that make you misunderstand my point before. Thanks for your patient reply!

lucidrains commented 3 years ago

@haohang96 thank you for your persistence!