Closed vinsis closed 4 years ago
Hi, @vinsis ,
In the ImageNet experiment, there is a linear projection layer between the representation and the contrastive loss.
You are absolutely right! Maximizing MI is not the same as forcing similar directions. But here is how I think about it: you are maximizing mutual information between the representations before the projection. The projection acts like a reparameterization, and it works in such a way that the inner product can estimate MI (though the estimate is likely biased).
I am sorry, I am not sure I understand what you mean by a linear projection layer between the representation and the contrastive loss. Do you mean something like the below in the model specification?
```python
self.fc8 = nn.Sequential(
    nn.Linear(4096 // 2, feat_dim)
)
```
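For context, here is a minimal sketch of how such a projection head feeds the dot-product critic. All names here are assumptions for illustration, not the repo's exact code, and I am assuming the projected features are L2-normalized before being compared:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical sketch: project each view's features with a linear head,
# L2-normalize, and use the dot product of projected vectors as the score.
feat_dim = 128
proj = nn.Sequential(nn.Linear(4096 // 2, feat_dim))

v1 = torch.randn(8, 4096 // 2)  # view-1 features for a batch of 8
v2 = torch.randn(8, 4096 // 2)  # view-2 features for the same images

z1 = F.normalize(proj(v1), dim=1)
z2 = F.normalize(proj(v2), dim=1)

scores = z1 @ z2.t()  # scores[i, j] = dot product of z1_i and z2_j
# diagonal entries come from the same image ("real" pairs),
# off-diagonal entries are "fake" pairs
```

With unit-normalized vectors the dot product is just cosine similarity, which is why the critic can be read as pushing the two views toward similar directions.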
One more thing I noticed: NCE learns a distribution by learning to classify samples as real or fake. In this case, a "sample" is the dot product between `v1_i` and `v2_j`. In other words, the classifier is trying to learn which dot products are real (i.e., come from the same image) and which are fake. We could extend this to any inner product space, not just the standard dot product, and possibly get more diverse representations while preserving high MI.
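As one concrete instance of a different inner product, a learned bilinear form ⟨v1, v2⟩_W = v1ᵀ W v2 with W = AᵀA is still a valid inner product whenever A has full rank (W is then symmetric positive definite). A hypothetical sketch, not from the repo:

```python
import torch
import torch.nn as nn

# Hypothetical example of a different inner product: a bilinear form
# <v1, v2>_W = v1^T W v2 with W = A^T A (symmetric positive semidefinite,
# and an inner product whenever A has full rank).
feat_dim = 128
A = nn.Parameter(torch.randn(feat_dim, feat_dim))

def bilinear_score(v1, v2, A):
    W = A.t() @ A                    # learned symmetric weight matrix
    return (v1 @ W * v2).sum(dim=1)  # one score per (v1_i, v2_i) pair

v1 = torch.randn(8, feat_dim)
v2 = torch.randn(8, feat_dim)
s = bilinear_score(v1, v2, A)        # shape (8,)
```

Since A is learned, the model can score pairs in a transformed space rather than the raw representation space, which is one way the representations could stay high-MI without being forced into similar raw directions.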
I originally meant this one. But yours is also a good example.
I don't know which inner product you want to try, but I agree there should be more possibilities.
Should I close this?
Thanks again @HobbitLong. Closing it now.
Hi, it seems that you are using the dot product between vectors from the two views as a proxy for the unknown distribution denoted as p_d in your paper. In other words, your h_θ is the dot product. Theoretically any h_θ can work, so that is all fine.
But doesn't it force the two representations to be similar? I understand the two representations should have high mutual information, but that is not the same as the two vectors pointing in similar directions.
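To make the concern concrete, with h_θ as the dot product the contrastive objective looks roughly like the InfoNCE-style sketch below (hypothetical, not the repo's `NCEAverage` implementation; the temperature value is an assumption):

```python
import torch
import torch.nn.functional as F

def contrastive_loss(z1, z2, tau=0.07):
    # With h_theta as the dot product, the classifier distinguishes
    # the "real" pair (diagonal) from the "fake" pairs (off-diagonal).
    logits = z1 @ z2.t() / tau            # (N, N) pairwise scores
    labels = torch.arange(z1.size(0))     # positive pair on the diagonal
    return F.cross_entropy(logits, labels)

z1 = F.normalize(torch.randn(16, 128), dim=1)
z2 = F.normalize(torch.randn(16, 128), dim=1)
loss = contrastive_loss(z1, z2)
```

Minimizing this loss pushes the diagonal dot products up relative to the rest, which is exactly the "similar directions" pressure in question.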
Obviously it worked out pretty well. But do you think a parameterized `NCEAverage` loss would have allowed for representations with less similar directions but still high MI? Thank you again!