Closed mmmmmmrluo closed 3 years ago
Hi.
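1) The positive samples are in the 0th index of the logits, so labels is just a list of all 0s. Each label is the class index of the positive, not a 0/1 indicator.
2) The vectors are first normalized and then the dot product is taken, which gives the cosine of the angle between them, i.e. the cosine similarity.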
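Both points can be checked on a toy example. The following is a minimal sketch in plain Python rather than the repository's actual code; the function name `info_nce_single`, the temperature of 0.1, and the sample vectors are assumptions chosen for illustration.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def info_nce_single(query, positive_key, negative_keys, temperature=0.1):
    """InfoNCE-style loss for one query (hypothetical helper).

    The positive logit is placed at index 0, so the target class is 0.
    """
    q = normalize(query)
    p = normalize(positive_key)
    negs = [normalize(n) for n in negative_keys]
    # After normalization the dot product equals the cosine similarity.
    logits = [dot(q, p)] + [dot(q, n) for n in negs]
    logits = [value / temperature for value in logits]
    label = 0  # class index of the positive, not a 0/1 indicator
    # Cross entropy = -log softmax(logits)[label], computed stably.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(value - m) for value in logits))
    return log_sum_exp - logits[label], logits


# The positive key points the same way as the query; the negatives do not.
loss, logits = info_nce_single(
    [1.0, 0.0], [2.0, 0.0], [[0.0, 3.0], [-1.0, 0.0]]
)
```

Here the positive key has a different length than the query but the same direction, so its normalized dot product is 1 and the loss is close to zero.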
Hi. Thanks for your answer, I understand now. I wonder whether it is feasible to replace cosine similarity with another distance or similarity measure. Looking forward to your reply.
------------------ Original message ------------------ From: "RElbers/info-nce-pytorch" @.>; Sent: Thursday, August 19, 2021, 3:49 AM @.>; @.**@.>; Subject: Re: [RElbers/info-nce-pytorch] i have one question for the part of code (#2)
In theory that should be possible. You just need a measure that gives low values for positive pairs and high values for negative pairs.
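One way to read this: a distance is low for similar pairs, and negating it turns it into a score that can be used as a logit. The following is a minimal sketch in plain Python, assuming squared Euclidean distance as the replacement measure; `neg_sq_euclidean` and `contrastive_loss` are hypothetical names, not part of the library.

```python
import math


def neg_sq_euclidean(a, b):
    """Negated squared Euclidean distance: higher means closer."""
    return -sum((x - y) ** 2 for x, y in zip(a, b))


def contrastive_loss(query, positive_key, negative_keys, similarity,
                     temperature=1.0):
    """Cross entropy over [positive, negatives] with the positive at index 0."""
    logits = [similarity(query, positive_key)]
    logits += [similarity(query, n) for n in negative_keys]
    logits = [value / temperature for value in logits]
    # Stable log-sum-exp, then -log softmax(logits)[0].
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(value - m) for value in logits))
    return log_sum_exp - logits[0]


query = [0.0, 0.0]
# Positive key near the query, negative far away: small loss.
close_loss = contrastive_loss(query, [0.1, 0.0], [[2.0, 2.0]], neg_sq_euclidean)
# Roles swapped, positive far away and negative near: large loss.
far_loss = contrastive_loss(query, [2.0, 2.0], [[0.1, 0.0]], neg_sq_euclidean)
```

Any function with the same call shape could be passed as `similarity`, as long as larger outputs mean "more alike" by the time they reach the softmax.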
Hi. The InfoNCE loss is computed as log-softmax followed by NLL loss, i.e. nn.CrossEntropyLoss(), and it is a positive value. To minimize the loss, shouldn't we maximize the softmax output? That is, shouldn't the softmax numerator (the positive sample pairs) become larger and the negative sample pairs become smaller? Isn't that contrary to our intention? I don't understand this point and hope you can give me some advice.
Is it because we want to maximize the mutual information between positive sample pairs, so we need to maximize the density ratio, and since the dot product in the numerator is proportional to the density ratio, we need to maximize that dot product?
Sorry, what I said in my previous comment was wrong. We want high values (similarity) between positive pairs and low values for negative pairs. To optimize this, we can simply use the categorical cross entropy.
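To see why cross entropy does this, it helps to write the loss out for a single query. The notation is introduced here for illustration and is not from the thread: $s^{+}$ is the positive-pair similarity, $s^{-}_{j}$ are the negative-pair similarities, and $\tau$ is a temperature. With the positive at index 0, the loss is:

```latex
\mathcal{L}
  = -\log \frac{\exp(s^{+}/\tau)}
               {\exp(s^{+}/\tau) + \sum_{j} \exp(s^{-}_{j}/\tau)}
  = \log\!\Bigl(1 + \sum_{j} \exp\bigl((s^{-}_{j} - s^{+})/\tau\bigr)\Bigr)
```

The second form shows that the loss is always positive and shrinks toward zero as $s^{+}$ grows or as any $s^{-}_{j}$ falls, so minimizing it pushes positive pairs together and negative pairs apart.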
Thank you very much, I understand. In this case, the normalized inner product of the vectors is the cosine similarity, and the larger the inner product, the higher the similarity, which makes logical sense.
if negative_keys is not None:
    # Explicit negative keys
1) Why are the labels all zero? Shouldn't the positive sample pairs be labeled 1? 2) Is this cosine similarity? Isn't it just an inner product?