Open bfan opened 6 years ago
Sorry, I found the answer to my first question in a closed issue. However, I am still confused about the second question. Does the PyTorch loss function implement only the unweighted maximum likelihood when class_num=1? If not, could you show me where w_ij appears in the code?
I think that in the paper the returned loss is weighted by w_ij, which is calculated by Equation (2).
The loss is divided by |S| to average it, since there are |S| pairs of codes.
Thanks for your reply. Maybe I didn't ask the question clearly. I want to know whether the loss implemented in the PyTorch code (loss.py) is exactly the one defined in Equation (2), or just a simplified version with w_ij=1.
We are still fixing the weight bug in the PyTorch version, so for now the PyTorch code only uses w_ij=1. There are also some differences in parameters between the Caffe and PyTorch versions.
@bfan @caozhangjie I added the weight to the PyTorch version (without c).
    import torch
    from torch.autograd import Variable

    def pairwise_loss(outputs1, outputs2, label1, label2):
        # similarity[i][j] is 1 if samples i and j share at least one label, else 0
        similarity = Variable(torch.mm(label1.data.float(), label2.data.float().t()) > 0).float()
        dot_product = torch.mm(outputs1, outputs2.t())
        #exp_product = torch.exp(dot_product)
        mask_positive = similarity.data > 0
        mask_negative = similarity.data <= 0
        # log(1 + exp(dot_product)) - similarity * dot_product, written in an overflow-safe form
        exp_loss = torch.log(1+torch.exp(-torch.abs(dot_product))) + torch.max(dot_product, Variable(torch.FloatTensor([0.]).cuda()))-similarity * dot_product
        #weight
        S1 = torch.sum(mask_positive.float())  # number of similar pairs
        S0 = torch.sum(mask_negative.float())  # number of dissimilar pairs
        S = S0+S1
        exp_loss[similarity.data > 0] = exp_loss[similarity.data > 0] * (S / S1)
        exp_loss[similarity.data <= 0] = exp_loss[similarity.data <= 0] * (S / S0)
        loss = torch.sum(exp_loss) / S
        #exp_loss = torch.sum(torch.log(1 + exp_product) - similarity * dot_product)
        return loss
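For readers following along, here is what the snippet above appears to compute, read directly off the code rather than taken from the paper. The symbols are my own notation: h_i is row i of outputs1, h_j is row j of outputs2, s_ij is the entry of similarity, and |S_1|, |S_0|, |S| correspond to S1, S0, S. The factor c mentioned in the comment is left out, as the poster says.

```latex
\begin{aligned}
L &= \frac{1}{|S|} \sum_{i,j} w_{ij}
     \Big( \log\big(1 + \exp(\langle h_i, h_j \rangle)\big) - s_{ij}\, \langle h_i, h_j \rangle \Big), \\
w_{ij} &=
  \begin{cases}
    |S| / |S_1|, & s_{ij} = 1, \\
    |S| / |S_0|, & s_{ij} = 0.
  \end{cases}
\end{aligned}
```

Since |S| cancels, this is the mean per-pair loss over the similar pairs plus the mean per-pair loss over the dissimilar pairs, so each group contributes equally regardless of how unbalanced the batch is.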
Thank you for your help. @soon-will
Hi, does this also work for the ImageNet dataset? @soon-will @caozhangjie @bfan
I'm confused about this loss function. What is the principle behind exp_loss?
exp_loss = torch.log(1+torch.exp(-torch.abs(dot_product))) + torch.max(dot_product, Variable(torch.FloatTensor([0.]).cuda()))-similarity * dot_product
Can you help me? Thank you!
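One way to read the exp_loss line: the first two terms are an overflow-safe way of writing log(1 + exp(x)), using the identity log(1 + exp(x)) = log(1 + exp(-|x|)) + max(x, 0), with x being dot_product. The commented-out line in the function computes the same quantity directly. The sketch below checks this on plain floats; the function names are hypothetical and only for illustration, not part of the repository.

```python
import math

def softplus_naive(x):
    # Direct form: log(1 + exp(x)). math.exp overflows for large x.
    return math.log(1.0 + math.exp(x))

def softplus_stable(x):
    # Same value, but exp is only ever applied to a non-positive number.
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)

def pair_loss(dot, s):
    # Per-pair loss from the thread: log(1 + exp(dot)) - s * dot,
    # where s is 1 for a similar pair and 0 for a dissimilar pair.
    return softplus_stable(dot) - s * dot
```

For moderate inputs the two forms agree to floating-point precision. For a large dot product such as 1000.0 the naive form raises OverflowError, while the stable form simply returns 1000.0.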
Hi, I have difficulty understanding the pairwise loss in your PyTorch code. In particular,
I cannot relate it to Equation (4) in the paper. What is the meaning of the parameter "l_threshold" in your code?
The returned loss in the code seems to be weighted by the 1/w_ij defined in the paper, i.e., Equation (2), since the loss is finally divided by |S|. Could you explain this point?