Closed evelynlee999 closed 1 year ago
Are you sure that you want self.c_contrastive to be a Parameter? It will also get the gradient of the loss, unless you set requires_grad=False in the constructor. Minimizing the loss will minimize self.c_contrastive, causing it to become negative.
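The failure mode described above can be reproduced with a minimal sketch (this is an illustrative toy, not the original repo's code; `JointLoss` and the constant dummy losses are assumptions). When the weight is a trainable `nn.Parameter`, its gradient is the contrastive loss itself, which is positive, so plain gradient descent drives the weight steadily downward:

```python
import torch
import torch.nn as nn

# Hypothetical minimal joint-loss module illustrating the failure mode:
# a trainable weight on a positive loss term receives a positive
# gradient, so minimizing the total loss pushes the weight negative.
class JointLoss(nn.Module):
    def __init__(self, learnable_weight=True):
        super().__init__()
        if learnable_weight:
            self.c_contrastive = nn.Parameter(torch.tensor(1.0))
        else:
            # fixed hyperparameter: a buffer is saved with the model
            # but excluded from the optimizer's parameter list
            self.register_buffer("c_contrastive", torch.tensor(1.0))

    def forward(self, main_loss, contrastive_loss):
        return main_loss + self.c_contrastive * contrastive_loss

# Demo: a constant positive "contrastive loss" is enough to expose it.
loss_fn = JointLoss(learnable_weight=True)
opt = torch.optim.SGD(loss_fn.parameters(), lr=0.5)
for _ in range(10):
    opt.zero_grad()
    total = loss_fn(torch.tensor(0.0), torch.tensor(1.0))
    total.backward()
    opt.step()

# d(total)/d(c_contrastive) = contrastive_loss = 1.0 every step, so
# after 10 SGD steps with lr 0.5 the weight is 1 - 10*0.5 = -4.0.
print(loss_fn.c_contrastive.item())
```

With `learnable_weight=False` the buffer receives no gradient and stays at its initial value, which is the hyperparameter behavior discussed below.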
Yeah, I want it to be a weight of the InfoNCE loss that participates in training to get the best result. I'll give it a try. Thanks a lot.
For the joint loss, I set the weights of the InfoNCE loss (and the other losses I used) as hyperparameters instead of parameters updated during optimization. In my opinion, setting them as hyperparameters is a good approach.
Yeah, I've found some code that uses it as a hyperparameter too. Thank you so much 🥰
Judging by your name you seem to be Chinese too! So I'll set that weight as a hyperparameter. About the InfoNCE loss I use later: could I treat it as a regularization term, so that the weight acts as a regularization coefficient instead? From the code I've been reading these past few days, regularization coefficients also seem to be hyperparameters.
That idea sounds feasible; you could try it and see how it performs. I think the loss design has to be tied closely to the task: look at what role the InfoNCE loss actually plays in yours. For example, mine is a few-shot classification problem where I use contrastive learning to obtain richer latent representations, so my loss is two InfoNCE terms plus a classification cross-entropy: total_loss = ce_loss + α·infonce1 + β·infonce2, where α and β are hyperparameters that you simply tune. In my experience, when multiple losses are optimized jointly, the setting of the weight hyperparameters has a big impact on model performance.
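The joint loss above can be sketched in a few lines (the names `ALPHA`, `BETA`, and `total_loss` are illustrative, not from any repo; the weights are fixed hyperparameters, so the optimizer cannot drive them negative):

```python
# Hyperparameters chosen by tuning on validation data, not learned.
ALPHA, BETA = 0.5, 0.1  # illustrative values, not from the original code

def total_loss(ce_loss, infonce1, infonce2, alpha=ALPHA, beta=BETA):
    # total = ce_loss + alpha * infonce1 + beta * infonce2
    return ce_loss + alpha * infonce1 + beta * infonce2

# Usage with dummy scalar losses (works the same with torch tensors,
# since only + and * are used):
print(total_loss(2.0, 1.0, 4.0))  # ≈ 2.0 + 0.5*1.0 + 0.1*4.0 = 2.9
```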
Got it, thank you!
Hi,
I tried to optimize the code by combining an InfoNCE loss with an AAM loss. My InfoNCE code, based on yours, is as follows:
The joint AAM + InfoNCE loss is as follows:
The smaller the loss, the better the performance should be. But when I ran the code, c_contrastive always became negative, which would mean a larger loss gives better performance. So I wonder whether my InfoNCE code is wrong.
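If the weight should stay learnable, one common workaround (an assumption on my part, not something from the original code; `PositiveWeight` is a hypothetical helper) is to learn an unconstrained raw value and map it through softplus, so the effective weight stays strictly positive no matter how the optimizer moves the raw parameter:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PositiveWeight(nn.Module):
    """Learnable scalar weight constrained to (0, inf) via softplus."""

    def __init__(self, init=1.0):
        super().__init__()
        # Invert softplus so the effective weight starts at `init`:
        # softplus(log(exp(init) - 1)) == init.
        raw = torch.log(torch.expm1(torch.tensor(float(init))))
        self.raw_weight = nn.Parameter(raw)

    def forward(self):
        return F.softplus(self.raw_weight)

w = PositiveWeight(init=1.0)
# Even a huge downward move of the raw parameter cannot make the
# effective weight negative; it just approaches zero.
with torch.no_grad():
    w.raw_weight -= 100.0
print(w().item())  # tiny but strictly positive
```

The trade-off remains: if nothing else stops it, minimizing the total loss will still drive this weight toward zero, so a fixed hyperparameter (as suggested above) is usually the simpler choice.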
I've been stuck on this for a long time. Really looking forward to your reply :)