SKBL5694 opened this issue 1 year ago
Actually, this is a training trick I personally often adopt to balance two task losses. In my very early comparative experiments, this trick showed a slight improvement. For more details, I recommend this answer: "How to balance multiple losses in deep learning?" (hzwer's answer on Zhihu) https://www.zhihu.com/question/375794498/answer/2292320194
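For reference, a minimal sketch of the kind of detached-ratio balancing being described (the generic form only; the names `loss_a` / `loss_b` and the exact formula are illustrative assumptions, not the repo's code):

```python
import torch

def balance_two_losses(loss_a: torch.Tensor, loss_b: torch.Tensor) -> torch.Tensor:
    # Rescale loss_b so its magnitude matches loss_a. The factor is detached,
    # so it behaves as a constant during backpropagation: only the gradient
    # scale of loss_b changes, not its direction.
    scale = (loss_a / loss_b).detach()
    return loss_a + loss_b * scale  # numerically this always equals 2 * loss_a
```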
https://github.com/xinyu1205/recognize-anything/blob/fd2ab877e245e8e571af7b2d6048d1a9d40a6408/ram/models/tag2text.py#L226

I think the value of this loss is always equal to 2 * loss_t2t, which makes the result look unrelated to loss_tag. What am I misunderstanding? I do see that, since "(loss_tag/loss_t2t).detach()" does not participate in backward, it can be regarded as a constant; but this constant is not fixed, it is recomputed at every step, which is exactly why the combined value ends up depending only on loss_t2t. Is this what we want? It seems you want to rescale the two losses dynamically, but can this formulation really achieve that? I hope to get your reply; this is very confusing to me.
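To make the observation concrete, here is a small self-contained check (assuming the linked line combines the losses roughly as `loss = loss_t2t + loss_tag / (loss_tag / loss_t2t).detach()`; the stand-in parameters and losses below are made up purely for illustration):

```python
import torch

w1 = torch.tensor(2.0, requires_grad=True)
w2 = torch.tensor(3.0, requires_grad=True)

# Stand-ins for the two task losses, each depending on its own parameter.
loss_t2t = w1 ** 2          # value 4.0
loss_tag = 10.0 * w2 ** 2   # value 90.0, deliberately much larger

loss = loss_t2t + loss_tag / (loss_tag / loss_t2t).detach()
print(loss.item(), (2 * loss_t2t).item())  # both 8.0: the value is always 2 * loss_t2t

loss.backward()
# Gradients: loss_t2t contributes dloss/dw1 = 2 * w1 = 4.0, while loss_tag's
# gradient is rescaled by the detached factor loss_t2t / loss_tag = 4/90,
# so dloss/dw2 = (4/90) * 20 * w2 ≈ 2.67, i.e. it is not zero.
print(w1.grad.item(), w2.grad.item())
```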