BryanPlummer / two_branch_networks

PyTorch implementation of "Learning Deep Structure-Preserving Image-Text Embeddings"
MIT License

About the loss function and the textual feature #2

Open JCZ404 opened 2 years ago

JCZ404 commented 2 years ago

Hi, thanks for your great work! I have two questions about the loss function in your code. First, the original paper says it uses a logistic regression loss, but your code seems to compute only a positive/negative pair loss between the sentence and the image. Second, which task does your code focus on? The original paper focuses on phrase grounding, but your code does not seem to handle the individual phrases in a caption and instead treats the caption as a whole. Could you explain this a bit?
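To make the distinction in this question concrete, here is a minimal sketch contrasting a margin-based loss over positive and negative pairs with a logistic loss on a single pair score. It is not taken from this repository or from the paper: the function names (`margin_ranking_loss`, `logistic_pair_loss`), the margin value, and the use of cosine distance on L2-normalized vectors are all assumptions chosen for illustration, and plain Python lists stand in for tensors.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def l2_normalize(v, eps=1e-12):
    """Scale v to unit length; eps guards against dividing by a zero norm."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / max(norm, eps) for a in v]


def margin_ranking_loss(anchor, positive, negative, margin=0.1):
    """Hinge loss over one (anchor, positive, negative) triplet.

    Assumes unit-length inputs, so 1 - dot is the cosine distance.
    The loss is zero once the positive is closer than the negative
    by at least `margin` (an illustrative value, not the paper's).
    """
    d_pos = 1.0 - dot(anchor, positive)
    d_neg = 1.0 - dot(anchor, negative)
    return max(0.0, margin + d_pos - d_neg)


def logistic_pair_loss(x, y, label):
    """Logistic loss on one pair; label is +1 (match) or -1 (mismatch)."""
    score = dot(x, y)
    return math.log1p(math.exp(-label * score))


# Toy 2-D "embeddings" (hypothetical values).
image = l2_normalize([1.0, 0.0])
matching_caption = l2_normalize([1.0, 0.1])
other_caption = l2_normalize([0.0, 1.0])

print(margin_ranking_loss(image, matching_caption, other_caption))
print(logistic_pair_loss(image, matching_caption, +1))
print(logistic_pair_loss(image, other_caption, -1))
```

The ranking loss only compares a matching pair against a non-matching one, while the logistic loss scores each pair independently against a label, which is the difference the question is pointing at.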

ZhanYang-nwpu commented 2 years ago

Hi. Thank you for the excellent work. Could I ask a question about your loss function? The code runs without errors, but the loss value is always NaN. Is there something wrong?