About the loss function and the textual feature

Hi, thanks for your great work! I have some questions about the loss function in your code. First, the original paper says it uses a logistic regression loss, but your code seems to compute only a positive- and negative-pair loss between the sentence and the image. Second, I wonder which task your code focuses on. The original paper focuses on phrase grounding, but your code doesn't seem to handle the phrases in the caption and instead treats the caption as a whole. Could you explain this a bit?
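To make the first question concrete, here is a minimal sketch of the two kinds of loss I am contrasting. It is only an illustration and is not taken from the paper or from this repository: the function names, the +1/-1 label convention, and the margin value are all my own assumptions.

```python
import math

def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def logistic_loss(scores, labels):
    """Mean logistic loss log(1 + exp(-y * s)).

    Assumes each image-text pair has a scalar score s and a label y
    that is +1 for a matching pair and -1 for a non-matching pair.
    """
    losses = [softplus(-y * s) for s, y in zip(scores, labels)]
    return sum(losses) / len(losses)

def margin_ranking_loss(pos_scores, neg_scores, margin=0.1):
    """Mean hinge loss max(0, margin - s_pos + s_neg).

    Assumes pos_scores[i] and neg_scores[i] share the same anchor,
    so each positive pair is compared against one negative pair.
    The margin of 0.1 is a placeholder, not a value from the paper.
    """
    losses = [max(0.0, margin - p + n) for p, n in zip(pos_scores, neg_scores)]
    return sum(losses) / len(losses)
```

The first function scores each pair independently against its label, while the second only compares a positive pair with a negative pair that shares the same anchor. My question is which of these the code is meant to implement.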

BryanPlummer / two_branch_networks
