Thanks for your great work.
I have a question about the final_flickr_separateGT_train.json file you provide: a single caption can have several annotations for the same category, like this:
During training, will the dot product of the visual and BERT embeddings treat these as false-negative pairs?
@Haotian-Zhang @liunian-harold-li