facebookresearch / SentEval

A python tool for evaluating the quality of sentence embeddings.

ImageCaptionRetrieval - unable to reproduce results in paper #19

Closed by lajanugen 7 years ago

lajanugen commented 7 years ago

I am unable to reproduce the results in the paper for the ImageCaptionRetrieval task.

I tried the scripts skipthought.py and infersent.py in the examples folder, with the appropriate paths set. I get very poor test scores, close to random. Here's the output for the InferSent model:

Test scores | Image to text: 0.08, 0.34, 0.84, 1058.8
Test scores | Text to image: 0.076, 0.372, 0.848, 509.0
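As a rough sanity check on "close to random", here is a minimal sketch of what a chance-level ranker would score. It assumes the four numbers on each line are Recall@1, Recall@5, Recall@10 (in percent) and the median rank, and that each test fold has 1,000 images with 5 captions each; neither is stated in this thread. The helper names `retrieval_metrics` and `random_ranks` are made up for illustration and are not part of SentEval.

```python
import random
import statistics


def retrieval_metrics(ranks):
    """Summarise 1-based ranks of the best correct item per query.

    Returns Recall@1, Recall@5, Recall@10 as percentages, then the median rank.
    """
    n = len(ranks)

    def recall_at(k):
        # Share of queries whose best correct item is ranked within the top k.
        return 100.0 * sum(r <= k for r in ranks) / n

    return recall_at(1), recall_at(5), recall_at(10), statistics.median(ranks)


def random_ranks(n_queries, n_candidates, n_correct, seed=0):
    """Simulate a model that orders the candidates uniformly at random."""
    rng = random.Random(seed)
    # Only the best (lowest) position of any correct item counts per query.
    return [
        min(rng.sample(range(1, n_candidates + 1), n_correct))
        for _ in range(n_queries)
    ]


# Assumed fold sizes (hypothetical): 1,000 images, 5 captions per image.
image_to_text = retrieval_metrics(random_ranks(1000, 5000, 5))
text_to_image = retrieval_metrics(random_ranks(5000, 1000, 1))
print("Chance | Image to text:", image_to_text)
print("Chance | Text to image:", text_to_image)
```

Under these assumptions a random ranker gets a text-to-image Recall@1 of about 1/1000 = 0.1% and a median rank near 500, which is in line with the 0.076 and 509.0 reported above.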

aconneau commented 7 years ago

Hi, thanks for noticing this. That's a bug, probably introduced by the recent change to the new PyTorch version (likely in this diff: https://github.com/facebookresearch/SentEval/commit/91f82751add3fea2cafc8afc16dc45ef72127850#diff-c52a53e1b92c251937ead5b49999574fR49). I'll look into it ASAP.

lajanugen commented 7 years ago

I tried reverting to the old version from the diff and running with pytorch 0.12. No luck there; I got a similar result.

aconneau commented 7 years ago

Sorry for taking so long to fix this. The bug was introduced in https://github.com/facebookresearch/SentEval/commit/89b8ae52d7695aee9b0e730cdd196a5deb1ada6c and is now fixed.

lajanugen commented 7 years ago

Thank you for the fix!