airsplay / lxmert

PyTorch code for EMNLP 2019 paper "LXMERT: Learning Cross-Modality Encoder Representations from Transformers".
MIT License

More than 5 captions per image for COCO data #63

j-min closed this issue 4 years ago

j-min commented 4 years ago

I thought the COCO dataset has 5 captions per image, but the *.json files show that some images have more than 5 captions. Is this normal?


To reproduce

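import json

# Load the LXMERT minival annotations (a list of per-image records).
with open('data/lxmert/mscoco_minival.json') as f:
    data = json.load(f)

# Print every image that has more than 5 MS COCO captions.
for datum in data:
    coco_sents = datum['sentf']['mscoco']
    if len(coco_sents) > 5:
        print(datum['img_id'])
        print(coco_sents)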
j-min commented 4 years ago

It seems that some images in the original COCO data have more than 5 captions. Closing the issue.
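
For anyone who wants to check this against the original annotations, here is a minimal sketch. It assumes the standard COCO caption annotation layout (a top-level 'annotations' list whose entries carry 'image_id' and 'caption'). The helper names and the file path in the usage note are illustrative only and are not part of the LXMERT code.

```python
import json
from collections import Counter


def caption_count_histogram(coco_captions):
    """Map 'captions per image' -> 'number of images with that many captions'.

    Assumes `coco_captions` is a parsed COCO caption annotation file, i.e. a
    dict with an 'annotations' list of {'image_id': ..., 'caption': ...}.
    """
    per_image = Counter(ann['image_id'] for ann in coco_captions['annotations'])
    return Counter(per_image.values())


def images_with_more_than(coco_captions, limit=5):
    """Return the sorted image ids that have more than `limit` captions."""
    per_image = Counter(ann['image_id'] for ann in coco_captions['annotations'])
    return sorted(img_id for img_id, n in per_image.items() if n > limit)


def load_captions(path):
    """Read a COCO caption annotation file from `path` (path is up to you)."""
    with open(path) as f:
        return json.load(f)
```

Usage would look like `caption_count_histogram(load_captions('annotations/captions_val2014.json'))`, where the path is an assumption about where the original COCO caption file is stored locally. Any key other than 5 in the result means the extra captions are already present in the source data.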