krasserm / fairseq-image-captioning

Transformer-based image captioning extension for pytorch/fairseq
Apache License 2.0

Query regarding COCO dataset #18

Closed Ashwin-Ramesh2607 closed 4 years ago

Ashwin-Ramesh2607 commented 4 years ago

Hey, my question is about how the COCO dataset is used during training. The COCO homepage states that every image has at most 5 captions. When training with this repo, which caption is used? How do you handle the fact that there are multiple captions per image?

krasserm commented 4 years ago

A given image appears 5 times in the training set, once per caption, so each (image, caption) pair is a separate training example. See tokenize_captions.py for details.
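A minimal sketch of what this means in practice, assuming a COCO-style `{image_id: [captions]}` mapping (names here are illustrative; see tokenize_captions.py for the repo's actual implementation):

```python
def flatten_annotations(annotations):
    """Turn {image_id: [captions]} into a flat list of (image_id, caption)
    pairs, so an image with N captions yields N training examples."""
    examples = []
    for image_id, captions in annotations.items():
        for caption in captions:
            examples.append((image_id, caption))
    return examples

# Illustrative COCO-style annotations (image ids and captions are examples only).
annotations = {
    397133: ["A man is in a kitchen making pizzas.",
             "Man in apron standing in front of an oven."],
    37777: ["A kitchen with a stove and a sink."],
}

examples = flatten_annotations(annotations)
# The image with two captions contributes two separate training examples.
```

Each pair is then tokenized and treated as an independent sample; the model never sees the 5 captions of one image grouped together during training.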

Ashwin-Ramesh2607 commented 4 years ago

@krasserm Got it, I understand what happens during training. But what about evaluation? Do you score the generated caption against all 5 reference captions and take the highest score?