Use Karpathy split for splitting MS-COCO dataset

krasserm / fairseq-image-captioning

Transformer-based image captioning extension for pytorch/fairseq

Apache License 2.0


Use Karpathy split for splitting MS-COCO dataset #2

Closed: krasserm closed this issue 4 years ago

krasserm commented 4 years ago

This is needed for later extensions that use Faster R-CNN as a feature extractor. The Karpathy split keeps the image captioning validation and test sets free of images that were used to train the feature extractor. More details in this paper.
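For illustration only, here is a minimal sketch of how image ids could be grouped by split. It is not this repository's implementation. It assumes a split file laid out like the commonly distributed `dataset_coco.json`, with an `images` list whose entries carry `cocoid` and `split` fields and where `split` is one of `train`, `restval`, `val` or `test`. Those field names, and the helper names `karpathy_split_ids` and `assert_disjoint`, are assumptions made for this example.

```python
from collections import defaultdict


def karpathy_split_ids(dataset, merge_restval=True):
    """Group image ids by split name.

    `dataset` is assumed to look like {"images": [{"cocoid": ..., "split": ...}]}.
    With merge_restval=True, "restval" images are counted as training images.
    """
    splits = defaultdict(list)
    for image in dataset["images"]:
        name = image["split"]
        if merge_restval and name == "restval":
            name = "train"
        splits[name].append(image["cocoid"])
    return dict(splits)


def assert_disjoint(splits):
    """Raise ValueError if any image id appears in more than one split."""
    seen = {}
    for name, ids in splits.items():
        for image_id in ids:
            if image_id in seen and seen[image_id] != name:
                raise ValueError(
                    f"image {image_id} is in both {seen[image_id]} and {name}"
                )
            seen[image_id] = name


# Tiny in-memory stand-in for the real split file (hypothetical ids).
sample = {
    "images": [
        {"cocoid": 1, "split": "train"},
        {"cocoid": 2, "split": "restval"},
        {"cocoid": 3, "split": "val"},
        {"cocoid": 4, "split": "test"},
    ]
}

splits = karpathy_split_ids(sample)
assert_disjoint(splits)
```

With a real split file, the dictionary returned by `json.load` would be passed in place of `sample`. The disjointness check is the part that matters for this issue: an image that appears in both the training ids and the validation or test ids is exactly the contamination described above.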

cstub commented 4 years ago

LGTM!