zengyan-97 / CCLM

Cross-View Language Modeling: Towards Unified Cross-Lingual Cross-Modal Pre-training (ACL 2023)
BSD 3-Clause "New" or "Revised" License
87 stars 9 forks

about Flickr30K #7

LiJiaBei-7 closed this issue 1 year ago

LiJiaBei-7 commented 2 years ago

Hi,

I would like to ask which version of Flickr30K you used for multi-lingual image-text retrieval: the one with five captions per image in English and German, or the one with one caption per image in English, following multi30k (https://github.com/multi30k/dataset)?

zengyan-97 commented 2 years ago

Hi,

"Flickr30K: This dataset extended Flickr30K [44] from English (en) to German (de), French (fr) and Czech (cs). It contains 31,783 images and provides five captions per image in English and German, and one caption per image in French and Czech. Dataset splits are defined as the original Flickr30K"
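
As a rough illustration of what the quoted figures imply, the sketch below multiplies the image count by the captions per image for each language. It assumes every image has exactly the stated number of captions. The names `NUM_IMAGES`, `CAPTIONS_PER_IMAGE`, and `caption_counts` are hypothetical and are not taken from the CCLM code.

```python
# Figures quoted above; the assumption is that every image has exactly
# the stated number of captions in each language.
NUM_IMAGES = 31_783
CAPTIONS_PER_IMAGE = {"en": 5, "de": 5, "fr": 1, "cs": 1}


def caption_counts(num_images, captions_per_image):
    """Return the implied total number of captions for each language."""
    return {lang: num_images * n for lang, n in captions_per_image.items()}


counts = caption_counts(NUM_IMAGES, CAPTIONS_PER_IMAGE)
for lang, total in counts.items():
    print(f"{lang}: {total} captions")
```

Under that assumption, English and German each come to 158,915 captions, and French and Czech each come to 31,783.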