shiyanlou-015555 closed this issue 2 years ago
| En | De | Fr | Cs | Source |
|---|---|---|---|---|
| 30K | 30K | 30K | 30K | multi30k in github |
| 150K | 150K | 30K | 30K | multi30k in your paper |
Can you give me the full Multi30K data you used? I promise it will be used only for research purposes.
Hi, did you notice that for En and De each image has 5 captions, while for Fr and Cs each image has only 1 caption?
Yes, but "https://github.com/multi30k/dataset" doesn't seem to provide that, so where can I download the full Multi30K from, or can you provide a copy? I promise it will be used only for research purposes.
Hi,
I checked, and I downloaded the dataset from the same link. You can get the correct Multi30K dataset after some simple preprocessing.
Thank you, I hope to get it working soon.
I collected Multi30K from "https://github.com/multi30k/dataset", but it has only 30K sentences each for En and De, which differs from the paper. With this data, my zero-shot results for En and De are 76.6 and 76.4, well below the 83.7 and 79.1 given in the paper. When I use all of the Flickr30K captions as the En data, the En result rises to 83.4, but De still performs poorly. According to the paper there should be 150K De sentences, so how can I get the full De data? The paper says that "Multi30K contains 31,783 images and provides five captions per image in English and German and one caption per image in French and Czech". Could you share the full Multi30K data or the corresponding URL?
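For anyone comparing their local copy against these numbers, here is a minimal sketch that counts captions per language. It assumes one caption per line, and that caption files are plain text or gzip files whose names end in `.<lang>` or `.<lang>.gz`; that naming, the helper names `count_captions` and `captions_per_language`, and the `expected` dictionary are assumptions for illustration, not something confirmed in this thread. It is only a sanity check and does not reproduce the preprocessing mentioned above.

```python
import gzip
from pathlib import Path


def count_captions(path):
    """Count non-empty lines (assumed: one caption per line) in a plain or gzipped text file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())


def captions_per_language(root, languages=("en", "de", "fr", "cs")):
    """Sum caption counts over all files under root named *.<lang> or *.<lang>.gz.

    The file-naming pattern is an assumption; adjust it to match your download.
    """
    totals = {}
    for lang in languages:
        files = [
            p for p in Path(root).rglob("*")
            if p.is_file()
            and (p.name.endswith(f".{lang}") or p.name.endswith(f".{lang}.gz"))
        ]
        totals[lang] = sum(count_captions(p) for p in files)
    return totals


# Totals implied by the sentence quoted from the paper: 31,783 images,
# five captions each in En/De and one each in Fr/Cs.
NUM_IMAGES = 31_783
expected = {
    "en": 5 * NUM_IMAGES,
    "de": 5 * NUM_IMAGES,
    "fr": NUM_IMAGES,
    "cs": NUM_IMAGES,
}
```

Calling `captions_per_language("path/to/your/download")` and comparing the result with `expected` should show whether the En and De files you are using contain one caption per image or five. The 30K and 150K figures in the table are rounded, so an exact match is only expected against the 31,783-image count if every split is included.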