microsoft / Oscar

Oscar and VinVL
MIT License

Cannot completely download the coco caption dataset for finetuning VinVL model #165

Status: Open (opened by yaolinli 3 years ago)

yaolinli commented 3 years ago

I want to fine-tune the pretrained VinVL model on the COCO captioning downstream task, and I followed https://github.com/microsoft/Oscar/blob/master/VinVL_DOWNLOAD.md to download the dataset. However, when I run the command path/to/azcopy copy https://biglmdiag.blob.core.windows.net/vinvl/datasets/coco_caption ./coco_caption --recursive , the file "train.feature.tsv" is missing from the download.

I can only download part of the contents (shown in the attached screenshot).
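A quick way to confirm which files a partial download left out is to check the target directory against a list of expected names. The following is a minimal sketch, not part of the Oscar repository: the helper `find_missing` and the `EXPECTED_FILES` list are hypothetical, and only train.feature.tsv is taken from this thread, so the list needs to be extended with whatever files your fine-tuning configuration reads.

```python
from pathlib import Path

# Assumption: only train.feature.tsv is named in this thread. Extend this
# list with the files your fine-tuning configuration actually reads.
EXPECTED_FILES = ["train.feature.tsv"]


def find_missing(dataset_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent or empty in dataset_dir."""
    root = Path(dataset_dir)
    missing = []
    for name in expected:
        path = root / name
        # A zero-byte file usually means the transfer was interrupted.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing
```

For example, `find_missing("./coco_caption")` would return `["train.feature.tsv"]` for the download described above.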

hasontung1999 commented 3 years ago

@yaolinli You can try getting the dataset from this link: https://biglmdiag.blob.core.windows.net/oscar/datasets/coco_caption.zip . If you cannot download it with azcopy, try the !wget command in Google Colab. However, the COCO caption data at this link is missing some files, too. Just download it and fill in whatever files are missing.
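If neither azcopy nor wget is available, the same zip can be fetched with the Python standard library and inspected before unpacking. This is only a sketch under assumptions: the URL is the one quoted above, while the helper names `download` and `list_archive` and the local file name are made up for illustration. It assumes the link still serves the file over plain HTTPS.

```python
import shutil
import urllib.request
import zipfile

# URL quoted from the comment above.
ZIP_URL = "https://biglmdiag.blob.core.windows.net/oscar/datasets/coco_caption.zip"


def download(url, dest):
    """Stream url to dest without holding the whole file in memory."""
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


def list_archive(zip_path):
    """Return (name, size_in_bytes) for every regular file in the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return [
            (info.filename, info.file_size)
            for info in archive.infolist()
            if not info.is_dir()
        ]


# Example usage (not run here, since the archive is large):
# download(ZIP_URL, "coco_caption.zip")
# for name, size in list_archive("coco_caption.zip"):
#     print(f"{size:>14,d}  {name}")
```

Listing the archive first makes it easy to see which files are absent before deciding what to fill in from elsewhere.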

yaolinli commented 3 years ago

Thank you very much! I downloaded the whole zip file successfully from https://biglmdiag.blob.core.windows.net/oscar/datasets/coco_caption.zip

yaolinli commented 3 years ago

I think the dataset from https://biglmdiag.blob.core.windows.net/oscar/datasets/coco_caption.zip may be different from the one at https://biglmdiag.blob.core.windows.net/vinvl/datasets/coco_caption . When I run inference with the released vinvl /coco_captioning_base_scst/checkpoint-15-66405 on the test set from the second link, the results match those reported in the paper. But when I run inference on the test set from the first link, the results are wrong, as follows: INFO:vlpretrain:evaluation result: {'Bleu_1': 0.3754352697810658, 'Bleu_2': 0.1690062414796108, 'Bleu_3': 0.08197754771485882, 'Bleu_4': 0.04221742607217998, 'METEOR': 0.09355317287051836, 'ROUGE_L': 0.30101993017675194, 'CIDEr': 0.03730488300346641, 'SPICE': 0.02254211667076489}
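One way to check whether the two test sets actually cover different images is to compare the keys of their TSV files. The sketch below rests on an assumption that is not confirmed in this thread: that each row starts with an image key followed by a tab. The helper names `tsv_keys` and `compare_keys` are hypothetical. Note that matching keys would not prove the features are identical; this only detects differing image sets.

```python
def tsv_keys(path):
    """Collect the first tab-separated field of every non-empty row."""
    keys = set()
    with open(path, "r", encoding="utf-8") as handle:
        # Read line by line: feature files can be many gigabytes.
        for line in handle:
            line = line.rstrip("\r\n")
            if line:
                keys.add(line.split("\t", 1)[0])
    return keys


def compare_keys(path_a, path_b):
    """Summarize which row keys appear in only one of the two files."""
    a, b = tsv_keys(path_a), tsv_keys(path_b)
    return {
        "only_in_a": sorted(a - b),
        "only_in_b": sorted(b - a),
        "shared": len(a & b),
    }
```

Calling `compare_keys` on the test-split TSV from each download would show whether the two sources share the same image keys.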

So I would still like to know: where can I download the complete VinVL fine-tuning dataset for COCO captioning, including train.feature.tsv?