facebookresearch / vilbert-multi-task

Multi Task Vision and Language
MIT License
797 stars 180 forks

No caption_train.json file #49

Open wu-zhonghua opened 4 years ago

wu-zhonghua commented 4 years ago

I have followed the data instructions to prepare the data. However, I am not able to find the caption_train.json file. Can you please tell me where I can find this file?

enaserianhanzaei commented 3 years ago

@wu-zhonghua

I wrote a step-by-step tutorial on how to set up the environment and how to train and test this model. I also added a section on extracting the visiolinguistic embeddings from the image-text data: https://naserian-elahe.medium.com/vilbert-a-model-for-learning-joint-representations-of-image-content-and-natural-language-47f56a313a79

I would very much appreciate any comments or suggestions.