Open wu-zhonghua opened 4 years ago
@wu-zhonghua
I wrote a step-by-step tutorial on how to set up the environment and train and test this model. I also added a section on extracting the visiolinguistic embeddings from image-text data. https://naserian-elahe.medium.com/vilbert-a-model-for-learning-joint-representations-of-image-content-and-natural-language-47f56a313a79 I would very much appreciate any comments or suggestions.
I followed the data preparation instructions, but I cannot find the caption_train.json file. Could you please tell me where I can find it?