Open volquelme opened 5 years ago
The Visual Genome was most likely used to train the given pre-trained model as the Visual Genome provides captions which are of paragraph-length, while the MS-COCO dataset does not.
ah... i got it thanks for your reply
Thanks for your studying and sharing! I am just wondering what dataset is used for pre-trained model Which one(MS-COCO or Visual Genome) is used? if MS-COCO used, can I get the model pre-trained on visual genome dataset?