microsoft / Oscar

Oscar and VinVL

Trained features? #41

Open ObeidaElJundi opened 4 years ago

ObeidaElJundi commented 4 years ago

For image captioning on COCO, I am trying to obtain image features from a trained model instead of generating the caption. In DOWNLOAD.md, under Datasets, are the image region features (e.g., train.feature.tsv) extracted before or after training the model on downstream tasks (e.g., image captioning on COCO)? If before, how can I obtain image features from a trained model?

One more question: in MODEL_ZOO.md, under Image Captioning on COCO, is the Model checkpoint: checkpoint.zip already trained and finetuned, or do we still need to train with cross-entropy loss and finetune with CIDEr optimization?
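
For reference, this is roughly how I am reading the released features at the moment. It is only a minimal sketch: the path prefix, the `num_boxes`/`features` JSON field names, and the `decode_feature_row` helper are my assumptions about the TSV layout, not something taken from the repo docs.

```python
import base64
import json
import numpy as np

def decode_feature_row(row):
    """Decode one TSV row into (image_key, features) of shape (num_boxes, feat_dim).

    Assumed row layout: image_key, then a JSON string containing 'num_boxes' and a
    base64-encoded float32 'features' blob (produced by the object detector,
    i.e. before any Oscar finetuning).
    """
    image_key, feat_info = row[0], json.loads(row[1])
    num_boxes = feat_info["num_boxes"]
    features = np.frombuffer(
        base64.b64decode(feat_info["features"]), dtype=np.float32
    ).reshape(num_boxes, -1)
    return image_key, features

# Hypothetical path; read only the first row as a sanity check.
with open("coco_caption/train.feature.tsv") as f:
    row = f.readline().rstrip("\n").split("\t")
key, feats = decode_feature_row(row)
print(key, feats.shape)  # e.g. (num_boxes, 2054): 2048-d region feature plus box geometry
```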

EByrdS commented 3 years ago

I think the Model checkpoint: checkpoint.zip that you mention is only the output of the cross-entropy training step. I assume this from reading the log.txt file in the same section: the command line recorded there looks like exactly the one shown under

  1. First train with cross-entropy loss:

so I would think we still need to run

  2. Finetune with CIDEr optimization:

But I would like to have some confirmation as well, as doing that finetuning might take too long for me.
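
For anyone else who lands here: my understanding is that the "Finetune with CIDEr optimization" step is self-critical sequence training (SCST), i.e. REINFORCE with a greedy-decoding baseline where CIDEr is the reward. Below is a minimal sketch of that loss, just to show the idea; it is not Oscar's actual implementation, and the tensor names are hypothetical.

```python
import torch

def scst_loss(sample_logprobs, sample_cider, greedy_cider):
    """Self-critical sequence training loss (REINFORCE with a greedy baseline).

    sample_logprobs: (batch,) summed log-probabilities of captions sampled from the model
    sample_cider:    (batch,) CIDEr score of each sampled caption
    greedy_cider:    (batch,) CIDEr score of the greedy/beam caption, used as the baseline
    """
    # Advantage: how much the sampled caption beats the greedy baseline.
    reward = (sample_cider - greedy_cider).detach()
    # Policy gradient: increase log-prob of captions with positive advantage.
    return -(reward * sample_logprobs).mean()

# Toy usage with made-up numbers.
logp = torch.tensor([-12.3, -9.8], requires_grad=True)
loss = scst_loss(logp, torch.tensor([1.10, 0.85]), torch.tensor([0.95, 0.90]))
loss.backward()
```

Since CIDEr only enters as a reward here, this stage has to be a separate finetuning run on top of the cross-entropy checkpoint, which is why I would like to know whether checkpoint.zip already includes it.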