sachit-menon opened this issue 1 year ago
Hi, thanks for this great work! I noticed in your paper you mentioned you're evaluating on more multimodal datasets, like VQAv2 and OKVQA. Do you have any results for those now, or any timeline for when you might have them?
We will soon release initial results on COCO captioning and VQAv2, without large-scale image-text pretraining.