sachit-menon opened this issue 1 year ago
Hi, thanks for this great work! I noticed in your paper you mentioned you're evaluating on more multimodal datasets, like VQAv2 and OKVQA. Do you have any results for those now, or any timeline for when you might have them?
We will soon release initial results on COCO captioning and VQAv2, without large-scale image-text pretraining.