TempleX98 / MoVA

[NeurIPS 2024] MoVA: Adapting Mixture of Vision Experts to Multimodal Context
Apache License 2.0

release the training data? #8

Open bingnoi77 opened 2 months ago

bingnoi77 commented 2 months ago

Is it possible to release part of the training data or the construction code?

TempleX98 commented 2 months ago

Our training data is publicly available, and we will provide the download links soon.

bingnoi77 commented 2 months ago

Thanks for your contribution. I wonder if there are any updates?

TempleX98 commented 1 month ago

> Thanks for your contribution. I wonder if there are any updates?

Our SFT data encompasses:

  1. LLaVA-665k: https://github.com/haotian-liu/LLaVA
  2. DocVQA: https://www.docvqa.org/docvqa
  3. ChartQA: https://github.com/vis-nlp/ChartQA
  4. InfographicVQA: https://www.docvqa.org/datasets/infographicvqa
  5. AI2D: https://allenai.org/data/diagrams
  6. ST-VQA: https://rrc.cvc.uab.es/?ch=11&com=downloads
  7. TextVQA: https://dl.fbaipublicfiles.com/textvqa/images/train_val_images.zip
  8. SynthDoG-EN: https://huggingface.co/OpenGVLab/InternVL/resolve/main/synthdog-en-images.zip
  9. Geometry3K: https://github.com/lupantech/InterGPS
  10. PGPS9K: https://github.com/mingliangzhang2018/PGPS
  11. Geo170K: https://huggingface.co/datasets/Luckyjhg/Geo170K
  12. LLaVA-Med: https://github.com/microsoft/LLaVA-Med
  13. VQA-RAD: https://huggingface.co/datasets/flaviagiammarino/vqa-rad
  14. SLAKE: https://huggingface.co/datasets/BoKelvin/SLAKE
  15. RefCOCO grounding (REC) part of shikra: https://github.com/shikras/shikra/blob/main/docs/data.md
  16. ShareGPT4V: https://huggingface.co/datasets/Lin-Chen/ShareGPT4V
  17. A subset of ALLaVA-4V-Caption: https://huggingface.co/datasets/FreedomIntelligence/ALLaVA-4V
  18. LAION GPT4V: https://huggingface.co/datasets/laion/gpt4v-dataset
  19. TextOCR GPT4V: https://huggingface.co/datasets/jimmycarter/textocr-gpt4v

We only include the training split of each dataset.
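For anyone assembling this mix, the sketch below is a minimal, stdlib-only manifest of the 19 sources listed above, grouped by hosting domain. It is not the authors' pipeline; the dataset names and URLs are taken verbatim from this thread, and the `datasets` call shown in the comment is only an illustrative way to fetch one Hugging Face-hosted entry's training split.

```python
from urllib.parse import urlparse

# SFT data sources listed in this thread (name -> URL), copied verbatim.
SFT_SOURCES = {
    "LLaVA-665k": "https://github.com/haotian-liu/LLaVA",
    "DocVQA": "https://www.docvqa.org/docvqa",
    "ChartQA": "https://github.com/vis-nlp/ChartQA",
    "InfographicVQA": "https://www.docvqa.org/datasets/infographicvqa",
    "AI2D": "https://allenai.org/data/diagrams",
    "ST-VQA": "https://rrc.cvc.uab.es/?ch=11&com=downloads",
    "TextVQA": "https://dl.fbaipublicfiles.com/textvqa/images/train_val_images.zip",
    "SynthDoG-EN": "https://huggingface.co/OpenGVLab/InternVL/resolve/main/synthdog-en-images.zip",
    "Geometry3K": "https://github.com/lupantech/InterGPS",
    "PGPS9K": "https://github.com/mingliangzhang2018/PGPS",
    "Geo170K": "https://huggingface.co/datasets/Luckyjhg/Geo170K",
    "LLaVA-Med": "https://github.com/microsoft/LLaVA-Med",
    "VQA-RAD": "https://huggingface.co/datasets/flaviagiammarino/vqa-rad",
    "SLAKE": "https://huggingface.co/datasets/BoKelvin/SLAKE",
    "RefCOCO grounding (Shikra REC)": "https://github.com/shikras/shikra/blob/main/docs/data.md",
    "ShareGPT4V": "https://huggingface.co/datasets/Lin-Chen/ShareGPT4V",
    "ALLaVA-4V-Caption (subset)": "https://huggingface.co/datasets/FreedomIntelligence/ALLaVA-4V",
    "LAION GPT4V": "https://huggingface.co/datasets/laion/gpt4v-dataset",
    "TextOCR GPT4V": "https://huggingface.co/datasets/jimmycarter/textocr-gpt4v",
}

def group_by_host(sources):
    """Group dataset names by the hosting domain of their URL."""
    grouped = {}
    for name, url in sources.items():
        host = urlparse(url).netloc
        grouped.setdefault(host, []).append(name)
    return grouped

# One way (assumption, not the authors' code) to fetch a Hugging Face-hosted
# entry's training split, using the `datasets` package (requires network):
#   from datasets import load_dataset
#   vqa_rad = load_dataset("flaviagiammarino/vqa-rad", split="train")

if __name__ == "__main__":
    for host, names in sorted(group_by_host(SFT_SOURCES).items()):
        print(f"{host}: {len(names)} dataset(s)")
```

Grouping by host makes it easy to see which entries can be fetched uniformly via the Hugging Face hub and which need dataset-specific download steps.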