I want to reproduce LLaMA-Adapter V2, but I don't know how to collect the training data.
https://github.com/OpenGVLab/LLaMA-Adapter
From the repo above, I understand that the model (LLaMA-Adapter V2.1 multimodal) uses Image-Text-V1 during pretraining, and GPT4LLM, LLaVA, and VQAv2 during fine-tuning.
But how can I get this data? Do I have to assemble it myself?
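In case it helps clarify what I'm after, this is the kind of download script I have in mind for the fine-tuning sets. It is only a sketch: the URLs and filenames are my assumptions based on each project's own public release (the GPT-4-LLM repo, the LLaVA-Instruct-150K dataset on Hugging Face, and the VQAv2 download page), not anything confirmed by the LLaMA-Adapter repo.

```python
# Sketch: fetch the publicly released fine-tuning data for LLaMA-Adapter V2.1.
# URLs below are my best guesses at each project's official hosting; please verify.
import urllib.request
from pathlib import Path

FILES = {
    # GPT4LLM: GPT-4-generated instruction data from the GPT-4-LLM repo
    "alpaca_gpt4_data.json":
        "https://raw.githubusercontent.com/Instruction-Tuning-with-GPT-4/"
        "GPT-4-LLM/main/data/alpaca_gpt4_data.json",
    # LLaVA: 150k visual instruction conversations (the images themselves
    # come from MS COCO and are downloaded separately)
    "llava_instruct_150k.json":
        "https://huggingface.co/datasets/liuhaotian/LLaVA-Instruct-150K/"
        "resolve/main/llava_instruct_150k.json",
    # VQAv2: training questions/annotations, per https://visualqa.org/download.html
    "v2_Questions_Train_mscoco.zip":
        "https://s3.amazonaws.com/cvmlp/vqa/mscoco/vqa/v2_Questions_Train_mscoco.zip",
    "v2_Annotations_Train_mscoco.zip":
        "https://s3.amazonaws.com/cvmlp/vqa/mscoco/vqa/v2_Annotations_Train_mscoco.zip",
}

out_dir = Path("data")
out_dir.mkdir(exist_ok=True)
for name, url in FILES.items():
    dest = out_dir / name
    if not dest.exists():
        print(f"downloading {name} ...")
        urllib.request.urlretrieve(url, dest)
```

Note that both LLaVA and VQAv2 are built on MS COCO images, so those would still need to be fetched from the COCO site on top of the annotation files above. For Image-Text-V1 I have not found any direct download at all, which is the main part I'm stuck on.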
I really appreciate any help you can provide.