LLaVA-VL / LLaVA-NeXT

Apache License 2.0

[Common Issue] Check here for some common issues you might encounter #130

Open kcz358 opened 1 month ago

kcz358 commented 1 month ago

Common Issue

More questions will be added.

Training Related

Q: Cannot finetune the existing LLaVA-Onevision checkpoints.
A: We edited our model's config so that it can be served on SGLang. To finetune existing LLaVA-Onevision checkpoints, you may first need to download all the weights and change the `model_type` in config.json from `llava` to `qwen2`.
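A minimal sketch of that config edit, assuming the checkpoint files have already been downloaded locally (the path in the usage comment is a placeholder):

```python
import json

def patch_model_type(config_path: str, new_type: str = "qwen2") -> str:
    """Rewrite the model_type field in a checkpoint's config.json.

    Returns the previous model_type so you can confirm what was there.
    """
    with open(config_path) as f:
        config = json.load(f)
    old_type = config.get("model_type")
    config["model_type"] = new_type
    with open(config_path, "w") as f:
        json.dump(config, f, indent=2)
    return old_type

# Usage, after downloading the full checkpoint locally
# (e.g. with `huggingface-cli download`); the path is a placeholder:
# patch_model_type("./llava-onevision-checkpoint/config.json")
```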

Q: Cannot download LLaVA-NeXT data or LLaVA-Onevision data?
A: We have found that the version of the `datasets` library can sometimes cause issues. Try upgrading `datasets` before calling `load_dataset`.
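One way to sanity-check the installed version before retrying, sketched below; the minimum version used here is a hypothetical cutoff, not one stated in this thread:

```python
from importlib.metadata import version

def version_at_least(installed: str, minimum: str) -> bool:
    """Compare dotted version strings numerically (no pre-release handling)."""
    as_tuple = lambda v: tuple(int(part) for part in v.split("."))
    return as_tuple(installed) >= as_tuple(minimum)

# Hypothetical check; "2.16.0" is an assumed cutoff for illustration.
# if not version_at_least(version("datasets"), "2.16.0"):
#     raise RuntimeError("datasets is too old; run: pip install -U datasets")
```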

Q: Is the data incomplete?
A: Yes, the LLaVA-NeXT data is incomplete because we cannot release the 15k user data. Other data, such as video or multi-image data, is split into different dataset repos on Hugging Face, for example: lmms-lab/M4-Instruct-Data, lmms-lab/LLaVA-ReCap-CC3M, lmms-lab/ShareGPTVideo, lmms-lab/LLaVA-ReCap-558K, lmms-lab/LLaVA-ReCap-118K

Luodian commented 2 weeks ago

Q: What about the video data?
A: It will be released with @ZhangYuanhan-AI's next version of a more powerful video model. Currently we have released the data yaml used in the onevision stage at onevision.yaml.

You can check out the three video data subsets: (1) sharegpt4video_255000.json (see sharegpt4video), (2) 0718_0_30_s_academic_mc_v0_1_all.json (to be released), (3) academic_source_30s_v1_all.json (to be released).
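For orientation, entries in such a data yaml are typically a list of json paths with a sampling strategy; the sketch below uses placeholder paths and an assumed schema, so check it against the released onevision.yaml before relying on it:

```yaml
datasets:
  # Placeholder paths; sampling_strategy values such as "all" or
  # "first:10%" are assumptions modeled on the released yaml.
  - json_path: /path/to/sharegpt4video_255000.json
    sampling_strategy: all
  - json_path: /path/to/academic_source_30s_v1_all.json
    sampling_strategy: all
```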