haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
https://llava.hliu.cc
Apache License 2.0

[Question] What are the differences between two versions of pretrain datasets? #662

Open aprilehannibal opened 11 months ago

aprilehannibal commented 11 months ago

Question

Great job! I found that there are two versions of the pretraining dataset: blip_laion_cc_sbu_558k and LLaVA-CC3M-Pretrain-595K. What are the differences between them, and which one is better? Did you analyze the quality of these datasets, and do you know why they lead to different performance? Hoping for your reply. Thanks a lot! @haotian-liu

Hambaobao commented 9 months ago

I'd also like to know the difference between the two. Have you figured it out?

wkml commented 2 months ago

Same here :( It seems this still hasn't been resolved.