mlpc-ucsd / BLIVA

(AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions
https://arxiv.org/abs/2308.09936
BSD 3-Clause "New" or "Revised" License
257 stars 26 forks

Some question about train dataset? #14

Closed shipengai closed 10 months ago

shipengai commented 10 months ago

As the paper says, "Instead, it leverages a more compact 0.5M pre-training caption data following LLaVA." Does this mean that in the first pretraining stage, the training dataset is only blip_laion_cc_sbu_558k.json?

gordonhu608 commented 10 months ago

Yes, this is the amount of data used for the initial alignment stage.
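For readers inspecting the file mentioned above: blip_laion_cc_sbu_558k.json is the LLaVA-style pretraining annotation file, a JSON list of roughly 558k records, each pairing an image path with a short caption conversation. The sketch below is illustrative only, using a single mock record with placeholder values (the field names follow the LLaVA annotation schema; the image path and caption here are invented):

```python
import json
import os
import tempfile

# Mock stand-in for blip_laion_cc_sbu_558k.json. The real file is a
# JSON list of ~558k records in this LLaVA-style schema; the values
# below are placeholders, not real dataset entries.
mock_records = [
    {
        "id": "000000001",
        "image": "00000/000000001.jpg",  # placeholder relative path
        "conversations": [
            {"from": "human", "value": "Describe the image briefly.\n<image>"},
            {"from": "gpt", "value": "a placeholder caption for illustration"},
        ],
    }
]

path = os.path.join(tempfile.mkdtemp(), "blip_laion_cc_sbu_558k.json")
with open(path, "w") as f:
    json.dump(mock_records, f)

# Loading and counting records; on the real file this prints ~558128,
# matching the "0.5M pre-training caption data" figure in the paper.
with open(path) as f:
    samples = json.load(f)

num_samples = len(samples)
print(num_samples)
print(samples[0]["image"])
```

Counting `len(samples)` on the real file is a quick sanity check that you downloaded the same ~0.5M-sample split used for stage-1 alignment.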