TRI-ML / prismatic-vlms

A flexible and efficient codebase for training visually-conditioned language models (VLMs)
MIT License

question about training data #30

Closed (fyting closed this issue 1 month ago)

fyting commented 1 month ago

Is it correct that stage 1 uses the 'chat.json' file from the 'llava-laion-cc-sbu-558k' dataset, and that stage 2, which is a single-stage process, uses the 'llava_v1_5_mix665k.json' file from the 'llava-v1.5-instruct' dataset?

siddk commented 1 month ago

Yup exactly!
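
For reference, the stage-to-file mapping confirmed above can be written down as a small lookup. This is only an illustrative sketch, not code from prismatic-vlms: `STAGE_DATA`, `annotation_path`, and the `data` root directory are hypothetical names, and only the dataset and file names come from this thread.

```python
from pathlib import PurePosixPath

# Hypothetical lookup table. Only the dataset and file names are taken
# from this issue; the keys and structure are made up for illustration.
STAGE_DATA = {
    "stage1": ("llava-laion-cc-sbu-558k", "chat.json"),
    "stage2": ("llava-v1.5-instruct", "llava_v1_5_mix665k.json"),
}


def annotation_path(stage: str, root: str = "data") -> PurePosixPath:
    """Build <root>/<dataset>/<file> for a stage.

    The default root "data" is an assumed placeholder, not a path
    confirmed by the repository.
    """
    if stage not in STAGE_DATA:
        raise KeyError(f"unknown stage: {stage!r}")
    dataset, filename = STAGE_DATA[stage]
    return PurePosixPath(root) / dataset / filename


if __name__ == "__main__":
    for stage in STAGE_DATA:
        print(stage, "->", annotation_path(stage))
```

Where the files actually live on disk depends on how the datasets were downloaded, so the root should be adjusted to match the local setup.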