Closed by linhaojia13 5 days ago
The paper seems to indicate that you converted Slimpajama to PDF to construct the data. Is that correct?
No. In the long-context training stage, we train only on text from Slimpajama. Converting Slimpajama to PDF is an interesting idea, though. We have thought about it but have not tried it yet.
After pretraining, what instruction data is used for fine-tuning LongVA?
The same as Llava-1.6.
When do you plan to release the instruction tuning code?
Thank you very much!
This work is fantastic and has been very inspiring to me. I have a few questions: