TinyLLaVA / TinyLLaVA_Factory

A Framework of Small-scale Large Multimodal Models
https://arxiv.org/abs/2402.14289
Apache License 2.0

Is it possible to pretrain tinyllama-3b on 2 V100s? #37

Open Yang-bug-star opened 5 months ago

baichuanzhou commented 5 months ago

You can try increasing gradient_accumulation_steps, but this setup was never tested. We recommend fine-tuning our published models.
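The idea behind raising gradient_accumulation_steps is to keep the effective batch size of the training recipe constant while shrinking the per-device batch so it fits in V100 memory: effective batch = per_device_batch × n_gpus × accumulation_steps. A minimal sketch of that arithmetic (the numbers below are illustrative, not TinyLLaVA's actual recipe):

```python
def grad_accum_steps(target_effective_batch: int,
                     per_device_batch: int,
                     n_gpus: int) -> int:
    """Accumulation steps needed to preserve a target effective batch size."""
    per_step = per_device_batch * n_gpus
    assert target_effective_batch % per_step == 0, \
        "effective batch must be divisible by per_device_batch * n_gpus"
    return target_effective_batch // per_step

# Illustrative example: a target effective batch of 256 with a
# per-device batch of 4 on 2 GPUs needs 32 accumulation steps.
print(grad_accum_steps(256, 4, 2))  # -> 32
```

In HuggingFace Trainer terms, this value would be passed as `gradient_accumulation_steps` alongside a reduced `per_device_train_batch_size`; memory permitting, gradient checkpointing and mixed precision are the usual further levers on V100s.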