TinyLLaVA / TinyLLaVA_Factory

A Framework of Small-scale Large Multimodal Models
https://arxiv.org/abs/2402.14289
Apache License 2.0

Is it possible to pretrain tinyllama-3b on 2 V100s ? #37

Open Yang-bug-star opened 8 months ago

baichuanzhou commented 8 months ago

You can try increasing gradient_accumulation_steps (and lowering the per-device batch size to fit in V100 memory), but this setup was never tested. We recommend fine-tuning our published models instead.
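As a rough sketch of the suggestion above: with gradient accumulation, the effective batch size is per_device_train_batch_size × gradient_accumulation_steps × num_gpus, so you can shrink the per-device batch and raise the accumulation steps to keep the effective batch constant on 2 V100s. The script path and flag values below are illustrative assumptions, not a tested configuration from this repo.

```shell
# Hypothetical launch sketch (paths/values are assumptions, untested):
# effective batch = 4 (per device) x 32 (accum) x 2 (GPUs) = 256
deepspeed --num_gpus=2 tinyllava/train/train.py \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 32 \
    --bf16 False --fp16 True   # V100 has no bf16 support; use fp16 instead
```

Note that V100s lack bfloat16 support, so fp16 (with its loss-scaling caveats) is the only mixed-precision option on that hardware.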