TinyLLaVA / TinyLLaVA_Factory

A Framework of Small-scale Large Multimodal Models
https://arxiv.org/abs/2402.14289
Apache License 2.0

Compute budget for TinyLLaVA? #45

Closed. ryan-caesar-ramos closed this issue 7 months ago.

ryan-caesar-ramos commented 7 months ago

Hi! Can I ask:

  1. What GPUs were used in training these models? Were they A100s?
  2. How many of these GPUs were used?
  3. How many hours did it take to finish training?

Thank you!

baichuanzhou commented 7 months ago
  1. The models were trained on A100 (40GB) GPUs.
  2. We used 8 of them.
  3. For the largest model, training took approximately 12 hours (see the rough estimate sketched below).
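
Based on the figures above, here is a minimal back-of-envelope sketch of the total compute budget, assuming the ~12 hours is wall-clock time with all 8 GPUs in use for the whole run (that is not confirmed in this thread, and the variable names are illustrative):

```python
# Rough compute estimate from the numbers in this thread.
# Assumption: the ~12 hours covers the entire run (pre-training +
# fine-tuning) on all 8 A100 40GB GPUs; the split is not stated here.

NUM_GPUS = 8            # 8x A100 (40GB), per the reply above
WALL_CLOCK_HOURS = 12   # approximate, for the largest model

gpu_hours = NUM_GPUS * WALL_CLOCK_HOURS
print(f"~{gpu_hours} A100-hours for the largest model")  # prints ~96
```
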
ryan-caesar-ramos commented 7 months ago

Thanks for the reply! Can I ask whether the 12 hours covered everything, i.e., both pre-training and fine-tuning completed within a total of 12 hours?

baichuanzhou commented 7 months ago

This issue is marked as closed, as the questions have been answered through email.

Arnav0400 commented 3 months ago

@baichuanzhou Thanks for your work! Can you please let me know the separate training times for pre-training and fine-tuning?