dvlab-research / LLaMA-VID

Official Implementation for LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models
Apache License 2.0

Training cost #12

Closed ds22058 closed 6 months ago

ds22058 commented 6 months ago

Hi! How much time did training on 8 A100 GPUs with 80GB memory take for "A," "B," and "C" respectively?

wcy1122 commented 6 months ago

Hello. Do "A," "B," and "C" refer to the three training stages? For a 7B model on 8 A100 GPUs with 80GB memory, the first stage takes around 9 hours, the second stage around 30 hours, and the third stage around 12 hours.
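For anyone budgeting compute, here is a quick back-of-the-envelope sketch (not from the authors) that totals the quoted times into wall-clock hours and GPU-hours. The stage labels in the comments are my reading of the LLaMA-VID training pipeline and may not match the repo's exact terminology:

```python
# Rough training-cost estimate from the per-stage times quoted above
# (7B model on 8x A100 80GB). Stage descriptions are assumptions,
# not the repo's official names.
stage_hours = {
    "stage 1 (modality alignment)": 9,
    "stage 2 (instruction tuning)": 30,
    "stage 3 (long video tuning)": 12,
}

num_gpus = 8
total_wall_clock = sum(stage_hours.values())   # 51 hours end to end
total_gpu_hours = total_wall_clock * num_gpus  # 408 A100 GPU-hours

for name, hours in stage_hours.items():
    print(f"{name}: {hours} h wall clock, {hours * num_gpus} GPU-hours")
print(f"total: {total_wall_clock} h wall clock, {total_gpu_hours} GPU-hours")
```

So reproducing all three stages at this scale costs roughly 51 hours of wall-clock time, or about 408 A100 GPU-hours.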

tyleryzhu commented 4 months ago

Hi @wcy1122, I was wondering if you could share the wandb logs for each stage so we can compare loss curves for reproduction? Thanks!