Vision-CAIR / MiniGPT4-video

Official code for Goldfish model for long video understanding and MiniGPT4-video for short video understanding
https://vision-cair.github.io/Goldfish_website/
BSD 3-Clause "New" or "Revised" License

GPU resource for pretraining and instruction tuning #3

Open 2000ZRL opened 7 months ago

2000ZRL commented 7 months ago

What excellent work! Could you please share the GPU requirements (number of GPUs and memory) for pretraining and instruction tuning? Thanks.

KerolosAtef commented 7 months ago

Hello @2000ZRL, thank you for your interest in our work.

For the video-text datasets:

- Llama 2: you can use an A100 (80GB) with batch size = 4, or a V100 with batch size = 1 (minimum GPU RAM is 32GB).
- Mistral: you can only use an A100 (80GB) with batch size = 1 (minimum GPU RAM is 80GB).
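For readers hitting the V100 limit above: a per-GPU batch size of 1 can usually be combined with gradient accumulation to match the effective batch size of 4 used on the A100. The snippet below is a generic PyTorch sketch of that idea, not code from this repository; `model`, `optimizer`, and `loader` are placeholders.

```python
# Generic sketch: effective batch size 4 on a GPU that only fits batch size 1
# (e.g. a 32GB V100), by accumulating gradients over 4 micro-batches.
import torch

ACCUM_STEPS = 4  # 1 sample per forward pass x 4 steps ~= batch size 4

def train_epoch(model, optimizer, loader, device="cuda"):
    model.train()
    optimizer.zero_grad()
    for step, batch in enumerate(loader):
        batch = {k: v.to(device) for k, v in batch.items()}
        # Scale the loss so the accumulated gradient matches a true batch of 4.
        loss = model(**batch).loss / ACCUM_STEPS
        loss.backward()
        if (step + 1) % ACCUM_STEPS == 0:
            optimizer.step()
            optimizer.zero_grad()
```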

2000ZRL commented 7 months ago

Thanks for your reply! Could you please also tell me the training time for the different model variants, e.g., llama2/mistral?

moonlightian commented 2 months ago

> Hello @2000ZRL, thank you for your interest in our work.
>
> For the video-text datasets:
>
> - Llama 2: you can use an A100 (80GB) with batch size = 4, or a V100 with batch size = 1 (minimum GPU RAM is 32GB).
> - Mistral: you can only use an A100 (80GB) with batch size = 1 (minimum GPU RAM is 80GB).

Hi, when I use multiple GPUs, e.g., 2× L20 (46GB each), I still get OOM with batch size = 4. Is tensor parallelism (TP) supported for model tuning, or only data parallelism (DP)?
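One note on the OOM above: plain data parallelism (DP/DDP) replicates the full model on every GPU, so two 46GB L20s do not lower the per-GPU memory footprint; memory per card only drops with sharded training such as PyTorch FSDP or DeepSpeed ZeRO. The snippet below is a generic PyTorch FSDP sketch under that assumption, not the launch code shipped with this repository.

```python
# Generic sketch: shard parameters/gradients/optimizer state across GPUs with
# PyTorch FSDP, so each of the two 46GB cards holds only part of the model.
# Launch with torchrun so one process is created per GPU.
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_for_sharded_training(model):
    # Assumes the process group is set up by torchrun / torch.distributed.
    if not dist.is_initialized():
        dist.init_process_group(backend="nccl")
    torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())
    # Each rank keeps only a shard of the parameters outside forward/backward.
    return FSDP(model.cuda())
```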