**Open** · 2000ZRL opened this issue 2 months ago
What excellent work! Could you please share the GPU requirements (number and memory) for pretraining and instruction tuning? Thanks.

---

Hello @2000ZRL, thank you for your interest in our work.

For the video-text datasets:

- Llama 2: an A100 (80 GB) with batch size 4, or a V100 with batch size 1 (minimum GPU memory: 32 GB).
- Mistral: an A100 (80 GB) only, with batch size 1 (minimum GPU memory: 80 GB).

---

Thanks for your reply! Could you please also share the training time for the different model variants, e.g., Llama 2 and Mistral?
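
---

For readers working out how these hardware requirements translate into a training configuration, here is a minimal sketch of picking a per-device batch size from the available GPU memory and using gradient accumulation on a V100 to keep the same effective batch as the A100 setup. This is not the repository's actual launch script: the Hugging Face `TrainingArguments` usage, the output path, and the accumulation scheme are illustrative assumptions.

```python
# Minimal sketch (not the repo's launch script): choose a per-device
# batch size from available GPU memory, per the Llama 2 numbers above,
# and accumulate gradients on a V100 to match the A100 effective batch.
import torch
from transformers import TrainingArguments

gpu_mem_gb = torch.cuda.get_device_properties(0).total_memory / 1e9

if gpu_mem_gb >= 80:    # A100 80 GB: batch size 4 fits for Llama 2
    per_device_bs, grad_accum = 4, 1
elif gpu_mem_gb >= 32:  # V100 32 GB: batch size 1, accumulate to 4
    per_device_bs, grad_accum = 1, 4
else:
    raise RuntimeError("At least 32 GB of GPU memory is required for Llama 2.")

args = TrainingArguments(
    output_dir="./checkpoints",                # illustrative path
    per_device_train_batch_size=per_device_bs,
    gradient_accumulation_steps=grad_accum,    # effective batch stays at 4
    bf16=gpu_mem_gb >= 80,                     # A100 supports bfloat16
    fp16=gpu_mem_gb < 80,                      # V100 falls back to fp16
)
```

Gradient accumulation trades steps for memory: the V100 run sees the same effective batch of 4, at roughly four times the optimizer-step latency.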