Hello! I tried the Visual Instruction Tuning (for viscot-13b-336) as described in readme.md, but it failed with a CUDA out-of-memory error. I used 8 A100 GPUs (80 GB each). Does this mean that more than 8 A100s are needed to train viscot-13b-336, or is there a bug?
In addition, your readme.md says that 8 A100 GPUs with 80 GB of memory are needed, but your paper says "All models are trained using 32 × A100s." Am I misunderstanding something?
Thank you for your wonderful paper!