Open CloudedLeopard17 opened 2 years ago
I don't think you will be able to do this on a 24 GB GPU. I am guessing you are using an RTX 3090? You can give it a try.
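To see why a 24 GB card is tight, here is a back-of-envelope memory estimate (a rough sketch; the 2 bytes/param for fp16 inference and ~16 bytes/param for mixed-precision Adam training are common rules of thumb, and real usage also depends on activations, batch size, and sequence length):

```python
def model_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Rough weight/state memory for a model, in GiB."""
    return n_params * bytes_per_param / 1024**3

N = 7.1e9  # ~7.1B parameters (BLOOM 7.1B)

# Inference in fp16: ~2 bytes per parameter for the weights alone.
inference_gb = model_memory_gb(N, 2)

# Mixed-precision Adam training: fp16 weights + fp16 grads
# + fp32 master weights + fp32 Adam moments ~= 16 bytes per parameter.
training_gb = model_memory_gb(N, 16)

print(f"inference ~{inference_gb:.1f} GiB, training ~{training_gb:.1f} GiB")
```

So inference weights alone fit in 24 GB, but naive training state is several times larger, which is why sharding or parallelism across GPUs is needed.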
I am using 2x A5000 GPUs. I was able to train the T5-XL model using tensor parallelism.
Did you use Megatron? Or does DeepSpeed have support for tensor parallelism?
DeepSpeed supports model parallelism (MP) to fit large models that would otherwise not fit in GPU memory.
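For training, the memory savings in DeepSpeed itself mostly come from ZeRO sharding and offload rather than Megatron-style tensor parallelism. A minimal ZeRO-3 config sketch might look like this (key names are from DeepSpeed's config schema; batch size and offload targets here are placeholder choices, not a tested recipe):

```json
{
  "train_micro_batch_size_per_gpu": 1,
  "fp16": { "enabled": true },
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": { "device": "cpu" },
    "offload_param": { "device": "cpu" }
  }
}
```

ZeRO-3 partitions parameters, gradients, and optimizer states across GPUs, and the offload options push state to CPU memory, which is how models that exceed a single GPU's memory can still be trained.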
Hi,
Thanks for the great work. I was able to run inference on the BLOOM 7.1B model within 24 GB of GPU memory. Can we train the BLOOM models using tensor parallelism and efficient fused CUDA kernels? I don't have access to high-memory GPUs.