Hi,
Is it possible to run Llama training on a single GPU for a test? I have tried a smaller sequence length and a batch size of 1, but because the Accelerate config uses `distributed_type: DEEPSPEED`, it seems a multi-node configuration is required. I cannot find any option other than DEEPSPEED. Any idea about that?
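For reference, this is roughly the single-machine, single-process config I would expect to use (a sketch only; the key names follow Accelerate's standard YAML schema, but the `deepspeed_config` values are my assumptions for a small test run, not a verified setup):

```yaml
# Sketch of a single-GPU Accelerate config with DeepSpeed (assumed values).
compute_environment: LOCAL_MACHINE
distributed_type: DEEPSPEED
deepspeed_config:
  zero_stage: 2                  # ZeRO stage 2; stage 3 should also run on one GPU
  offload_optimizer_device: cpu  # offload optimizer state to save GPU memory
  gradient_accumulation_steps: 1
  gradient_clipping: 1.0
  zero3_init_flag: false
machine_rank: 0
num_machines: 1                  # single node
num_processes: 1                 # single GPU
mixed_precision: bf16
use_cpu: false
```

I would then launch with something like `accelerate launch --config_file single_gpu_ds.yaml train.py` (file names are placeholders). Is a setup along these lines supposed to work on one GPU, or is DEEPSPEED inherently multi-node here?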