nebuly-ai / optimate

A collection of libraries to optimise AI model performances
https://www.nebuly.com/
Apache License 2.0
8.37k stars 639 forks

[Chatllama]: How to train llama-7B with multiple GPU? #252

Open bnuzhanyu opened 1 year ago

bnuzhanyu commented 1 year ago

I downloaded the llama-7B model, which has MP=1. I modified the config: `actor_config: device: "cuda:1,2,5,7" model: "llama-7B"`
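Laid out as YAML, the modification described above would look roughly like this (keys and values are taken from the message as written; the exact schema of ChatLLaMA's `config.yaml` may differ, and the loader may expect a single device string rather than a comma-separated list):

```yaml
actor_config:
  device: "cuda:1,2,5,7"   # as written in the message above
  model: "llama-7B"
```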

I tried `torchrun --nproc_per_node=4 artifacts/main.py artifacts/config/config.yaml --type ACTOR` and got: `AssertionError: Loading a checkpoint for MP=1 but world size is 4.`

I also tried `python artifacts/main.py artifacts/config/config.yaml`, but it seems to use only cuda:0 and runs out of CUDA memory.
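For context, the assertion text matches the check in the original LLaMA checkpoint loader: the number of launched processes (the world size) must equal the number of checkpoint shards, which is the model-parallel (MP) degree the weights were saved with. A minimal sketch of that check, with `check_mp` as a hypothetical helper name (the real loader's details may differ):

```python
from pathlib import Path


def check_mp(ckpt_dir: str, world_size: int) -> None:
    """Verify that the launch world size matches the checkpoint's MP degree.

    The LLaMA-style loader infers MP from the number of *.pth shards in the
    checkpoint directory (llama-7B ships a single shard, so MP=1). Launching
    with torchrun --nproc_per_node=4 gives world_size=4, which trips this
    assertion. Sketch based on the error message in this issue.
    """
    checkpoints = sorted(Path(ckpt_dir).glob("*.pth"))
    mp = len(checkpoints)
    assert world_size == mp, (
        f"Loading a checkpoint for MP={mp} but world size is {world_size}"
    )
```

This is why `--nproc_per_node=4` fails against an MP=1 checkpoint: the shard count and the process count must agree, independent of how many GPUs are listed in the config.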

So, I have two questions:

  1. How can I train models on multiple GPUs?
  2. Can llama-7B be trained on multiple GPUs?
bimalm commented 1 year ago

https://github.com/facebookresearch/llama/issues/55

PierpaoloSorbellini commented 1 year ago

Hi @bnuzhanyu @bimalm, vanilla LLaMA is for inference only. We have reimplemented it to make it suitable for training. We are working on stabilising the distributed training and will keep you updated when a new stable release is available.