alibaba / FederatedScope

An easy-to-use federated learning platform
https://www.federatedscope.io
Apache License 2.0

How to use multiple GPUs to finetune Llama2 #751

Open Qwtdgh opened 7 months ago

Qwtdgh commented 7 months ago

Hi, I have a question about how to finetune Llama2 by using multi-GPU.

env: 4*A100 40G
yaml: llm/baseline/exp_yaml/dolly_lda/dolly_federate.yaml

The YAML looks as follows:

```yaml
use_gpu: True
device: 0
early_stop:
  patience: 0
federate:
  mode: standalone
  client_num: 3
  total_round_num: 500
```

A single A100 is not enough. How can I use the other three GPUs to finetune my model?

I tried setting train.data_para_dids=[0, 1, 2, 3], but it does not work. I think the reason is that cfg.device only specifies a single GPU.

Looking forward to your reply!

rayrayraykk commented 7 months ago

data_para_dids is for data parallelism. You should use DeepSpeed instead. Please use the following configs to set up DeepSpeed (for other options, please refer to https://github.com/alibaba/FederatedScope/blob/llm/federatedscope/core/configs/cfg_llm.py):

```python
# ---------------------------------------------------------------------- #
# Deepspeed related options
# ---------------------------------------------------------------------- #
cfg.llm.deepspeed = CN()
cfg.llm.deepspeed.use = False
cfg.llm.deepspeed.ds_config = ''  # path to a DeepSpeed-compatible JSON config
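```

To enable it, a minimal sketch of the corresponding block in the experiment YAML might look like the following. The option names mirror the config definitions above; the ds_config filename here is a hypothetical placeholder, and it should point at a standard DeepSpeed JSON config (for example, one that sets zero_optimization and fp16):

```yaml
llm:
  deepspeed:
    use: True                    # enable the DeepSpeed code path
    ds_config: 'ds_config.json'  # hypothetical path to your DeepSpeed JSON config
```

How the run is then launched across the four GPUs depends on your DeepSpeed setup; the deepspeed launcher is the usual route, but please check the repo's LLM docs for the exact invocation.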
Qwtdgh commented 7 months ago

Thanks for your reply! So I don't need to set cfg.device? Or does device have no effect when deepspeed is enabled?