l294265421 / alpaca-rlhf

Finetuning LLaMA with RLHF (Reinforcement Learning with Human Feedback) based on DeepSpeed Chat
https://88aeeb3aef5040507e.gradio.live/
MIT License
103 stars, 13 forks

how to run it, need more details #7

Open SeekPoint opened 1 year ago

SeekPoint commented 1 year ago

and how to install alpaca-rlhf

l294265421 commented 1 year ago

> and how to install alpaca-rlhf

  1. Download this repo.
  2. Enter the ./alpaca_rlhf directory.
  3. Run the step1, step2, and step3 commands in the Step by Step section of the README.

SeekPoint commented 1 year ago

(gh_alpaca-rlhf) amd00@asus00:~/llm_dev/alpaca-rlhf$ sh run.sh --num_gpus 1 ./alpaca_rlhf/deepspeed_chat/training/step1_supervised_finetuning/main.py --sft_only_data_path MultiTurnAlpaca --data_output_path ./rlhf-tmp/ --model_name_or_path ~/hf_model/llama-7b-hf --per_device_train_batch_size 2 --per_device_eval_batch_size 2 --max_seq_len 128 --learning_rate 3e-4 --num_train_epochs 1 --gradient_accumulation_steps 8 --num_warmup_steps 100 --output_dir ./rlhf/actor --lora_dim 8 --lora_module_name q_proj,k_proj --only_optimize_lora --deepspeed --zero_stage 2
start 20230602162350--------------------------------------------------
[2023-06-02 16:23:51,869] [WARNING] [runner.py:191:fetch_hostfile] Unable to find hostfile, will proceed with training with local resources only.
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
  /home/amd00/anaconda3/envs/gh_alpaca-rlhf/bin/deepspeed:6 in <module>

    3 from deepspeed.launcher.runner import main
    4
    5 if __name__ == '__main__':
  ❱ 6     main()
    7

  /home/amd00/anaconda3/envs/gh_alpaca-rlhf/lib/python3.8/site-packages/deepspeed/launcher/runner.py:407 in main

    404         resource_pool = {}
    405         device_count = get_accelerator().device_count()
    406         if device_count == 0:
  ❱ 407             raise RuntimeError("Unable to proceed, no GPU resources available")
    408         resource_pool['localhost'] = device_count
    409         args.master_addr = "127.0.0.1"
    410         multi_node_exec = False
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: Unable to proceed, no GPU resources available
20230602162352
(gh_alpaca-rlhf) amd00@asus00:~/llm_dev/alpaca-rlhf$ nvidia-smi
Fri Jun 2 16:24:04 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.43.04    Driver Version: 515.43.04    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  N/A |
|  0%   45C    P8    18W / 350W |    768MiB / 24576MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1085      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A      1967      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A    259783      C   ...Speed-Chat/bin/python3.10      755MiB |
+-----------------------------------------------------------------------------+
(gh_alpaca-rlhf) amd00@asus00:~/llm_dev/alpaca-rlhf$

I have a single RTX 3090, and I changed --num_gpus to 1.
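
The traceback shows that the launcher gives up when `get_accelerator().device_count()` returns 0, even though `nvidia-smi` lists the GPU. One possible cause, which is an assumption and not something confirmed in this thread, is that the PyTorch build inside the `gh_alpaca-rlhf` environment cannot see CUDA (for example, a CPU-only wheel). A minimal sketch for checking this from inside the environment follows; the helper name `cuda_report` is hypothetical and not part of alpaca-rlhf or DeepSpeed.

```python
import importlib.util


def cuda_report():
    """Collect what PyTorch (if it is installed) can see of the GPU.

    Hypothetical diagnostic helper; not part of alpaca-rlhf or DeepSpeed.
    """
    report = {
        "torch_installed": False,
        "torch_version": None,
        "built_with_cuda": None,
        "cuda_available": False,
        "device_count": 0,
    }
    # Avoid an ImportError if PyTorch is missing from this environment.
    if importlib.util.find_spec("torch") is None:
        return report

    import torch

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # torch.version.cuda is None for CPU-only builds of PyTorch.
    report["built_with_cuda"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `built_with_cuda` prints `None` or `device_count` prints `0` while `nvidia-smi` still shows the card, the environment's PyTorch install (rather than the `--num_gpus` value) would be the first thing to look at.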