hiyouga / LLaMA-Factory

Unify Efficient Fine-Tuning of 100+ LLMs
Apache License 2.0

PPO merge fails #4609

Closed · luowei0701 closed this issue 1 hour ago

luowei0701 commented 5 days ago

Reminder

System Info

After PPO training I tried to merge the adapter with the base model:

```
06/28/2024 10:37:27 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
Traceback (most recent call last):
  File "/root/anaconda3/envs/llm/bin/llamafactory-cli", line 8, in <module>
    sys.exit(main())
  File "/root/workspace/project/llm/LLaMA-Factory/src/llamafactory/cli.py", line 87, in main
    export_model()
  File "/root/workspace/project/llm/LLaMA-Factory/src/llamafactory/train/tuner.py", line 73, in export_model
    model = load_model(tokenizer, model_args, finetuning_args)  # must after fixing tokenizer to resize vocab
  File "/root/workspace/project/llm/LLaMA-Factory/src/llamafactory/model/loader.py", line 160, in load_model
    model = init_adapter(config, model, model_args, finetuning_args, is_trainable)
  File "/root/workspace/project/llm/LLaMA-Factory/src/llamafactory/model/adapter.py", line 311, in init_adapter
    model = _setup_lora_tuning(
  File "/root/workspace/project/llm/LLaMA-Factory/src/llamafactory/model/adapter.py", line 191, in _setup_lora_tuning
    model: "LoraModel" = PeftModel.from_pretrained(model, adapter, **init_kwargs)
  File "/root/anaconda3/envs/llm/lib/python3.9/site-packages/peft/peft_model.py", line 430, in from_pretrained
    model.load_adapter(model_id, adapter_name, is_trainable=is_trainable, **kwargs)
  File "/root/anaconda3/envs/llm/lib/python3.9/site-packages/peft/peft_model.py", line 984, in load_adapter
    adapters_weights = load_peft_weights(model_id, device=torch_device, **hf_hub_download_kwargs)
  File "/root/anaconda3/envs/llm/lib/python3.9/site-packages/peft/utils/save_and_load.py", line 444, in load_peft_weights
    adapters_weights = safe_load_file(filename, device=device)
  File "/root/anaconda3/envs/llm/lib/python3.9/site-packages/safetensors/torch.py", line 311, in load_file
    with safe_open(filename, framework="pt", device=device) as f:
safetensors_rust.SafetensorError: Error while deserializing header: InvalidHeaderDeserialization
```
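The failure happens inside `safetensors` before PEFT touches any weights, so it can be reproduced in isolation by loading the adapter file directly. A minimal sketch, assuming the checkpoint directory from the Reproduction section below and the default PEFT adapter file name `adapter_model.safetensors`:

```python
# Minimal repro sketch: load the adapter weights directly with safetensors,
# bypassing PEFT entirely. If the saved file is corrupt, this raises the
# same InvalidHeaderDeserialization error as the export command does.
from safetensors.torch import load_file

# Assumed path: the checkpoint from the Reproduction config below.
adapter_file = "saves/llama3-8b/lora/ppo_fdc/checkpoint-160/adapter_model.safetensors"
weights = load_file(adapter_file, device="cpu")
print(f"loaded {len(weights)} tensors")
```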

Reproduction

```yaml
### model
model_name_or_path: model_zoos/shenzhi-wang/Llama3-8B-Chinese-Chat
adapter_name_or_path: saves/llama3-8b/lora/ppo_fdc/checkpoint-160
template: llama3
finetuning_type: lora

### export
export_dir: saves/llama3-8b/lora/ppo_fdc_model
export_size: 2
export_device: cpu
export_legacy_format: false
```
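For reference, the traceback shows the merge going through the `llamafactory-cli` export entry point, so with the config above saved to a YAML file (the file name here is illustrative) the invocation would look like:

```sh
llamafactory-cli export merge_ppo_lora.yaml
```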

Expected behavior

No response

Others

No response

luowei0701 commented 4 days ago

After testing, I found that the weights saved in a multi-GPU environment are corrupted, but I don't know why.
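One way to see what a multi-GPU save actually wrote is to inspect the raw safetensors header: the on-disk format is an 8-byte little-endian header length followed by that many bytes of JSON, and `InvalidHeaderDeserialization` means that JSON fails to parse. A diagnostic sketch, with the same assumed checkpoint path as above:

```python
# Diagnostic sketch: inspect the raw safetensors header of a suspect adapter file.
# safetensors format: first 8 bytes = little-endian u64 header size,
# followed by exactly that many bytes of JSON metadata.
import json
import struct

path = "saves/llama3-8b/lora/ppo_fdc/checkpoint-160/adapter_model.safetensors"

with open(path, "rb") as f:
    (header_size,) = struct.unpack("<Q", f.read(8))
    print(f"declared header size: {header_size} bytes")
    raw_header = f.read(header_size)

try:
    header = json.loads(raw_header)
    print(f"header OK, {len(header)} entries")
except json.JSONDecodeError as e:
    # A truncated or mangled save typically fails here, which is what
    # safetensors surfaces as InvalidHeaderDeserialization.
    print(f"header is not valid JSON: {e}")
```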

LXYTSOS commented 23 hours ago

I ran into the same problem as you: the model weights saved after PPO training are corrupted. I ran the same code on five machines and found that the environments that saved the weights correctly were on CUDA 11, while the ones with corrupted saves were on CUDA 12.

luowei0701 commented 10 hours ago

> I ran into the same problem as you: the model weights saved after PPO training are corrupted. I ran the same code on five machines and found that the environments that saved the weights correctly were on CUDA 11, while the ones with corrupted saves were on CUDA 12.

My CUDA version is 11.8, and it is the same in both my single-GPU and multi-GPU environments.

LXYTSOS commented 10 hours ago

PPO still has some issue here; it works for some people but not for others.

hiyouga commented 1 hour ago

Fixed.