hiyouga / LLaMA-Factory

Unify Efficient Fine-Tuning of 100+ LLMs
Apache License 2.0

Running the PPO example fails with: value should be one of int, float, str, bool, or torch.Tensor #4458

Closed xudong2019 closed 2 days ago

xudong2019 commented 4 days ago

Reminder

System Info

Reproduction

Run on Colab:

!llamafactory-cli train examples/train_lora/llama3_lora_reward.yaml  # runs normally
!llamafactory-cli train examples/train_lora/llama3_lora_ppo.yaml  # this step raises the error


Traceback (most recent call last):
  File "/usr/local/bin/llamafactory-cli", line 8, in <module>
    sys.exit(main())
  File "/content/drive/My Drive/llama-factory-new/LLaMA-Factory/src/llamafactory/cli.py", line 110, in main
    run_exp()
  File "/content/drive/My Drive/llama-factory-new/LLaMA-Factory/src/llamafactory/train/tuner.py", line 54, in run_exp
    run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/content/drive/My Drive/llama-factory-new/LLaMA-Factory/src/llamafactory/train/ppo/workflow.py", line 58, in run_ppo
    ppo_trainer = CustomPPOTrainer(
  File "/content/drive/My Drive/llama-factory-new/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 118, in __init__
    PPOTrainer.__init__(
  File "/usr/local/lib/python3.10/dist-packages/trl/trainer/ppo_trainer.py", line 227, in __init__
    self.accelerator.init_trackers(
  File "/usr/local/lib/python3.10/dist-packages/accelerate/accelerator.py", line 685, in _inner
    return PartialState().on_main_process(function)(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/accelerator.py", line 2586, in init_trackers
    tracker.store_init_configuration(config)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/tracking.py", line 79, in execute_on_main_process
    return PartialState().on_main_process(function)(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/tracking.py", line 211, in store_init_configuration
    self.writer.add_hparams(values, metric_dict={})
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/tensorboard/writer.py", line 341, in add_hparams
    exp, ssi, sei = hparams(hparam_dict, metric_dict, hparam_domain_discrete)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/tensorboard/summary.py", line 316, in hparams
    raise ValueError(
ValueError: value should be one of int, float, str, bool, or torch.Tensor

Expected behavior

No response

Others

No response

yblir commented 3 days ago

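I ran into the same problem. Commenting out the following part of src/llamafactory/train/ppo/trainer.py lets training run normally. It appears to be a configuration parameter related to multi-GPU data parallelism; I don't know whether commenting it out has any other effects.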

(screenshot: 2024-06-26 08-39-19)

The original error location is as follows: (screenshot: 2024-06-26 08-43-15)
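The traceback in the original report ends in TensorBoard's `hparams` check, reached through `init_trackers` and `add_hparams`. That suggests at least one value in the config handed to the tracker is not an int, float, str, bool, or torch.Tensor. As a rough sketch of a less invasive alternative to commenting code out, the config could be sanitized before it reaches the tracker. The helper name `sanitize_hparams` and the example keys below are hypothetical and not taken from LLaMA-Factory, and this is not necessarily the fix the maintainers adopted:

```python
from typing import Any, Dict

# Scalar types named in the ValueError message. torch.Tensor is left out
# so this sketch runs without torch installed (an assumption of the sketch).
ALLOWED_TYPES = (int, float, str, bool)


def sanitize_hparams(config: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: keep plain scalars, stringify everything else."""
    clean: Dict[str, Any] = {}
    for key, value in config.items():
        if isinstance(value, ALLOWED_TYPES):
            clean[key] = value
        else:
            # Lists, dicts, None, etc. are converted to their string form.
            clean[key] = str(value)
    return clean


# Illustrative config only; these keys are made up for the example.
example_config = {
    "learning_rate": 1e-5,
    "batch_size": 8,
    "target_modules": ["q_proj", "v_proj"],
    "deepspeed_plugin": None,
}

clean_config = sanitize_hparams(example_config)
```

Under that assumption, a dictionary like `clean_config` contains only values that pass the type check named in the error message, at the cost of logging non-scalar settings as strings.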