shibing624 / MedicalGPT

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing continued pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.
Apache License 2.0

Training with reward_modeling.py #364

Closed cqray1990 closed 2 months ago

cqray1990 commented 2 months ago

Describe the bug

Please provide a clear and concise description of what the bug is. If applicable, add screenshots to help explain your problem, especially for visualization related problems.

```shell
CUDA_VISIBLE_DEVICES=0,1 python reward_modeling.py \
    --model_type auto \
    --model_name_or_path Baichuan-13B-Chat \
    --train_file_dir /reward \
    --validation_file_dir /reward \
    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 4 \
    --do_train \
    --use_peft True \
    --seed 42 \
    --max_train_samples 1000 \
    --max_eval_samples 10 \
    --num_train_epochs 1 \
    --learning_rate 2e-5 \
    --warmup_ratio 0.05 \
    --weight_decay 0.001 \
    --logging_strategy steps \
    --logging_steps 10 \
    --eval_steps 50 \
    --evaluation_strategy steps \
    --save_steps 500 \
    --save_strategy steps \
    --save_total_limit 3 \
    --max_source_length 256 \
    --max_target_length 256 \
    --output_dir outputs-rm-baichuan-v1 \
    --overwrite_output_dir \
    --ddp_timeout 30000 \
    --logging_first_step True \
    --target_modules all \
    --lora_rank 8 \
    --lora_alpha 16 \
    --lora_dropout 0.05 \
    --torch_dtype float32 \
    --device_map auto \
    --report_to tensorboard \
    --ddp_find_unused_parameters False \
    --remove_unused_columns False \
    --gradient_checkpointing True
```

```
ValueError: Unrecognized configuration class <class 'transformers_modules.Baichuan-13B-Chat.configuration_baichuan.BaichuanConfig'> for this kind of AutoModel: AutoModelForSequenceClassification.
```
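For context, the `Auto*` factory classes in transformers resolve a model class by looking the config class up in a hard-coded mapping; a config loaded via `trust_remote_code` (such as `BaichuanConfig` here) is not registered in the mapping used by `AutoModelForSequenceClassification`, hence the `ValueError`. A toy sketch of that lookup pattern (the class and dict names below are illustrative, not the actual transformers internals):

```python
# Toy illustration of an Auto* factory: a registry from config class
# to model class. Unregistered configs fail with a ValueError, which
# mirrors the error reported in this issue.
class LlamaConfig: ...
class BaichuanConfig: ...  # stands in for a custom remote-code config


class LlamaForSequenceClassification: ...


# The factory only knows about config classes registered in its mapping.
SEQ_CLS_MAPPING = {LlamaConfig: LlamaForSequenceClassification}


def auto_model_for_sequence_classification(config):
    """Resolve a model class for `config`, like AutoModelForSequenceClassification."""
    try:
        return SEQ_CLS_MAPPING[type(config)]()
    except KeyError:
        raise ValueError(
            f"Unrecognized configuration class {type(config)} "
            "for this kind of AutoModel"
        )


# A registered config resolves to its model class; an unregistered one raises.
model = auto_model_for_sequence_classification(LlamaConfig())
```

Calling `auto_model_for_sequence_classification(BaichuanConfig())` raises the `ValueError`, because no sequence-classification model class is registered for that config.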

shibing624 commented 2 months ago

AutoModelForSequenceClassification does not support Baichuan; use a Llama model instead.
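A minimal sketch of the suggested change, assuming the script's `--model_type` flag selects the architecture as in the original command. The checkpoint path `path/to/llama-base` is a placeholder for any local Llama-family checkpoint, not a path from this issue:

```shell
# Reward-model training with a Llama-family base instead of Baichuan.
# path/to/llama-base is a placeholder; keep the remaining flags from
# the original command as needed.
CUDA_VISIBLE_DEVICES=0,1 python reward_modeling.py \
    --model_type llama \
    --model_name_or_path path/to/llama-base \
    --train_file_dir /reward \
    --validation_file_dir /reward \
    --per_device_train_batch_size 4 \
    --do_train \
    --use_peft True \
    --output_dir outputs-rm-llama-v1
```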