MARIO-Math-Reasoning / Super_MARIO


Error occurred during SFT training #20

Open jt4n opened 1 day ago

jt4n commented 1 day ago

Hi, I'm reproducing your work.

When I use the round3_training_data.json data to SFT the deepseek-math-7b-base-value_model (after adding the value head), I get the error below:

File "/home/workspace/LLaMA-Factory-0.6.1/LLaMA-Factory/src/llmtuner/train/sft/trainer.py", line 138, in compute_loss
    lm_logits, loss, values = model(**inputs, output_hidden_states=True, return_dict=True)
ValueError: not enough values to unpack (expected 3, got 2)

I added the modified compute_loss function you provided on this page to LLaMA-Factory's CustomSeq2SeqTrainer class, overriding the original compute_loss of the transformers Trainer class.
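Roughly, the override looks like this (only the unpacking line is copied from the traceback above; the surrounding scaffolding is paraphrased and not the repo's exact code). The three-way unpack only succeeds when the model's forward returns (lm_logits, loss, values), as trl's value-head wrapper does; a plain causal LM returns fewer values, which is exactly the ValueError shown:

from transformers import Seq2SeqTrainer

class CustomSeq2SeqTrainer(Seq2SeqTrainer):
    def compute_loss(self, model, inputs, return_outputs=False):
        # Expects a value-head model whose forward returns a 3-tuple
        # (lm_logits, loss, values). A bare AutoModelForCausalLM returns
        # only two values here, raising
        # "not enough values to unpack (expected 3, got 2)".
        lm_logits, loss, values = model(**inputs, output_hidden_states=True, return_dict=True)
        return (loss, (lm_logits, values)) if return_outputs else loss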

The training command is:

CUDA_VISIBLE_DEVICES=6 python src/train_bash.py \
    --stage sft \
    --do_train \
    --model_name_or_path /home/workspace/ModelWeights/deepseek-math-7b-base-value_model \
    --dataset alpha_math_round3 \
    --template vanilla \
    --finetuning_type full \
    --output_dir /home/workspace/trained_models/alpha-math-7b-value-model-new \
    --overwrite_cache \
    --cutoff_len 1024 \
    --preprocessing_num_workers 16 \
    --per_device_train_batch_size 8 \
    --gradient_accumulation_steps 128 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 1000 \
    --learning_rate 4e-5 \
    --num_train_epochs 10.0 \
    --plot_loss \
    --fp16

Did I make a mistake somewhere? How can I solve this?

Chen-GX commented 1 day ago

Thank you for your interest in our work. You should load the model with AutoModelForCausalLMWithValueHead from trl for training to avoid this error. You can refer to the Model Loader section in our implementation_details.md for more details. I have updated that file; thanks for the good question.
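In short, the loading step looks roughly like the following sketch (the path is the one from your script; this is not the exact Model Loader code, so refer to implementation_details.md for the authoritative version):

from transformers import AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead

model_path = "/home/workspace/ModelWeights/deepseek-math-7b-base-value_model"  # path from your script
tokenizer = AutoTokenizer.from_pretrained(model_path)
# Wrap the causal LM with trl's value head so that forward() returns
# (lm_logits, loss, values) and the three-way unpack in compute_loss works.
model = AutoModelForCausalLMWithValueHead.from_pretrained(model_path)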

Chen-GX commented 1 day ago

Additionally, you should use bf16 (--bf16) instead of fp16 (--fp16) in your training script to avoid potential errors.

jt4n commented 1 day ago

Thanks a lot! I was able to start the training process by referring to the Model Loader section.