stanfordnlp / pyreft

ReFT: Representation Finetuning for Language Models
https://arxiv.org/abs/2404.03592
Apache License 2.0

Title: Fix: Shape Mismatch during Left Padding Adjustment in compute_metrics (Generated by Ana - AI SDE) #89

Closed ana-ai-sde closed 1 month ago

ana-ai-sde commented 1 month ago

Description:

Fix for Issue 88

This pull request addresses a RuntimeError caused by a shape mismatch during the left padding adjustment in the compute_metrics function of examples/loreft/compute_metrics.py. The issue arises when the left_padding tensor is empty, leading to an incompatible broadcasting operation with the intervention_locations tensor.

This patch was generated by Ana - AI SDE, an AI-powered software development assistant.

The fix introduces a check for the presence of elements in left_padding. If left_padding is empty, a warning message is printed, and the adjustment is skipped. This ensures the compatibility of tensor shapes and prevents the RuntimeError.
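
For readers without the diff at hand, the guard described above might look roughly like the following. This is a minimal sketch, not the actual patch: the function name `adjust_for_left_padding`, the tensor shapes, and the warning text are assumptions made for illustration, and the real code in examples/loreft/compute_metrics.py may differ.

```python
import torch


def adjust_for_left_padding(intervention_locations, left_padding):
    """Shift intervention positions by each example's left-padding offset.

    Hypothetical helper, for illustration only.
    intervention_locations: assumed shape [num_interventions, batch_size, num_positions]
    left_padding: assumed 1-D tensor with one offset per example; may be empty
    """
    # Guard: with no padding offsets there is nothing to broadcast,
    # so warn and return the locations unchanged.
    if left_padding.numel() == 0:
        print("Warning: left_padding is empty; skipping left padding adjustment.")
        return intervention_locations

    # Reshape to [1, batch_size, 1] so the offsets broadcast across
    # the intervention and position dimensions.
    return intervention_locations + left_padding.reshape(1, -1, 1)


# Usage with a batch of one example and two interventions.
locations = torch.tensor([[[0, 1, 2]], [[0, 1, 2]]])  # shape [2, 1, 3]
shifted = adjust_for_left_padding(locations, torch.tensor([4]))
skipped = adjust_for_left_padding(locations, torch.tensor([], dtype=torch.long))
```

Returning the input unchanged in the empty case keeps the rest of the evaluation loop running, at the cost of leaving the positions unadjusted, which is why the warning is worth keeping visible.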

This patch improves the robustness of the compute_metrics function by handling edge cases related to left padding.

frankaging commented 1 month ago

@ana-ai-sde Hey Ana, thanks for the PR. Have you run any tests on this change? Thanks!

d4rk-lucif3r commented 1 month ago

Hi @frankaging,

I am the author of Ana - AI SDE.

Yes, we validated the change. The test results weren't included because we are still working on the pull request template.

We ran the same command as mentioned in Issue 88.

Command:

python train.py -task gsm8k -model /home/Meta-Llama-3-8B-Instruct-function-calling-json-mode -seed 42 -l all -r 4 -p f7+l7 -e 12 -lr 9e-4 -type NodireftIntervention -gradient_accumulation_steps 4 -batch_size 1 -eval_batch_size 1 --dropout 0.05 --test_split validation --use_normalized_template --greedy_decoding --warmup_ratio 0.00 --weight_decay 0.06

Output Without Fix:

[image not preserved]

Output With Fix:

[image not preserved]

The runtime error was fixed, the code proceeded to the next steps, and eventually training finished.

If you have any doubts or questions, feel free to ask.

Thanks,
Arsh Anwar

aryamanarora commented 1 month ago

LGTM!