Closed LZY-the-boys closed 9 months ago
Hi,
Thanks for your interest in our work!
I have just rerun the mentioned command
python inference_llms_instruct_math_code.py --dataset_name gsm8k --finetuned_model_name WizardMath-7B-V1.0 --tensor_parallel_size 1 --weight_mask_rate 0.9 --use_weight_rescale
and it works well for me. I got an accuracy of 50.42.
To identify the issues, could you please run
python inference_llms_instruct_math_code.py --dataset_name gsm8k --finetuned_model_name WizardMath-7B-V1.0 --tensor_parallel_size 1 --weight_mask_rate 0.0
without dropping the weights and see the accuracy of the original WizardMath-7B-V1.0 model? I got 55.34 accuracy and you can compare with this result to ensure your inference process is right.
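For context, the --weight_mask_rate / --use_weight_rescale flags correspond to randomly dropping a fraction of the fine-tuned delta parameters and rescaling the survivors. This is a minimal sketch of that idea, assuming the standard formulation (drop each entry with probability p, divide the rest by 1 - p to preserve the expected value); the function name and NumPy setup here are illustrative, not the repository's actual implementation:

```python
import numpy as np

def mask_and_rescale(delta, mask_rate, use_rescale=True, seed=0):
    # Hypothetical sketch: zero out `mask_rate` of the delta entries at
    # random, then optionally rescale the survivors by 1 / (1 - mask_rate)
    # so the expected magnitude of the delta is unchanged.
    rng = np.random.default_rng(seed)
    mask = rng.random(delta.shape) < mask_rate  # True = dropped entry
    masked = np.where(mask, 0.0, delta)
    if use_rescale:
        masked = masked / (1.0 - mask_rate)
    return masked
```

With mask_rate=0.9 and rescaling, roughly 10% of entries survive, each scaled by 10x, which is why the masked model can still recover most of the original accuracy.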
Thanks for your help! I have run with --weight_mask_rate 0.0
and got acc=0.5534495830174374
. However, I just cannot get --weight_mask_rate 0.9
to work, whether with rescale or not.
Could you please check the versions of the other required dependencies, such as PyTorch (2.0.1) and transformers (4.33.1)? The problem you mention is a bit strange, as --weight_mask_rate 0.9
works for me.
If the environments are also the same, I suggest running experiments while gradually increasing weight_mask_rate
through values like 0.1, 0.4, 0.7, and 0.9. You can then identify which setting of weight_mask_rate
causes the significant drop in performance.
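The suggested sweep can be scripted; this is a hedged sketch that just builds one command line per mask rate using the flags shown in the commands above (run each via subprocess, or print and launch them manually):

```python
def build_sweep_commands(rates, use_rescale=True):
    # Build one CLI invocation per mask rate, reusing the flags from the
    # commands above, to locate where accuracy starts to collapse.
    commands = []
    for rate in rates:
        cmd = [
            "python", "inference_llms_instruct_math_code.py",
            "--dataset_name", "gsm8k",
            "--finetuned_model_name", "WizardMath-7B-V1.0",
            "--tensor_parallel_size", "1",
            "--weight_mask_rate", str(rate),
        ]
        if use_rescale:
            cmd.append("--use_weight_rescale")
        commands.append(cmd)
    return commands

commands = build_sweep_commands([0.1, 0.4, 0.7, 0.9])
```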
Please feel free to ask when you finish running the above experiments.
Closing this issue now.
Please feel free to reopen it if there are any further questions.
The generated texts are all
''
(using vllm==0.1.4).
I debugged the code and found that it may be caused by
temperature=0.0
(greedy decoding). So I increased the temperature
to 0.01 and got garbled output. Can you help me figure this out?
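For intuition on why a tiny temperature behaves almost like greedy decoding: temperature scaling divides the logits before the softmax, so as the temperature approaches zero the distribution collapses onto the argmax token. This is a generic illustration of that math, not the vLLM code path itself:

```python
import math

def softmax_with_temperature(logits, temperature):
    # Scale logits by 1/temperature, then apply a numerically stable
    # softmax; small temperatures sharpen the distribution toward argmax.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

# At temperature 0.01, even a modest logit gap makes the top token
# take essentially all of the probability mass.
probs = softmax_with_temperature([2.0, 1.0, 0.5], 0.01)
```

So temperature=0.01 should sample almost identically to greedy decoding; if the output is empty at 0.0 but garbled at 0.01, the problem is more likely in the sampler setup or the vllm version than in the temperature value itself.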