bigcode-project / bigcode-evaluation-harness

A framework for the evaluation of autoregressive code generation language models.
Apache License 2.0

Adding support for transformers>=4.40.2 to avoid crash with mbpp #244

Closed: meher-m closed this 1 week ago

meher-m commented 3 weeks ago

Description

Change

Testing

After upgrading the transformers package to version 4.40.2, the following command succeeds:

accelerate launch main.py \
  --model ~/models/hf-code-llama-7b-instruct \
  --tasks mbpp \
  --max_length_generation 512 \
  --allow_code_execution \
  --precision bf16 \
  --do_sample False

Verified that the evaluation results match those from running mbpp before the package upgrade. Both runs report:

"mbpp": {
    "pass@1": 0.388
}

Further, the two files below contain the generations produced by the above command before and after the package upgrade. I asserted that the generations were equivalent for each task_id.
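For anyone wanting to repeat the equivalence check, here is a minimal sketch of how the two generation files could be compared. It assumes each file is a JSON list with one entry per task, where each entry is the list of generated samples for that task, and it uses the position in that list to stand in for the task_id. The file names and both helper functions are hypothetical, not part of the harness.

```python
import json
from pathlib import Path


def load_generations(path):
    """Load a generations file (assumed layout: JSON list, one list of samples per task)."""
    return json.loads(Path(path).read_text())


def diff_generations(before, after):
    """Return the task indices whose generated samples differ between two runs."""
    if len(before) != len(after):
        raise ValueError(f"task count mismatch: {len(before)} vs {len(after)}")
    return [i for i, (b, a) in enumerate(zip(before, after)) if b != a]


# Example usage (file names are hypothetical):
# before = load_generations("generations_before.json")
# after = load_generations("generations_after.json")
# assert diff_generations(before, after) == []
```

An empty list means every task produced identical generations in both runs. Any returned index points at a task worth inspecting by hand.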

meher-m commented 2 weeks ago

cc @loubnabnl Please let me know what you think of this change!

meher-m commented 1 week ago

@loubnabnl Yes! I have added the change.