Closed: edgan8 closed this issue 5 months ago.
Hi, I think it's better to have the generation fail so the user increases max_length to fit all the prompts, rather than silently truncating and getting lower scores. So you should use a larger max_length for MBPP, such as 1024.
Hi, even when I use 1024 as the max length, it still fails and shows exactly the same error message.
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 accelerate launch main.py --model meta-llama/Llama-2-7b-hf --tasks mbpp --precision bf16 --max_length_generation 1024 --allow_code_execution
The error message:
ValueError: Input length of input_ids is 1024, but `max_length` is set to 1024. This can lead to unexpected behavior. You should consider increasing `max_length` or, better yet, setting `max_new_tokens`.
How did the input length end up increasing along with the max length?
Apparently you need max_length = 2048. This is unreasonably high, esp. since some base models may not even support such a long context.
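For reference, the budget arithmetic looks roughly like the sketch below. It assumes the harness pads each prompt up to --max_length_generation (which would explain why the input length tracks max_length); max_length counts the prompt plus the generated tokens, while max_new_tokens counts only the new tokens.

```python
# Back-of-the-envelope sketch of the generation budget. prompt_len assumes
# the harness pads prompts up to --max_length_generation.
prompt_len = 1024                  # padded MBPP prompt
max_length = 1024                  # total budget: prompt + generated tokens
print(max_length - prompt_len)     # 0 tokens left to generate -> ValueError

max_length = 2048
print(max_length - prompt_len)     # 1024 tokens of headroom, but only if the
                                   # model actually supports a 2048-token context
```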
@loubnabnl what do you think about this PR to catch the exception and turn it into a warning: https://github.com/bigcode-project/bigcode-evaluation-harness/pull/244
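Roughly, the idea is something like the sketch below. This is not the actual diff in the PR, and generate_or_warn is a hypothetical helper name; it just illustrates catching the ValueError raised by transformers >= 4.38 and downgrading it to a warning.

```python
import warnings

def generate_or_warn(model, inputs, max_length, **gen_kwargs):
    """Run model.generate, but downgrade the 'prompt already fills max_length'
    ValueError raised by transformers >= 4.38 to a warning instead of crashing."""
    try:
        return model.generate(**inputs, max_length=max_length, **gen_kwargs)
    except ValueError as err:
        warnings.warn(
            f"Generation skipped: {err}. "
            "Consider increasing --max_length_generation."
        )
        # Fall back to echoing the prompt so downstream code still gets tensors.
        return inputs["input_ids"]
```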
Yes, that works. I approved the PR.
Closing as the PR was merged; thanks for flagging the issue.
Ever since transformers 4.38, the library will raise an exception if max_length is set to a value that doesn't include the input size. This means that bigcode will fail when running mbpp. However, we need to be able to set max_length to replicate MBPP results effectively. For example, the following code throws an exception with transformers 4.38 but not with 4.37.2:
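The original snippet isn't reproduced here; the following is a minimal, hypothetical repro using gpt2 as a small stand-in for Llama-2-7b and a 128-token budget instead of 1024:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token   # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Pad the prompt up to the full generation budget, mimicking a run with
# --max_length_generation 128 (the issue used 1024; 128 keeps the example small).
inputs = tokenizer("def add(a, b):", padding="max_length", truncation=True,
                   max_length=128, return_tensors="pt")

# input_ids already has 128 tokens, so max_length=128 leaves no room to
# generate: transformers 4.38 raises the ValueError quoted above, while
# 4.37.2 only warned.
model.generate(**inputs, max_length=128)
```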