Open mostafaelhoushi opened 4 months ago
Thanks for fixing it

> This is the message I am seeing as well in the logs when running humaneval against the llama2-7b-chat-hf model:
```
bigcode-evaluation-harness/bigcode_eval/utils.py:361: UserWarning: An error with the following message was thrown: Input length of input_ids is 1000, but `max_length` is set to 1000. This can lead to unexpected behavior. You should consider increasing `max_length` or, better yet, setting `max_new_tokens`.. Returning the input as the generation, for higher scores consider using a larger max_length
2024-07-23 11:50:32 EDT code_eval line: 74: [INFO] warnings.warn(f"An error with the following message was thrown: {e}. Returning the input as the generation, for higher scores consider using a larger max_length")
```
Adding more details for clarity, per the official API docs from HF: https://huggingface.co/docs/transformers/en/main_classes/text_generation
> `max_length` (`int`, *optional*, defaults to 20) — The maximum length the generated tokens can have. Corresponds to the length of the input prompt + `max_new_tokens`. Its effect is overridden by `max_new_tokens`, if also set.
>
> `max_new_tokens` (`int`, *optional*) — The maximum numbers of tokens to generate, ignoring the number of tokens in the prompt.
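To make the distinction concrete, here is a minimal standalone sketch; the model and prompt are placeholders chosen for illustration, not taken from the harness:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model for illustration; the thread above uses llama2-7b-chat-hf.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("def fibonacci(n):", return_tensors="pt")
prompt_len = inputs["input_ids"].shape[1]

# max_length budgets the prompt *and* the continuation together. If the prompt
# alone already reaches max_length (1000 vs. 1000 in the log above), there is
# no room left to generate and transformers emits the warning quoted earlier.
out_total = model.generate(**inputs, max_length=prompt_len)

# max_new_tokens budgets only the continuation, regardless of prompt length.
out_new = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out_new[0], skip_special_tokens=True))
```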
Thanks @kbmlcoding for approving. Still unable to merge the PR. Do we need another approval?
Cc @loubnabnl
HuggingFace's `max_length` configuration corresponds to the total length of the prompt plus the generated output, while `max_new_tokens` corresponds to the length of the generated output only. Using `args.max_length_generation` to set `max_new_tokens` fixed the runtime errors for me. Using `args.max_length_generation` to set `max_length` led to runtime errors because the total length of prompt + generation would exceed the intended value (a prompt can already fill the whole `max_length` budget on its own, which is exactly what the warning above reports).
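For reference, a rough sketch of the kind of change being described. The helper `build_gen_kwargs` and the dict `gen_kwargs` are hypothetical and only stand in for the generation call in `bigcode_eval/utils.py`; only `args.max_length_generation` comes from the harness:

```python
# Hypothetical helper, not the actual harness code.
def build_gen_kwargs(args):
    # Previous behavior: treat args.max_length_generation as the *total* budget.
    # A prompt that is already ~max_length tokens long leaves no room to
    # generate and triggers the warning shown at the top of this thread.
    # gen_kwargs = {"max_length": args.max_length_generation}

    # Fix described above: treat args.max_length_generation as the budget for
    # the *generated* tokens only, so the prompt length no longer eats into it.
    gen_kwargs = {"max_new_tokens": args.max_length_generation}
    return gen_kwargs
```

The returned dict would then be splatted into the generation call, e.g. `model.generate(**inputs, **build_gen_kwargs(args))`.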