ray-project / llmperf

LLMPerf is a library for validating and benchmarking LLMs
Apache License 2.0

Wrong generated prompt length #75

Open PietroFerr opened 1 month ago

PietroFerr commented 1 month ago

I think there is an error in how the number of input tokens is computed during prompt generation in the function `llmperf.utils.randomly_sample_sonnet_lines_prompt()`. Line 112: `line_to_add = line_to_add[: int(math.ceil(remaining_prompt_tokens))]`. If I understand correctly, you want the generated prompt to have a specific length in terms of tokens, so if the next line to add would make the prompt too long, you want to cut it and get rid of the extra tokens. The code checks whether this happens and then attempts to trim `line_to_add`. The way this is done seems wrong to me, because it reasons in terms of characters and not tokens, which is a mistake: the slice removes characters, but `remaining_prompt_tokens` is a token count. The correct approach would be to tokenize `line_to_add` and keep only the prefix whose length equals `remaining_prompt_tokens` in terms of tokens, not of characters as it is now.

Does this make sense to you?
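To illustrate, here is a minimal sketch of the token-based truncation described above. The helper `truncate_to_token_budget` is hypothetical (it is not part of llmperf), and a real fix would use the same tokenizer llmperf uses for counting; a whitespace tokenizer stands in here so the example is self-contained.

```python
def truncate_to_token_budget(
    line: str,
    remaining_prompt_tokens: int,
    encode=lambda s: s.split(),          # stand-in tokenizer (assumption)
    decode=lambda toks: " ".join(toks),  # stand-in detokenizer (assumption)
) -> str:
    """Keep only the prefix of `line` whose length in TOKENS
    (not characters) fits within remaining_prompt_tokens."""
    tokens = encode(line)
    if len(tokens) <= remaining_prompt_tokens:
        return line
    return decode(tokens[:remaining_prompt_tokens])


line = "Shall I compare thee to a summer's day"
# Keeps the first 4 tokens, whereas the current code would keep
# the first 4 characters ("Shal").
print(truncate_to_token_budget(line, 4))  # → Shall I compare thee
```

With a real tokenizer (e.g. the one used elsewhere in llmperf to count prompt tokens), `encode`/`decode` would be replaced by its encode/decode methods so the truncated prefix is measured in the same token units as `remaining_prompt_tokens`.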