I think there is an error in how the number of input tokens is calculated at prompt-generation time in the function `llmperf.utils.randomly_sample_sonnet_lines_prompt()`. Line 112 reads: `line_to_add = line_to_add[: int(math.ceil(remaining_prompt_tokens))]`. If I understand correctly, you want the generated prompt to have a specific length in terms of tokens, so if the new line to add to the prompt is too long, you want to cut it and get rid of the extra tokens. The code checks whether this happens and then attempts to remove the extra characters from `line_to_add`. The way this is done seems wrong to me, because the slice operates on characters and not on tokens: it keeps the first `remaining_prompt_tokens` characters of the line, which usually amounts to far fewer tokens than intended. The correct approach would be to count the number of tokens in `line_to_add` and keep only the prefix whose length equals `remaining_prompt_tokens` in terms of tokens, and not in terms of characters as it is now.
Does this make sense to you?
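To make the suggestion concrete, here is a minimal sketch of token-based truncation. It is not the llmperf code: `truncate_to_token_budget`, `toy_encode`, and `toy_decode` are hypothetical names I made up for illustration, and the whitespace "tokenizer" is only a stand-in so the example runs without downloading a model. In the real function, the `encode` and `decode` callables would presumably come from whatever tokenizer the function already uses to count tokens.

```python
from typing import Callable, List


def truncate_to_token_budget(
    line: str,
    remaining_tokens: int,
    encode: Callable[[str], List[int]],
    decode: Callable[[List[int]], str],
) -> str:
    """Return the prefix of `line` that spans at most `remaining_tokens` tokens.

    Hypothetical helper: the budget is applied to token ids, not characters.
    """
    if remaining_tokens <= 0:
        return ""
    token_ids = encode(line)
    if len(token_ids) <= remaining_tokens:
        # The whole line already fits in the remaining budget.
        return line
    # Slice the token ids, then map the kept ids back to text.
    return decode(token_ids[:remaining_tokens])


# Toy whitespace tokenizer, an assumption used only to keep the sketch
# self-contained. Each distinct word gets its own integer id.
_vocab: List[str] = []


def toy_encode(text: str) -> List[int]:
    ids = []
    for word in text.split():
        if word not in _vocab:
            _vocab.append(word)
        ids.append(_vocab.index(word))
    return ids


def toy_decode(ids: List[int]) -> str:
    return " ".join(_vocab[i] for i in ids)


line_to_add = "Shall I compare thee to a summer's day?"
remaining_prompt_tokens = 3

# Character-based slicing (the current behaviour) keeps 3 characters.
by_chars = line_to_add[:remaining_prompt_tokens]  # "Sha"

# Token-based truncation keeps 3 tokens.
by_tokens = truncate_to_token_budget(
    line_to_add, remaining_prompt_tokens, toy_encode, toy_decode
)  # "Shall I compare"
```

With a real subword tokenizer, two caveats probably apply: special tokens (such as a beginning-of-sequence marker) should be excluded when encoding the line, and re-encoding the decoded prefix may not yield exactly the same number of tokens, so the final prompt length may still need a check.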