Shura1oplot opened this issue 1 week ago
Hello, @Shura1oplot! I'm here to help you with any bugs, questions, or contributions you have.
To fix the `TypeError: 'NoneType' object is not iterable` raised in `token_counting.get_tokens_from_response` when `CompletionResponse.raw` is `None`, you need to add a check that handles the case where `CompletionResponse.raw` is `None`. Here is an example of how you can modify the code to include this check:
```python
def get_tokens_from_response(
    response: Union["CompletionResponse", "ChatResponse"]
) -> Tuple[int, int]:
    """Get the token counts from a raw response."""
    raw_response = response.raw
    if raw_response is None:
        return 0, 0  # Handle the None case appropriately

    if not isinstance(raw_response, dict):
        raw_response = dict(raw_response)

    usage = raw_response.get("usage", {})
    if usage is None:
        usage = response.additional_kwargs

    if not usage:
        return 0, 0

    if not isinstance(usage, dict):
        usage = usage.model_dump()

    possible_input_keys = ("prompt_tokens", "input_tokens")
    possible_output_keys = ("completion_tokens", "output_tokens")

    prompt_tokens = 0
    for input_key in possible_input_keys:
        if input_key in usage:
            prompt_tokens = usage[input_key]
            break

    completion_tokens = 0
    for output_key in possible_output_keys:
        if output_key in usage:
            completion_tokens = usage[output_key]
            break

    return prompt_tokens, completion_tokens
```
This modification ensures that if `response.raw` is `None`, the function returns `(0, 0)` instead of raising the `TypeError` [1].
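As a quick sanity check, here is the guarded logic exercised against a minimal stand-in for `CompletionResponse` (`FakeResponse` below is a hypothetical test double, not the real llama-index class): it returns `(0, 0)` when `raw` is `None` and still reads the usage counts when they are present.

```python
from typing import Any, Optional, Tuple


class FakeResponse:
    """Minimal stand-in for CompletionResponse: only the attributes
    that get_tokens_from_response touches."""

    def __init__(self, raw: Optional[dict] = None,
                 additional_kwargs: Optional[dict] = None) -> None:
        self.raw = raw
        self.additional_kwargs = additional_kwargs or {}


def get_tokens_from_response(response: Any) -> Tuple[int, int]:
    raw_response = response.raw
    if raw_response is None:
        return 0, 0  # MockLLM leaves raw unset; nothing to count
    if not isinstance(raw_response, dict):
        raw_response = dict(raw_response)
    usage = raw_response.get("usage", {})
    if usage is None:
        usage = response.additional_kwargs
    if not usage:
        return 0, 0
    if not isinstance(usage, dict):
        usage = usage.model_dump()
    prompt_tokens = next(
        (usage[k] for k in ("prompt_tokens", "input_tokens") if k in usage), 0)
    completion_tokens = next(
        (usage[k] for k in ("completion_tokens", "output_tokens") if k in usage), 0)
    return prompt_tokens, completion_tokens


print(get_tokens_from_response(FakeResponse(raw=None)))  # (0, 0)
print(get_tokens_from_response(
    FakeResponse(raw={"usage": {"prompt_tokens": 3, "completion_tokens": 7}})))  # (3, 7)
```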
To continue talking to Dosu, mention @dosu.
@Shura1oplot this is actually already fixed; we just need to make a release: https://github.com/run-llama/llama_index/blob/1d4d2c653da27b7ac7d2cf53701a73ea7e5d51bb/llama-index-core/llama_index/core/callbacks/token_counting.py#L45
We recently started running mypy on core again, which caught a lot of small bugs.
Bug Description
`TokenCountingHandler` dies trying to calculate the token count (`get_tokens_from_response`) for the response produced by `MockLLM`. `MockLLM.complete` produces a `CompletionResponse` with only the `text` parameter set. `get_tokens_from_response` tries to convert `CompletionResponse.raw`, which is `None`, to a dict and dies.
Possible fix: guard `raw_response = dict(raw_response)` with a check that `raw_response` is not `None`.
Version
0.11.3
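The underlying failure is easy to reproduce in isolation: passing `None` to the built-in `dict()` constructor raises the same `TypeError`, which is what happens when `get_tokens_from_response` reaches `dict(raw_response)` with `raw` unset.

```python
try:
    # Equivalent to dict(raw_response) when CompletionResponse.raw is None.
    dict(None)
except TypeError as exc:
    print(exc)  # 'NoneType' object is not iterable
```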
Steps to Reproduce
```python
llm = MockLLM(max_tokens=512)
embed_model = MockEmbedding(embed_dim=3072)

query_engine = index.as_query_engine(
    llm=llm,
    embed_model=embed_model,
    text_qa_template=text_qa_template,  # ChatPromptTemplate
    refine_template=refine_template,  # ChatPromptTemplate
)

token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode)
query_engine.callback_manager.add_handler(token_counter)

query_engine.query(prompt)

print(token_counter.completion_llm_token_count)
```
Relevant Logs/Tracebacks