Open KevinZhang19870314 opened 3 months ago

I want to calculate the LLM's token usage when doing an assistant run. How can I do that? Do we have a callback method like LangChain's?
Hello @KevinZhang19870314! Sharing a short example below. Let me know if you have any questions.
```python
from phi.assistant import Assistant
from phi.llm.openai import OpenAIChat

assistant = Assistant(
    llm=OpenAIChat(model="gpt-4-turbo", max_tokens=500, temperature=0.3),
    show_tool_calls=True,
)
assistant.print_response("Share a quick healthy breakfast recipe.", markdown=True)

# Token usage and response-time metrics accumulated by the LLM
print("\n-*- Metrics:")
print(assistant.llm.metrics)
```
@ysolanky Thank you for your response. But it just returns the tokens, not the costs. I think it would be better to also calculate the cost per model.
```json
{
  "response_times": [
    5.864387199981138,
    4.6080261000897735,
    3.4806884001009166,
    5.200077999848872,
    4.207325600087643,
    3.517377099953592
  ],
  "prompt_tokens": 11909,
  "completion_tokens": 1279,
  "total_tokens": 13188,
  "tool_call_times": {
    "save_to_file_and_run": [
      0.02607739996165037,
      0.010951299918815494,
      0.0673855000641197,
      0.01015629991889,
      0.017501800088211894,
      0.0120168998837471,
      0.011417800094932318,
      0.010380800114944577,
      0.012422600062564015,
      0.01328280009329319
    ]
  }
}
```
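As an illustration, a cost figure could be derived from those token counts with a small helper like the sketch below; the per-1K-token prices here are placeholders, not actual rates:

```python
# Placeholder per-1K-token prices in USD; substitute the provider's real rates.
MODEL_PRICES = {
    "gpt-4-turbo": {"prompt": 0.01, "completion": 0.03},
}

def estimate_cost(metrics: dict, model: str) -> float:
    """Estimate the USD cost of a run from the token counts in llm.metrics."""
    prices = MODEL_PRICES[model]
    prompt_cost = metrics["prompt_tokens"] / 1000 * prices["prompt"]
    completion_cost = metrics["completion_tokens"] / 1000 * prices["completion"]
    return prompt_cost + completion_cost

# Using the metrics above (11909 prompt + 1279 completion tokens)
print(f"Estimated cost: ${estimate_cost({'prompt_tokens': 11909, 'completion_tokens': 1279}, 'gpt-4-turbo'):.4f}")
```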
PS: For LangChain token_usage_tracking, it also returns the costs, like below:
```
Tokens Used: 37
  Prompt Tokens: 4
  Completion Tokens: 33
Successful Requests: 1
Total Cost (USD): $7.2e-05
```
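For reference, this is roughly the LangChain usage that produces that output (the exact import paths vary across LangChain versions):

```python
from langchain_openai import ChatOpenAI
from langchain_community.callbacks import get_openai_callback

llm = ChatOpenAI(model="gpt-4-turbo")

# The callback counts tokens for every OpenAI call inside the block
# and looks up the per-model price to report a dollar cost.
with get_openai_callback() as cb:
    llm.invoke("Share a quick healthy breakfast recipe.")

print(cb)  # Tokens Used / Prompt Tokens / Completion Tokens / Total Cost (USD)
```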
@ysolanky I have one more question: will assistant.llm.metrics sum all the previous run metrics for each assistant run?
Hey @KevinZhang19870314! Sorry for the late reply.

> PS: For LangChain token_usage_tracking, it also returns the costs, like below:

This is a great feature and something that is in our plans.
> will assistant.llm.metrics sum all the previous run metrics for each assistant run?

Yes, the token metrics get summed up across all the runs, but the response times are appended to a list.
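For example, running the same Assistant twice and printing the metrics after each call should show the token counts growing while response_times gains one entry per run (a quick sketch reusing the setup from above):

```python
from phi.assistant import Assistant
from phi.llm.openai import OpenAIChat

assistant = Assistant(llm=OpenAIChat(model="gpt-4-turbo"))

# After the first run, metrics reflect a single request
assistant.print_response("Give me one sentence about tea.")
print(assistant.llm.metrics)

# After a second run on the same assistant, the token counts include both
# requests and response_times has one entry per run
assistant.print_response("Now one sentence about coffee.")
print(assistant.llm.metrics)
```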
I would suggest that you try our new Agent class in place of the Assistant class. We have improved metric logging, and we now return an object for each Agent run, making it much easier to keep track of metrics for individual runs in a session.
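A rough sketch of what that could look like; the import paths and the metrics attribute on the run response are assumptions based on the newer Agent API and may differ in your version:

```python
from phi.agent import Agent, RunResponse
from phi.model.openai import OpenAIChat

agent = Agent(model=OpenAIChat(id="gpt-4-turbo"), markdown=True)

# Each call to run() returns its own response object,
# so metrics can be inspected per run instead of only as a running total.
run: RunResponse = agent.run("Share a quick healthy breakfast recipe.")
print(run.metrics)
```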