Open KevinZhang19870314 opened 3 months ago

I want to calculate the LLM's token usage when doing an assistant run. How can I do that? Do we have a callback method like LangChain's?
Hello @KevinZhang19870314! Sharing a short example below. Let me know if you have any questions.
```python
from phi.assistant import Assistant
from phi.llm.openai import OpenAIChat

assistant = Assistant(
    llm=OpenAIChat(model="gpt-4-turbo", max_tokens=500, temperature=0.3),
    show_tool_calls=True,
)
assistant.print_response("Share a quick healthy breakfast recipe.", markdown=True)

# Token usage and response-time metrics accumulated by the LLM
print("\n-*- Metrics:")
print(assistant.llm.metrics)
```
@ysolanky Thank you for your response. But it just returns the tokens, not the costs. I think it would be better to also calculate the cost per model.
```json
{
  "response_times": [
    5.864387199981138,
    4.6080261000897735,
    3.4806884001009166,
    5.200077999848872,
    4.207325600087643,
    3.517377099953592
  ],
  "prompt_tokens": 11909,
  "completion_tokens": 1279,
  "total_tokens": 13188,
  "tool_call_times": {
    "save_to_file_and_run": [
      0.02607739996165037,
      0.010951299918815494,
      0.0673855000641197,
      0.01015629991889,
      0.017501800088211894,
      0.0120168998837471,
      0.011417800094932318,
      0.010380800114944577,
      0.012422600062564015,
      0.01328280009329319
    ]
  }
}
```
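As an illustration, a cost figure could be derived from those token counts with a small helper like the sketch below; the per-1K-token prices here are placeholders, not actual rates:

```python
# Placeholder per-1K-token prices in USD; substitute the provider's real rates.
MODEL_PRICES = {
    "gpt-4-turbo": {"prompt": 0.01, "completion": 0.03},
}

def estimate_cost(metrics: dict, model: str) -> float:
    """Estimate the USD cost of a run from the token counts in llm.metrics."""
    prices = MODEL_PRICES[model]
    prompt_cost = metrics["prompt_tokens"] / 1000 * prices["prompt"]
    completion_cost = metrics["completion_tokens"] / 1000 * prices["completion"]
    return prompt_cost + completion_cost

# Using the metrics above (11909 prompt + 1279 completion tokens)
print(f"Estimated cost: ${estimate_cost({'prompt_tokens': 11909, 'completion_tokens': 1279}, 'gpt-4-turbo'):.4f}")
```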
PS: For LangChain token_usage_tracking, it also returns the costs, like below:
```
Tokens Used: 37
  Prompt Tokens: 4
  Completion Tokens: 33
Successful Requests: 1
Total Cost (USD): $7.2e-05
```
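For reference, this is roughly the LangChain usage that produces that output (the exact import paths vary across LangChain versions):

```python
from langchain_openai import ChatOpenAI
from langchain_community.callbacks import get_openai_callback

llm = ChatOpenAI(model="gpt-4-turbo")

# The callback counts tokens for every OpenAI call inside the block
# and looks up the per-model price to report a dollar cost.
with get_openai_callback() as cb:
    llm.invoke("Share a quick healthy breakfast recipe.")

print(cb)  # Tokens Used / Prompt Tokens / Completion Tokens / Total Cost (USD)
```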
@ysolanky I have one more question: will assistant.llm.metrics sum all the previous run metrics for each assistant run?
Hey @KevinZhang19870314! Sorry for the late reply.

> PS: For LangChain token_usage_tracking, it also returns the costs, like below:

This is a great feature and something that is in our plans.
> will assistant.llm.metrics sum all the previous run metrics for each assistant run?

Yes, the token metrics get summed up across all the runs, but the response times are appended to a list.
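For example, running the same Assistant twice and printing the metrics after each call should show the token counts growing while response_times gains one entry per run (a quick sketch reusing the setup from above):

```python
from phi.assistant import Assistant
from phi.llm.openai import OpenAIChat

assistant = Assistant(llm=OpenAIChat(model="gpt-4-turbo"))

# After the first run, metrics reflect a single request
assistant.print_response("Give me one sentence about tea.")
print(assistant.llm.metrics)

# After a second run on the same assistant, the token counts include both
# requests and response_times has one entry per run
assistant.print_response("Now one sentence about coffee.")
print(assistant.llm.metrics)
```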
I would suggest that you try our new Agent class in place of the Assistant class. We have improved metric logging, and we now return an object for each Agent run, making it much easier to keep track of metrics for individual runs in a session.
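A rough sketch of what that could look like; the import paths and the metrics attribute on the run response are assumptions based on the newer Agent API and may differ in your version:

```python
from phi.agent import Agent, RunResponse
from phi.model.openai import OpenAIChat

agent = Agent(model=OpenAIChat(id="gpt-4-turbo"), markdown=True)

# Each call to run() returns its own response object,
# so metrics can be inspected per run instead of only as a running total.
run: RunResponse = agent.run("Share a quick healthy breakfast recipe.")
print(run.metrics)
```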