lavague-ai / LaVague

Large Action Model framework to develop AI Web Agents
https://docs.lavague.ai/en/latest/
Apache License 2.0

Add token count to track cost of each run #403

Closed: dhuynh95 closed this issue 1 month ago

dhuynh95 commented 2 months ago

We have developed a logger to make it easier to track each run.

While we keep the full prompt, inference time, and so on, we do not have information about the number of tokens actually sent to the LLM (whether local or through an API).

I think it would be great to have this to help people understand how much each run costs.

This should not be too hard, as we are using LlamaIndex's LLM and MultiModal LLM classes, which seem to support several observability tools. Looking at those, for instance Arize Phoenix, it does seem that the token count is recorded locally.

[screenshot: Arize Phoenix trace view showing recorded token counts]

So technically there is something recording this; we just need to find out where and add it to our logs.
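
For reference, here is a minimal sketch of pointing LlamaIndex at Arize Phoenix to inspect token counts locally. This assumes the llama-index and arize-phoenix packages (plus the Phoenix callback integration) are installed; it is the generic observability setup, not LaVague-specific code:

import phoenix as px
import llama_index.core

# Start a local Phoenix server to collect traces
px.launch_app()

# Route LlamaIndex callbacks (including token usage) to Phoenix
llama_index.core.set_global_handler("arize_phoenix")

# Subsequent LLM / MultiModal LLM calls made through LlamaIndex
# should now appear in the Phoenix UI with their token counts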

lyie28 commented 2 months ago

In addition to this, perhaps we could also add it to the ActionResult object returned by agent.run()?

from dataclasses import dataclass
from typing import Any


@dataclass
class ActionResult:
    """Represents the result of executing an instruction"""

    instruction: str
    code: str
    success: bool
    output: Any
    total_tokens: int  # token counts are integral, so int rather than int/float
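
If such a field were added, reading it back might look like this (hypothetical, since total_tokens does not exist yet):

result = agent.run("Print the title of the page")
print(f"This run used {result.total_tokens} tokens")
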
jashuajoy commented 2 months ago

Hi @dhuynh95 @lyie28, here are two ways I've found to get the token count:

  1. There is a 'usage' property in the raw response of every world model request, which contains all the token information. I have added demo code to the notebook below that inspects the returned response's 'usage'.
  2. Using the 'TokenCountingHandler' (see the LlamaIndex docs) provided by LlamaIndex. It uses tiktoken, an open-source tokenizer by OpenAI. This seems to work better than the previous approach; a sketch follows the notebook link below.

Here is a Colab notebook: https://colab.research.google.com/drive/1rIsWbCWpY9LvJs-IAAWf9sd4wZEl5a4k?usp=sharing
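
As a minimal sketch of the second approach with LlamaIndex's TokenCountingHandler (the model name passed to tiktoken is an assumption; use whichever model the world model actually queries):

import tiktoken
from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler

# Count tokens with the same tokenizer as the target model
token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-4o").encode
)
Settings.callback_manager = CallbackManager([token_counter])

# ... run the world model / action engine through LlamaIndex as usual ...

print(token_counter.prompt_llm_token_count)       # tokens sent in prompts
print(token_counter.completion_llm_token_count)   # tokens in completions
print(token_counter.total_llm_token_count)        # prompt + completion
print(token_counter.total_embedding_token_count)  # embedding tokens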

jashuajoy commented 2 months ago

If the second approach works better for you, I can implement it.

dhuynh95 commented 2 months ago

That's super cool! Thanks a lot @jashuajoy! I guess the second one is more systematic. Do you think you could add it to the logs we have?

dhuynh95 commented 2 months ago

@lyie28: this is a super important feature. Let's not close the issue until we also have documentation for it (we might as well add the GPT cache experiment on the same page to see how many tokens we save with caching).

jashuajoy commented 2 months ago

Sure, @dhuynh95, I can add it to the logs.

jashuajoy commented 2 months ago

Opened PR #415. Please check it out.

dhuynh95 commented 2 months ago

Thanks a lot for the help! We will add that to the codebase ASAP and document it.

adeprez commented 2 months ago

Thanks for your contribution! The token counter is now available on the main branch.

To enable it:

from lavague.core.agents import WebAgent
from lavague.core.token_counter_init import init_token_counter

# world_model and action_engine are assumed to be initialized as usual
agent = WebAgent(world_model, action_engine, token_counter=init_token_counter())

This adds new entries to the logs:

{"embedding_tokens": 1605}
{"llm_prompt_tokens": 1729, "llm_completion_tokens": 512, "total_llm_tokens": 2241}