[BUG] Incorrect Token Cost Calculations

ibehnam commented 1 month ago

I noticed the following when adding new models to Koala:

It's not easy to enter decimal numbers! OpenRouter shows costs per 1 million tokens, while Koala asks for the cost per 1K tokens. So a cost of $3 per million tokens (e.g., for the Cohere Command R+ model) should be entered as $0.003. But `.` is not accepted in the text field. The workaround is to type 3, then manually move the cursor to the beginning and enter `.`, and repeat that to enter the 0s.
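The unit conversion described above is a division by 1,000. A minimal sketch in TypeScript, where `perMillionToPerThousand` is a hypothetical helper name used for illustration and not something that exists in Koala:

```typescript
// Hypothetical helper (not part of Koala): convert a price quoted per
// 1 million tokens into the per-1K-token price that Koala asks for.
function perMillionToPerThousand(pricePerMillion: number): number {
  // 1 million tokens is 1,000 blocks of 1K tokens.
  return pricePerMillion / 1000;
}

// $3 per million tokens becomes $0.003 per 1K tokens.
console.log(perMillionToPerThousand(3)); // prints 0.003
```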

Even after entering the decimal numbers like this, the total cost is not calculated correctly (I checked my usage on OpenRouter after 24 hours and saw completely different numbers).

ibehnam commented 1 month ago

It seems the total costs are wrong by a factor of 6.

I reset the costs and created a new chat where I asked the model to generate a short story. The cost shown for that chat is almost 1/6 of the total cost shown in the app settings.

jackschedel commented 1 month ago

Do you know if the model you're using uses OpenAI's tokenizer or something else? The token count just uses OpenAI's tokenizer.

ibehnam commented 1 month ago

Ahh, that explains it. I'm using Cohere's Command R+, which has a different tokenizer. It's probably easier to get the token usage info directly from the response:

{"id":"gen-...","model":"openai/gpt-3.5-turbo","object":"chat.completion","created":1717306340,"choices":[{"index":0,"message":{"role":"assistant","content":"The meaning of life is a deeply personal and philosophical question that has been asked by humans for centuries. There are many different beliefs and theories about the meaning of life, and it ultimately depends on an individual's own values, beliefs, and experiences. Some people find meaning in religion or spirituality, others in relationships and connections with others, and some in personal growth and self-discovery. Ultimately, the meaning of life is what each individual makes of it, and it may vary greatly from person to person."},"finish_reason":"stop"}],"system_fingerprint":null,

**"usage":{"prompt_tokens":14,"completion_tokens":100,"total_tokens":114}}**
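As a rough sketch of that idea in TypeScript: read the `usage` object from the response and multiply the reported counts by the per-1K prices. The `PricePer1K` shape, the `costFromUsage` name, and the use of the same $0.003 price for prompt and completion tokens are assumptions made for illustration, not Koala's actual settings model.

```typescript
// Shape of the `usage` object shown in the response above.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

// Hypothetical pricing shape, in USD per 1K tokens (assumed for this sketch).
interface PricePer1K {
  prompt: number;
  completion: number;
}

// Compute the cost of one request from the provider-reported token counts,
// so no local tokenizer is involved.
function costFromUsage(usage: Usage, price: PricePer1K): number {
  return (
    (usage.prompt_tokens / 1000) * price.prompt +
    (usage.completion_tokens / 1000) * price.completion
  );
}

// Only the `usage` part of the response body is reproduced here.
const body =
  '{"usage":{"prompt_tokens":14,"completion_tokens":100,"total_tokens":114}}';
const usage: Usage = JSON.parse(body).usage;

// Illustrative prices: $0.003 per 1K tokens for both directions (assumed).
const cost = costFromUsage(usage, { prompt: 0.003, completion: 0.003 });
console.log(cost.toFixed(6)); // prints 0.000342
```

Because the counts come from the provider, the result does not depend on which tokenizer the model uses.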

I don't know TS, otherwise I'd open a PR myself :)

jackschedel commented 1 month ago

I suppose the total cost could be determined automatically behind the scenes from the response, but the per-chat estimate would never be right unless we had hardcoded logic for different model names.

This would be really annoying to implement since we'd somehow need different behavior per model, even though models can be custom-defined. The scope of this fix just massively increased, so idk if I'll ever implement it myself.
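One possible way to avoid per-model logic, sketched in TypeScript under stated assumptions: use the reported `usage` whenever the response includes it, and fall back to the existing local estimate otherwise. The names `resolveTokenCounts`, `TokenCounts`, and `CompletionResponse` are hypothetical and do not come from the KoalaClient codebase. Any estimate made before a request is sent would still be approximate for models that do not use OpenAI's tokenizer.

```typescript
// Token counts plus where they came from (hypothetical type for this sketch).
interface TokenCounts {
  promptTokens: number;
  completionTokens: number;
  source: "reported" | "estimated";
}

// Minimal view of a chat completion response; `usage` may be absent.
interface CompletionResponse {
  usage?: { prompt_tokens: number; completion_tokens: number };
}

// Prefer provider-reported counts; otherwise use the caller-supplied
// local estimate (e.g. one based on OpenAI's tokenizer).
function resolveTokenCounts(
  response: CompletionResponse,
  estimate: () => { promptTokens: number; completionTokens: number }
): TokenCounts {
  if (response.usage) {
    return {
      promptTokens: response.usage.prompt_tokens,
      completionTokens: response.usage.completion_tokens,
      source: "reported",
    };
  }
  return { ...estimate(), source: "estimated" };
}

// Made-up local estimate, used only when the response has no `usage`.
const localEstimate = () => ({ promptTokens: 12, completionTokens: 90 });

const reported = resolveTokenCounts(
  { usage: { prompt_tokens: 14, completion_tokens: 100 } },
  localEstimate
);
const fallback = resolveTokenCounts({}, localEstimate);
console.log(reported.source, fallback.source); // prints: reported estimated
```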

jackschedel / KoalaClient

[BUG] Incorrect Token Cost Calculations #125