jackschedel / KoalaClient

The best LLM API Playground Interface (for me)
https://client.koaladev.io/
Creative Commons Zero v1.0 Universal

[BUG] Incorrect Token Cost Calculations #125

Open ibehnam opened 1 month ago

ibehnam commented 1 month ago

I noticed the following when adding new models to Koala:

  1. Entering decimal numbers is awkward. OpenRouter lists costs per 1 million tokens, while Koala asks for the cost per 1K tokens, so a cost of $3 per million tokens (e.g., for the Cohere Command R+ model) has to be entered as $0.003. However, the text field does not accept a typed "." character. The workaround is to type 3, move the cursor to the beginning, enter ".", and then repeat that to insert the 0s.
  2. Even after entering the decimal numbers this way, the total cost is not calculated correctly (I checked my usage on OpenRouter after 24 hours and saw completely different numbers).

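The unit conversion described in the first item can be sketched in TypeScript as follows. This is only an illustration: the function names are hypothetical and are not taken from KoalaClient's code.

```typescript
// Hypothetical helpers; the names are illustrative, not KoalaClient's API.
const TOKENS_PER_MILLION = 1_000_000;
const TOKENS_PER_THOUSAND = 1_000;

/** Convert a price quoted per 1M tokens into a price per 1K tokens. */
function perMillionToPerThousand(pricePerMillion: number): number {
  return (pricePerMillion * TOKENS_PER_THOUSAND) / TOKENS_PER_MILLION;
}

/** Cost in dollars for `tokens` tokens at a given price per 1K tokens. */
function costForTokens(tokens: number, pricePerThousand: number): number {
  return (tokens / TOKENS_PER_THOUSAND) * pricePerThousand;
}

// $3 per 1M tokens, as in the Command R+ example above.
const perThousand = perMillionToPerThousand(3);
console.log(perThousand); // 0.003
console.log(costForTokens(2000, perThousand)); // cost of 2,000 tokens
```

Accepting a per-million price in the settings and converting it like this would avoid typing small decimals at all.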
ibehnam commented 1 month ago

It seems the total costs are wrong by a factor of 6:

I reset the costs and created a new chat in which I asked the model to generate a short story. The costs are almost 1/6 of the total costs shown in the app settings:

[Image: Screenshot 2024-06-01 at 3 53 12 PM]

jackschedel commented 1 month ago

Do you know whether the model you're using uses OpenAI's tokenizer or something else? The token count is computed with OpenAI's tokenizer only.

ibehnam commented 1 month ago

Ahh, that explains it. I'm using Cohere's Command R+, which has a different tokenizer. It's probably easier to get the token usage info directly from the response:

{"id":"gen-...","model":"openai/gpt-3.5-turbo","object":"chat.completion","created":1717306340,"choices":[{"index":0,"message":{"role":"assistant","content":"The meaning of life is a deeply personal and philosophical question that has been asked by humans for centuries. There are many different beliefs and theories about the meaning of life, and it ultimately depends on an individual's own values, beliefs, and experiences. Some people find meaning in religion or spirituality, others in relationships and connections with others, and some in personal growth and self-discovery. Ultimately, the meaning of life is what each individual makes of it, and it may vary greatly from person to person."},"finish_reason":"stop"}],"system_fingerprint":null,

**"usage":{"prompt_tokens":14,"completion_tokens":100,"total_tokens":114}}**

I don't know TS, otherwise I'd open a PR myself :)
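For illustration, reading the usage field from a response like the one quoted above might look roughly like this in TypeScript. It is a minimal sketch, not KoalaClient's implementation: the `ModelPricing` type, the function names, and the example prices are assumptions, and only the three `usage` field names come from the response shown.

```typescript
// Shape of the "usage" object in the response quoted above.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

// Hypothetical pricing record, in dollars per 1K tokens.
interface ModelPricing {
  promptPerThousand: number;
  completionPerThousand: number;
}

/** Return the usage block from a raw JSON response body, or null if absent. */
function extractUsage(responseBody: string): Usage | null {
  const parsed: unknown = JSON.parse(responseBody);
  if (typeof parsed !== "object" || parsed === null) return null;
  const usage = (parsed as { usage?: Partial<Usage> }).usage;
  if (
    !usage ||
    typeof usage.prompt_tokens !== "number" ||
    typeof usage.completion_tokens !== "number"
  ) {
    return null;
  }
  const prompt = usage.prompt_tokens;
  const completion = usage.completion_tokens;
  // Fall back to the sum if the server omitted total_tokens.
  const total =
    typeof usage.total_tokens === "number" ? usage.total_tokens : prompt + completion;
  return { prompt_tokens: prompt, completion_tokens: completion, total_tokens: total };
}

/** Cost in dollars for one response, using the token counts the API reported. */
function costFromUsage(usage: Usage, pricing: ModelPricing): number {
  return (
    (usage.prompt_tokens / 1000) * pricing.promptPerThousand +
    (usage.completion_tokens / 1000) * pricing.completionPerThousand
  );
}

// Example with the usage block from the response above and made-up prices.
const body = '{"usage":{"prompt_tokens":14,"completion_tokens":100,"total_tokens":114}}';
const examplePricing: ModelPricing = { promptPerThousand: 0.003, completionPerThousand: 0.015 };
const reported = extractUsage(body);
if (reported !== null) {
  console.log(reported.total_tokens, costFromUsage(reported, examplePricing));
}
```

Because the counts come from the server, the result does not depend on which tokenizer the model uses.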

jackschedel commented 1 month ago

I suppose the total cost could be determined automatically behind the scenes from the response, but the per-chat estimate would never be right unless we had hardcoded logic for different model names.

This would be really annoying to implement, since we'd somehow need different behavior per model even though models can be custom defined. The scope of this fix just massively increased, so I don't know if I'll ever implement it myself.
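One possible way to narrow the scope, sketched here purely as an assumption about how it could work: store the token count reported by the API on each message when it is available, and fall back to the existing estimate only for messages that have no reported count. The `MessageRecord` type, `chatTokenCount`, and the rough character-based estimator below are hypothetical stand-ins, not KoalaClient code or a real tokenizer.

```typescript
// Hypothetical per-message record; reportedTokens is set only when the
// API response for that message included a usage block.
interface MessageRecord {
  text: string;
  reportedTokens?: number;
}

type TokenEstimator = (text: string) => number;

/**
 * Sum tokens for a chat, preferring reported counts over estimates.
 * `exact` is false if any message had to be estimated.
 */
function chatTokenCount(
  messages: MessageRecord[],
  estimate: TokenEstimator
): { tokens: number; exact: boolean } {
  let tokens = 0;
  let exact = true;
  for (const message of messages) {
    if (message.reportedTokens !== undefined) {
      tokens += message.reportedTokens;
    } else {
      tokens += estimate(message.text);
      exact = false;
    }
  }
  return { tokens, exact };
}

// Crude stand-in estimator for this example only (about 4 characters per token).
const roughEstimate: TokenEstimator = (text) => Math.ceil(text.length / 4);

const chat: MessageRecord[] = [
  { text: "Write a short story.", reportedTokens: 14 },
  { text: "Once upon a time" }, // no usage recorded, so this one is estimated
];
console.log(chatTokenCount(chat, roughEstimate)); // { tokens: 18, exact: false }
```

The `exact` flag would let the UI mark a per-chat figure as an estimate, so no per-model tokenizer logic is needed for messages that already have a reported count.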