Open ibehnam opened 1 month ago
It seems the total costs are wrong by a factor of 6:
I reset the costs and created a new chat where I asked the model to generate a short story. The costs are almost 1/6 the total costs shown in the app settings:
Do you know if the model you're using uses OpenAI's tokenizer or something else? The token count just uses OpenAI's tokenizer.
Ahh, that explains it. I'm using Cohere's Command R+, which has a different tokenizer. It's probably easier to get the token usage info directly from the response:
{"id":"gen-...","model":"openai/gpt-3.5-turbo","object":"chat.completion","created":1717306340,"choices":[{"index":0,"message":{"role":"assistant","content":"The meaning of life is a deeply personal and philosophical question that has been asked by humans for centuries. There are many different beliefs and theories about the meaning of life, and it ultimately depends on an individual's own values, beliefs, and experiences. Some people find meaning in religion or spirituality, others in relationships and connections with others, and some in personal growth and self-discovery. Ultimately, the meaning of life is what each individual makes of it, and it may vary greatly from person to person."},"finish_reason":"stop"}],"system_fingerprint":null,
**"usage":{"prompt_tokens":14,"completion_tokens":100,"total_tokens":114}}**
I don't know TS, otherwise I'd open a PR myself :)
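For reference, a minimal TypeScript sketch of reading the token counts from the `usage` field of a response like the one above, rather than re-tokenizing locally. The `Pricing` shape and the function names `extractUsage` and `costFromUsage` are hypothetical and not taken from the app's code; only the three `usage` keys come from the quoted response.

```typescript
// Shape of the `usage` object in the response quoted above.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

// Hypothetical pricing record in USD per million tokens (an assumption,
// not the app's actual settings format).
interface Pricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

// Returns the provider-reported usage, or null when the body is not valid
// JSON or carries no usable `usage` field.
function extractUsage(responseBody: string): Usage | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(responseBody);
  } catch {
    return null;
  }
  const usage = (parsed as { usage?: Partial<Usage> } | null)?.usage;
  if (
    !usage ||
    typeof usage.prompt_tokens !== "number" ||
    typeof usage.completion_tokens !== "number"
  ) {
    return null;
  }
  return {
    prompt_tokens: usage.prompt_tokens,
    completion_tokens: usage.completion_tokens,
    // Fall back to the sum if the provider omits the total.
    total_tokens:
      usage.total_tokens ?? usage.prompt_tokens + usage.completion_tokens,
  };
}

// Cost of one request, computed from reported counts and per-million prices.
function costFromUsage(usage: Usage, pricing: Pricing): number {
  return (
    (usage.prompt_tokens * pricing.inputPerMillion +
      usage.completion_tokens * pricing.outputPerMillion) /
    1_000_000
  );
}
```

Returning `null` when `usage` is missing would let a caller fall back to the existing local estimate for providers that do not report it.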
I suppose the total cost could be determined automatically behind the scenes from the response, but the per-chat estimate would never be right unless we had hardcoded logic for different model names.
This would be really annoying to implement since we'd somehow need to have different behavior per model, even though they can be custom defined. The scope of this fix just massively increased, so idk if I'll ever implement it myself.
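One way the per-model behavior could be kept manageable is a small lookup keyed by model-name prefix, with a generic fallback for custom models. The sketch below is only an illustration under that assumption: `registerEstimator`, `estimatorFor`, and `tokenCount` are hypothetical names, and the 4-characters-per-token fallback is a crude rule of thumb, not a real tokenizer.

```typescript
// A token estimator maps text to an approximate token count.
type TokenEstimator = (text: string) => number;

// Crude placeholder fallback: roughly 4 characters per token.
// This is a rule of thumb, not any provider's actual tokenizer.
const roughEstimator: TokenEstimator = (text) => Math.ceil(text.length / 4);

// Hypothetical registry of estimators keyed by model-name prefix.
const estimators: { prefix: string; estimate: TokenEstimator }[] = [];

function registerEstimator(prefix: string, estimate: TokenEstimator): void {
  estimators.push({ prefix, estimate });
}

// Picks the estimator with the longest prefix matching the model name,
// so custom or unknown models still get the generic fallback.
function estimatorFor(model: string): TokenEstimator {
  let best: { prefix: string; estimate: TokenEstimator } | null = null;
  for (const entry of estimators) {
    const matches = model.slice(0, entry.prefix.length) === entry.prefix;
    if (matches && (best === null || entry.prefix.length > best.prefix.length)) {
      best = entry;
    }
  }
  return best === null ? roughEstimator : best.estimate;
}

// Prefers a provider-reported count when one is available and only
// estimates otherwise.
function tokenCount(model: string, text: string, reported?: number): number {
  return reported ?? estimatorFor(model)(text);
}
```

With this shape, the running total can use reported counts whenever the response includes them, while the per-chat estimate stays approximate for models without a registered estimator.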
I noticed the following when adding new models to Koala: `.` is not accepted in the text field. The workaround is to type `3`, then manually go to the beginning and enter `.`, and repeat that to enter `0`s.