Closed: Melodeiro closed this issue 1 month ago.
We have multiple different providers, and it's not very easy to implement pricing for all of them. Perplexica is intended to be used with local providers like Ollama, vLLM, and llama.cpp, which only use your own compute power.
I misled you with the word "cost"; I just meant a token count, of course.
@ItzCrazyKns, could you please reopen this if my clarification changed your mind? I just meant passing through the token usage that you already get from LangChain: https://js.langchain.com/v0.1/docs/modules/model_io/chat/token_usage_tracking/ (see the sketch below).
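For reference, a minimal sketch of what I mean, based on the linked LangChain JS docs. The model and provider here are just examples; providers that don't report usage may leave these fields unset:

```ts
import { ChatOpenAI } from "@langchain/openai";

// Example model; any LangChain chat model is invoked the same way.
const chatModel = new ChatOpenAI({ model: "gpt-4o-mini" });

const res = await chatModel.invoke("Tell me a joke.");

// For providers that report usage, response_metadata carries the counts:
// { tokenUsage: { promptTokens, completionTokens, totalTokens }, ... }
console.log(res.response_metadata.tokenUsage);
```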
Only some of the model providers return token counts; others don't, so the behavior would be inconsistent. Additionally, it is meant to be used with local models via Ollama, vLLM, etc., so you don't need to worry about token usage there.
That's sad, because this project is useful with online models as well.
I would be very grateful if you added token usage information to API responses, to give a hint of the cost of each request; a rough sketch of what this could look like is below.
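To make it concrete, here is a hypothetical response shape (all field names are my invention, not Perplexica's actual API). Making `usage` optional would address the concern above about providers that don't report counts:

```ts
// Hypothetical API response shape; names are illustrative only.
interface SearchResponse {
  message: string;
  sources: { url: string; title: string }[];
  // Present only when the underlying provider reports token counts,
  // so local backends that omit them keep working unchanged.
  usage?: {
    promptTokens: number;
    completionTokens: number;
    totalTokens: number;
  };
}
```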