microsoft / graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system
https://microsoft.github.io/graphrag/
MIT License
18.76k stars · 1.83k forks

[Feature Request]: Get Final Prompts or Token Count #581

Open promentol opened 3 months ago

promentol commented 3 months ago

Is your feature request related to a problem? Please describe.

In order to bill end clients, we need to calculate the costs of our OpenAI requests. Since a single indexing or querying run issues many chat completion requests to OpenAI, there is currently no way to estimate exactly how much one run consumed.

Describe the solution you'd like

In general, for cost control and estimation purposes, it would be good to have either a final token count for an indexing run, or somewhere to store the final prompts (not the prompt templates) that were actually sent, both during indexing and querying (local or global).

Additional context

No response
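As a stopgap until something like this is built in, the kind of accounting being requested can be sketched in a few lines: every OpenAI chat completion response already carries a `usage` object with `prompt_tokens` and `completion_tokens`, so a caller that can see each response can tally them per model. The `UsageTracker` class and the price table below are hypothetical illustrations, not part of GraphRAG or the OpenAI SDK.

```python
# Hypothetical sketch: accumulate token usage across many chat completion
# calls so a whole indexing/query run can be billed per model.
from collections import defaultdict


class UsageTracker:
    """Tallies prompt/completion tokens per model.

    Feed it the `usage` block that each OpenAI chat completion
    response already carries (prompt_tokens / completion_tokens).
    """

    def __init__(self):
        self.totals = defaultdict(
            lambda: {"prompt_tokens": 0, "completion_tokens": 0}
        )

    def record(self, model: str, usage: dict) -> None:
        # In a real run, `usage` would be response.usage from the SDK.
        self.totals[model]["prompt_tokens"] += usage.get("prompt_tokens", 0)
        self.totals[model]["completion_tokens"] += usage.get("completion_tokens", 0)

    def cost(self, prices: dict) -> float:
        # prices: model -> (USD per 1K prompt tokens, USD per 1K completion tokens).
        # The numbers you plug in here must come from your own pricing sheet.
        total = 0.0
        for model, t in self.totals.items():
            per_1k_in, per_1k_out = prices[model]
            total += t["prompt_tokens"] / 1000 * per_1k_in
            total += t["completion_tokens"] / 1000 * per_1k_out
        return total


tracker = UsageTracker()
# Simulated usage dicts standing in for response.usage:
tracker.record("gpt-4o", {"prompt_tokens": 1200, "completion_tokens": 300})
tracker.record("gpt-4o", {"prompt_tokens": 800, "completion_tokens": 150})
print(tracker.totals["gpt-4o"])
```

The hard part, of course, is getting access to each response object inside GraphRAG's pipeline, which is exactly why an injectable logger (or the final prompts on disk) is needed.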

github-actions[bot] commented 3 months ago

This issue has been marked stale due to inactivity after repo maintainer or community member responses that request more information or suggest a solution. It will be closed after five additional days.

natoverse commented 2 months ago

We are working on an injectable logger to help assess costs. In the meantime, here is a new article discussing GraphRAG costs: https://techcommunity.microsoft.com/t5/ai-azure-ai-services-blog/graphrag-costs-explained-what-you-need-to-know/ba-p/4207978

huqianghui commented 1 month ago

I have integrated prompt flow with GraphRAG, so I can get the token and call counts.

ZohebAbai commented 1 month ago

> We are working on an injectable logger to help assess costs. In the meantime, here is a new article discussing GraphRAG costs: https://techcommunity.microsoft.com/t5/ai-azure-ai-services-blog/graphrag-costs-explained-what-you-need-to-know/ba-p/4207978

Hi @natoverse, any update on this? I believe it would be a really good feature to have (token counts or a cost estimate for each call).

sebastianschramm commented 2 weeks ago

@promentol @ZohebAbai I have implemented a simple openai request tracker: https://github.com/sebastianschramm/openai-cost-tracker

Update: I just added a CLI wrapper, so now you can install my openai-cost-tracker and then run:

track-costs graphrag.index --root foo (for indexing)

or

track-costs graphrag.query --root foo --method local "My query" (for querying)

For in-code usage, just call cost_tracker.init_tracker() before the index script runs (take a look at the README for how to do that for the indexing phase: https://github.com/sebastianschramm/openai-cost-tracker/blob/main/README.md#in-code-usage). Once enabled, it logs all OpenAI requests to a file, and the repo offers a "display-costs" command to retrieve the per-model costs from all requests recorded in one log file.

You can use the same init_tracker call for recording requests during querying.
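For anyone who wants to understand the log-then-aggregate pattern such a tracker follows before adopting a third-party tool: each request's usage is appended as one JSON line, and a separate pass sums tokens per model. The field names and file layout below are illustrative only, not the actual openai-cost-tracker log format.

```python
# Hypothetical sketch of the log-then-aggregate pattern: append one JSON
# record per OpenAI request, then sum tokens per model in a second pass.
import json
from collections import defaultdict
from pathlib import Path

LOG = Path("openai_requests.jsonl")
LOG.unlink(missing_ok=True)  # start from a clean log for this demo


def log_request(model: str, prompt_tokens: int, completion_tokens: int) -> None:
    """Append one request record to the JSONL log file."""
    record = {
        "model": model,
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
    }
    with LOG.open("a") as f:
        f.write(json.dumps(record) + "\n")


def display_costs(path: Path) -> dict:
    """Aggregate per-model token totals from a JSONL request log."""
    totals = defaultdict(lambda: {"prompt_tokens": 0, "completion_tokens": 0})
    for line in path.read_text().splitlines():
        record = json.loads(line)
        totals[record["model"]]["prompt_tokens"] += record["prompt_tokens"]
        totals[record["model"]]["completion_tokens"] += record["completion_tokens"]
    return dict(totals)


# Simulated requests standing in for a real indexing run:
log_request("gpt-4o", 1000, 250)
log_request("gpt-4o", 500, 100)
print(display_costs(LOG))
```

Writing one line per request (rather than keeping a running total in memory) means a crashed indexing run still leaves a complete record of everything that was billed up to the failure.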