Open MarkusGutjahr opened 1 month ago
The file graphrag/llm/base/rate_limiting.py is where the LLM is invoked and the input and output token counts are available. I summed them up before the _handle_invoke_result call and logged them. Once you have the input and output token counts, you can compute the cost based on your particular model's pricing.
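The cost arithmetic itself is simple once you have the counts. A minimal sketch, assuming per-million-token pricing; the model names and prices below are illustrative placeholders, not part of graphrag — check your provider's current price sheet:

```python
# Illustrative per-1M-token prices in USD (hypothetical values; verify
# against your provider's current pricing before relying on them).
PRICING = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request for the given model."""
    price = PRICING[model]
    return (input_tokens * price["input"]
            + output_tokens * price["output"]) / 1_000_000

# Example: 10k input tokens and 2k output tokens on the cheaper model.
cost = estimate_cost("gpt-4o-mini", 10_000, 2_000)
print(f"${cost:.4f}")  # prints "$0.0027"
```

Summing these per-request estimates across the whole indexing run gives the total cost.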
the class RateLimitingLLM isn't getting called in the indexing process
@MarkusGutjahr I have implemented a simple openai request tracker: https://github.com/sebastianschramm/openai-cost-tracker
Update: just added a CLI wrapper, so now you can install my openai-cost-tracker and then just run:
track-costs graphrag.index --root foo
(for indexing)
or
track-costs graphrag.query --root foo --method local "My query"
(for querying)
Just call cost_tracker.init_tracker()
before the index script runs (take a look at the README for how to do that for the indexing phase: https://github.com/sebastianschramm/openai-cost-tracker/blob/main/README.md#in-code-usage). Once enabled, it will log all OpenAI requests to a file, and the repo offers a "display-costs" command to retrieve the costs per model from all requests recorded in one log file.
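The per-model aggregation step can be sketched as follows. Note the record schema here (JSON lines with "model", "prompt_tokens", and "completion_tokens" fields) is an assumption for illustration, not necessarily openai-cost-tracker's actual log format:

```python
import json
from collections import defaultdict

def totals_per_model(log_lines):
    """Sum input/output tokens per model from JSON-lines request records.

    Each record is assumed (hypothetically) to look like:
    {"model": "...", "prompt_tokens": N, "completion_tokens": M}
    """
    totals = defaultdict(lambda: {"input": 0, "output": 0})
    for line in log_lines:
        rec = json.loads(line)
        totals[rec["model"]]["input"] += rec["prompt_tokens"]
        totals[rec["model"]]["output"] += rec["completion_tokens"]
    return dict(totals)

# Two recorded requests for the same model:
log = [
    '{"model": "gpt-4o-mini", "prompt_tokens": 120, "completion_tokens": 30}',
    '{"model": "gpt-4o-mini", "prompt_tokens": 80, "completion_tokens": 20}',
]
print(totals_per_model(log))
# prints {'gpt-4o-mini': {'input': 200, 'output': 50}}
```

From these per-model totals, multiplying by each model's per-token prices gives the cost breakdown.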
Do you need to file an issue?
Is your feature request related to a problem? Please describe.
I'm able to track the token consumption for querying, but I don't know how to track the cost of the indexing process.
Describe the solution you'd like
Any way to track the indexing cost; workarounds would also be fine.
Additional context
No response