microsoft / azure-openai-service-proxy

The Azure AI proxy service facilitates easy access to Azure AI resources for workshops and hackathons. It offers a Playground-like interface and supports Azure AI SDKs. Access is granted through a time-limited API key and endpoint.
https://microsoft.github.io/azure-openai-service-proxy/
MIT License

Add a "usage" field in aoai.metric to record token usages #211

Closed: oreh closed this 7 months ago

oreh commented 7 months ago

Currently, aoai.metric records only the time and the deployment name. For cost monitoring, it would be useful if the server also recorded the token usage for each request.

This patch updates the related table schema and procedures to add a "usage" field that records the token usage info. The logging call is also moved into request_manager.py, so that it executes after the usage is available.

Screenshot 2024-03-19 at 3 53 40 PM
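
As a rough illustration of the idea in this description, the sketch below pulls the `usage` object out of a completion response body and attaches it to a metric record only after the response is available. The function names `extract_usage` and `build_metric_row` and the record layout are hypothetical, not the code in this PR; only the `usage` object with `prompt_tokens`, `completion_tokens` and `total_tokens` follows the Azure OpenAI response format.

```python
import json
from typing import Optional


def extract_usage(response_body: dict) -> Optional[str]:
    """Return the response's `usage` object as a JSON string, or None if absent.

    Some responses may carry no usage object, so the caller has to
    tolerate None.
    """
    usage = response_body.get("usage")
    if not isinstance(usage, dict):
        return None
    # sort_keys keeps the serialized form stable across requests
    return json.dumps(usage, sort_keys=True)


def build_metric_row(deployment_name: str, response_body: dict) -> dict:
    """Hypothetical metric record: deployment name plus serialized usage."""
    return {
        "deployment_name": deployment_name,
        "usage": extract_usage(response_body),
    }
```

Building the record after the upstream call returns is what makes the usage available, which matches the reason given above for moving the logging call.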

gloveboxes commented 7 months ago

Hey @oreh, thanks for the PR. I did a quick review tonight and it looks about right; I'll do a more thorough review tomorrow. Thanks, Dave

gloveboxes commented 7 months ago

Hey @oreh, is there any reason the usage column in Postgres is not of type JSON or JSONB? I think JSONB may be the better option, since the column then becomes queryable. I'll play around with ideas, but let me know if there was a reason.

Thanks, Dave
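
To make the "queryable" point concrete, here is a minimal sketch of the kind of query a JSONB column allows. The table name aoai.metric and the "usage" field come from this thread; the `deployment_name` column and the helper `tokens_per_deployment` are assumptions for illustration. The `->>` operator and the `::bigint` cast are standard PostgreSQL.

```python
# Assumed schema: aoai.metric(deployment_name text, usage jsonb, ...).
# Only the table name and the "usage" field are taken from the thread.
TOKENS_PER_DEPLOYMENT_SQL = """
SELECT deployment_name,
       SUM((usage->>'total_tokens')::bigint) AS total_tokens
FROM aoai.metric
WHERE usage IS NOT NULL
GROUP BY deployment_name
ORDER BY total_tokens DESC
"""


def tokens_per_deployment(conn):
    """Run the query on any DB-API 2.0 connection and return all rows."""
    cur = conn.cursor()
    try:
        cur.execute(TOKENS_PER_DEPLOYMENT_SQL)
        return cur.fetchall()
    finally:
        cur.close()
```

With a plain text column the same report would need the JSON to be parsed outside the database, or cast on every query.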

oreh commented 7 months ago

JSONB is a better idea. Let me find time today to update it.

gloveboxes commented 7 months ago

That's OK, I'll update it. I also want to add a token counter to the daily requests table, as that is the most efficient option for Power BI reporting.

Cheers Dave

oreh commented 7 months ago
Screenshot 2024-03-20 at 11 09 04 AM

Just noticed your comments after making the jsonb change :).

gloveboxes commented 7 months ago
Cool. I've changed the attendee_metric procedure and the attendee_request table, adding a field called total_tokens. This is a daily count of token usage; I didn't think it necessary to break it down further. It's more efficient for Power BI reporting, and in theory we could limit daily usage by tokens.

Doing one more test, then I'll merge.

Cheers Dave
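
The daily counter and the "limit daily usage by tokens" idea can be sketched in memory as follows. This is only an illustration of the logic; in the proxy the count lives in the attendee_request table, and the class `DailyTokenCounter` and its methods are hypothetical names.

```python
from datetime import date
from typing import Dict, Optional, Tuple


class DailyTokenCounter:
    """Keeps one running total_tokens value per (api_key, day)."""

    def __init__(self, daily_limit: Optional[int] = None) -> None:
        self.daily_limit = daily_limit
        self._totals: Dict[Tuple[str, date], int] = {}

    def record(self, api_key: str, day: date, total_tokens: int) -> int:
        """Add one request's tokens to the day's total and return the new total."""
        key = (api_key, day)
        self._totals[key] = self._totals.get(key, 0) + total_tokens
        return self._totals[key]

    def is_over_limit(self, api_key: str, day: date) -> bool:
        """True once the day's total has reached the limit (never, if no limit is set)."""
        if self.daily_limit is None:
            return False
        return self._totals.get((api_key, day), 0) >= self.daily_limit
```

Keeping a single total per key and day mirrors the choice above not to break the count down further.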

gloveboxes commented 7 months ago

@oreh I ironed out a couple of issues and have merged. Thanks for the PR, much appreciated.

How do you plan to use the proxy?

Regards, Dave

oreh commented 7 months ago

First of all, thanks a lot for building this project and making it open source. We are using this project to serve our internal hackathon, running today and tomorrow :D. The patch was to 1. protect the raw access key and 2. run post-event usage analysis.

In our case, we deployed only the 'proxy' application and exposed it as a private service. The administration work was done by a direct DB-management Python module using SQLAlchemy. This made the setup quite straightforward.

I like the token-count feature you added, as it opens up the possibility of budget control. The cost of different types of requests varies a lot, though, so an actual implementation is still complicated. But we are closer to that now.
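
One way to see why a raw token count is not yet a budget: prompt and completion tokens are usually priced differently, and the prices depend on the model. The sketch below turns a recorded usage object into an estimated cost. The function `estimate_cost` is hypothetical and the prices are caller-supplied placeholders, not real rates.

```python
from decimal import Decimal


def estimate_cost(
    usage: dict,
    price_per_1k_prompt: Decimal,
    price_per_1k_completion: Decimal,
) -> Decimal:
    """Estimate the cost of one request from its usage object.

    Prices are per 1,000 tokens and must be supplied by the caller.
    """
    prompt = Decimal(usage.get("prompt_tokens", 0))
    completion = Decimal(usage.get("completion_tokens", 0))
    total = prompt * price_per_1k_prompt + completion * price_per_1k_completion
    return total / Decimal(1000)


# Placeholder prices, for illustration only
example = estimate_cost(
    {"prompt_tokens": 1000, "completion_tokens": 500},
    Decimal("0.50"),
    Decimal("1.50"),
)
```

A budget check would then compare a running sum of such estimates against a cap, with one price pair per deployment.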