Azure-Samples / apim-genai-gateway-toolkit

A repo to accelerate development and testing of GenAI Gateways built with Azure API Management. Includes various capabilities as examples/starters.
MIT License

Add initial request prioritization capability #64

Closed. stuartleeks closed this 2 months ago

stuartleeks commented 2 months ago

Purpose

Add the initial request prioritization capability. More work is planned, but the current state is a good milestone to merge to main.

Does this introduce a breaking change?

[ ] Yes
[X] No

Pull Request Type

What kind of change does this Pull Request introduce?

[ ] Bugfix
[X] Feature
[ ] Code style update (formatting, local variables)
[ ] Refactoring (no functional changes, no API changes)
[ ] Documentation content changes
[ ] Other... Please describe:

How to Test

For example, run the end-to-end prioritization script with LOAD_PATTERN and ENDPOINT_PATH set:

LOAD_PATTERN=cycle ENDPOINT_PATH=prioritization-token-counting ./scripts/run-end-to-end-prioritization.sh

What to Check

Verify that the following are valid

Other Information