A repo to accelerate development and testing of GenAI Gateways built with Azure API Management. Includes various capabilities as examples/starters.
Add initial request prioritization capability #64
Closed
stuartleeks closed 2 months ago
Purpose
Add the initial request prioritization capability. More work is planned, but the current state is a good milestone to merge to main.
Does this introduce a breaking change?
Pull Request Type
What kind of change does this Pull Request introduce?
How to Test
Get the code
Clone the repo and deploy
Test the code
Run the prioritization end-to-end script
E.g.
LOAD_PATTERN=cycle ENDPOINT_PATH=prioritization-token-counting ./scripts/run-end-to-end-prioritization.sh
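For repeated test runs, the command above could be wrapped in a small helper. This is only a sketch: the function name `run_prioritization` and the `SCRIPT` override variable are hypothetical and not part of the repo. Only the script path and the `LOAD_PATTERN` and `ENDPOINT_PATH` variables come from this PR, and the defaults are simply the values from the example above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the end-to-end prioritization script.
# SCRIPT is an assumed override hook; the default path is the one from this PR.
SCRIPT="${SCRIPT:-./scripts/run-end-to-end-prioritization.sh}"

run_prioritization() {
  # Defaults mirror the example in this PR description.
  local load_pattern="${1:-cycle}"
  local endpoint_path="${2:-prioritization-token-counting}"

  # Fail early with a clear message if the script is not available,
  # e.g. when run from outside the repo root.
  if [ ! -x "$SCRIPT" ]; then
    echo "script not found or not executable: $SCRIPT" >&2
    return 1
  fi

  # Pass the settings as per-command environment variables,
  # exactly as in the one-line example.
  LOAD_PATTERN="$load_pattern" ENDPOINT_PATH="$endpoint_path" "$SCRIPT"
}
```

After sourcing the snippet from the repo root, `run_prioritization` with no arguments is equivalent to the one-line example, and `run_prioritization cycle prioritization-token-counting` passes the same values explicitly.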
What to Check
Verify that the following are valid
Other Information