stuartleeks / aoai-simulated-api

An exploration into creating a simulated API implementation for Azure OpenAI (AOAI)
MIT License

Perf issue: Enabling telemetry increases request latency #19

Closed stuartleeks closed 4 months ago

stuartleeks commented 4 months ago

The screenshots below show the perf results from running load tests on my laptop.

With a user load of 50:

| Scenario | Mean response time (ms) | P95 response time (ms) |
| --- | --- | --- |
| Without App Insights | 12 | 20-30 |
| With App Insights | 42 | 200-250 |

Screenshot of locust chart without App Insights: (image)

Screenshot of locust chart with App Insights: (image)

stuartleeks commented 4 months ago

Tested setting `OTEL_PYTHON_DISABLED_INSTRUMENTATIONS=azure-sdk,django,fastapi,flask,psycopg2,requests,urllib,urllib3` but it made no difference.
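For context, `OTEL_PYTHON_DISABLED_INSTRUMENTATIONS` is read by the OpenTelemetry Python distro at startup as a comma-separated list of instrumentation names to skip. A small sketch of how the value above is interpreted (the parsing here is illustrative, not the library's actual code), and a reminder that the variable must be set before telemetry is configured:

```python
import os

# Must be in the environment before OpenTelemetry instrumentation is
# configured, otherwise it has no effect.
os.environ["OTEL_PYTHON_DISABLED_INSTRUMENTATIONS"] = (
    "azure-sdk,django,fastapi,flask,psycopg2,requests,urllib,urllib3"
)

# Illustrative parsing: the distro treats the value as a comma-separated
# list of instrumentation names to disable.
disabled = [
    name.strip()
    for name in os.environ["OTEL_PYTHON_DISABLED_INSTRUMENTATIONS"].split(",")
]
print(disabled)
```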

Commented out the `FastAPIInstrumentor.instrument_app` call and the results looked better.

With a user load of 50:

| Scenario | Mean response time (ms) | P95 response time (ms) |
| --- | --- | --- |
| Without App Insights | 12 | 20-30 |
| With App Insights | 42 | 200-250 |
| With App Insights but without `FastAPIInstrumentor.instrument_app` | 17 | 30-100 |

Screenshot of locust test showing improved results with `FastAPIInstrumentor.instrument_app` commented out: (image)

stuartleeks commented 4 months ago

I initially added `FastAPIInstrumentor` when exploring telemetry. It is currently used to provide traces, but the key telemetry at the moment is metrics, so I will remove the `instrument_app` call.