Open · rajm3180 opened 3 months ago
We are trying to run our benchmarking exercise against gpt-4o using this benchmarking tool, but we are seeing different e2e_avg latencies from the tool and the Azure portal: the e2e_avg reported by the benchmarking tool is at least twice the value reported on the Azure portal.

Command used:

```
python -m benchmark.bench load --temperature 0.0 --shape-profile custom --deployment 'deployment name' --max-tokens 200 --context-tokens 20000 --api-version 2024-02-01 --rate 10 --duration 600 https://genai-stg-westus3-1.openai.azure.com/
```

---

Hey @rajm3180, this could be due to how the latency is recorded by the two methods. This benchmarking tool (and the improved version) measures client-side latency, while the metrics in the Azure dashboard measure server-side latency, which may not include the time taken for data transfer, content safety checks, and some elements of request authentication and the handshake. If you are interested in the latency experienced by users of your application, you should generally rely on the client-side latency measured by this repo.
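For clarity, here is a minimal sketch of what the client-side measurement captures, assuming the v1+ `openai` Python SDK; the endpoint and API version are taken from the command above, while the deployment name and API key are placeholders. The timer spans everything the client waits for (network transfer, auth/handshake, content safety, queueing), which is why it can exceed the server-side number shown in the portal:

```python
import time
from openai import AzureOpenAI  # assumes the v1+ `openai` SDK is installed

# Placeholder key and deployment; endpoint and api_version match the command above.
client = AzureOpenAI(
    azure_endpoint="https://genai-stg-westus3-1.openai.azure.com/",
    api_version="2024-02-01",
    api_key="<your-api-key>",
)

start = time.perf_counter()  # clock starts before the request leaves the client
response = client.chat.completions.create(
    model="<deployment name>",  # Azure deployment name, not the base model name
    temperature=0.0,
    max_tokens=200,
    messages=[{"role": "user", "content": "Hello"}],
)
elapsed = time.perf_counter() - start  # client-side e2e: includes transfer, auth, content safety, etc.
print(f"client-side e2e latency: {elapsed:.3f}s")
```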