Open yhsparrow opened 2 weeks ago
@yhsparrow, thanks for flagging this issue and for the detailed rundown of the problems you're hitting with telemetry data in Application Insights during your load tests. Really appreciate it!
I wanted to highlight that Azure Application Insights does implement throttling mechanisms to manage the volume of telemetry being sent, so that the service remains performant and cost-effective. Adaptive sampling is enabled by default in the latest versions of the Application Insights SDK. Ref: https://learn.microsoft.com/azure/azure-monitor/app/sampling-classic-api
That said, as you pointed out, I understand the need for finer control over telemetry throttling and sampling, especially for something like load testing or performance benchmarking.
We will review this request. While I cannot provide a timeline for a potential release of this feature, we will keep this thread updated with any changes.
Even if this were configurable in DAB, throttling at the Application Insights level could still produce the same outcome. This is worth investigating to see whether a configuration setting (or settings) could mitigate it, or whether we should address testing and high-volume scenarios through different sinks instead (just one idea). We don't want to solve one issue by introducing two more.
@yhsparrow, it would really help us if you could add some screenshots as well. Were you also seeing around 4 req/sec on the Performance tab? And could you check the Metrics tab under Monitoring and tell us what you see?
What happened?
Issue Description: While conducting load testing on an API utilizing the DataApiBuilder (DAB), it was observed that the telemetry data available in Live Metrics and other Application Insights tools only reflects a fraction of the actual traffic. Specifically, during a test simulating 1000 requests per second, Live Metrics reported processing only around 4 requests per second. This discrepancy is believed to be due to the current telemetry throttling mechanism, which significantly samples or throttles the telemetry data, thus not providing a true picture of the system's performance under load.
Impact: This issue makes it challenging to accurately monitor and assess the application's performance and health during high-load scenarios, which is critical for capacity planning, performance tuning, and ensuring the reliability of the service.
Steps to Reproduce:
Expected Behavior: The telemetry data in Application Insights should accurately reflect the load generated by the test, allowing for real-time monitoring and analysis of the application's performance under stress.
Actual Behavior: The telemetry data significantly underreports the actual traffic, indicating only about 4 requests per second in Live Metrics, due to aggressive telemetry sampling or throttling.
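To put the reported figures in perspective, here is a minimal Python sketch of the arithmetic. It is not the Application Insights sampling algorithm; it assumes a simplified model in which telemetry is sampled down to a fixed per-second cap. The function names and the cap of 5 items per second are hypothetical values chosen for illustration, and only the 1000 and 4 requests per second come from this report.

```python
def retained_percentage(reported_rate: float, generated_rate: float) -> float:
    """Percentage of generated requests that actually show up in telemetry."""
    if generated_rate <= 0:
        raise ValueError("generated_rate must be positive")
    return 100.0 * reported_rate / generated_rate


def sampling_percentage_for_cap(incoming_rate: float, cap_per_second: float) -> float:
    """Sampling percentage a simple rate-capped sampler would need.

    Simplified model (an assumption): keep everything while traffic is under
    the cap, otherwise scale the percentage down so output stays at the cap.
    """
    if incoming_rate <= 0:
        return 100.0
    return min(100.0, 100.0 * cap_per_second / incoming_rate)


if __name__ == "__main__":
    # Figures from this report: 1000 req/s generated, about 4 req/s reported.
    print(f"retained: {retained_percentage(4, 1000):.2f}%")
    # Hypothetical cap of 5 items/s, used only to illustrate the model.
    print(f"needed under cap: {sampling_percentage_for_cap(1000, 5):.2f}%")
```

Under this model, a reported rate of about 4 requests per second out of 1000 means roughly 0.4% of requests are visible, which is in the range a low per-second cap would produce. Whether a cap like this is what actually drives the Live Metrics number here is still to be confirmed.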
Feature Request: I propose the introduction of a feature or configuration option within DAB that allows users to adjust the level of telemetry throttling or sampling, especially for scenarios requiring precise monitoring and diagnostics, such as load testing or performance benchmarking. This feature would enable developers and system administrators to get a more accurate picture of the application's behavior under various load conditions, improving the observability and manageability of applications built with DAB.
Potential Benefits:
Thank you for considering this feature request. I believe it would significantly enhance the utility and flexibility of monitoring, performance testing and benchmarking for DAB, especially for applications with high throughput demands.
Version
1.1.7
What database are you using?
Azure SQL
What hosting model are you using?
Container Apps
Which API approach are you accessing DAB through?
REST
Relevant log output
No response