artilleryio / artillery

The complete load testing platform. Everything you need for production-grade load tests. Serverless & distributed. Load test with Playwright. Load test HTTP APIs, GraphQL, WebSocket, and more. Use any Node.js module.
https://www.artillery.io
Mozilla Public License 2.0

Datadog metrics not reported correctly #1569

Open meetveera opened 2 years ago

meetveera commented 2 years ago

Hi @hassy - I am running Artillery tests on Kubernetes as a CronJob. It runs 4 tests with different environments and tags. After every 3-4 runs, though, the metrics reported to Datadog are not correct. The Datadog dashboard shows 10 vusers created but only 7, 8, or 9 completed, yet when I check the logs I can see that all the vusers completed successfully. I am seeing this issue only for load tests with more than 3 tests running simultaneously. Could you guide me on this? Thanks a lot.

hassy commented 2 years ago

hi @meetveera 👋 that's odd, Artillery's Datadog integration sends vusers.created metric as a counter to Datadog, so you'd expect the counts to match as long as you're summing the metric over the correct time period.
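To illustrate the point about summing over the correct time period, here is a minimal sketch. The `CounterPoint` shape, the `sumCounter` helper, and the sample values are all hypothetical and are not part of Artillery or the Datadog API; the sketch only shows how a query window that ends too early makes a counter total look short.

```typescript
// Hypothetical data shape: one counter point per reporting interval.
interface CounterPoint {
  timestampMs: number; // start of the reporting interval (ms)
  value: number;       // increments reported during that interval
}

// Sum a counter over the half-open window [fromMs, toMs).
function sumCounter(points: CounterPoint[], fromMs: number, toMs: number): number {
  return points
    .filter((p) => p.timestampMs >= fromMs && p.timestampMs < toMs)
    .reduce((total, p) => total + p.value, 0);
}

// Made-up example: 10 vusers reported across three 10-second intervals.
const t0 = 0; // arbitrary start time for the illustration
const createdPoints: CounterPoint[] = [
  { timestampMs: t0, value: 4 },
  { timestampMs: t0 + 10000, value: 3 },
  { timestampMs: t0 + 20000, value: 3 },
];

// A window covering the whole run sees every point.
const fullRunTotal = sumCounter(createdPoints, t0, t0 + 30000);
// A window that ends one interval early misses the last point.
const truncatedTotal = sumCounter(createdPoints, t0, t0 + 20000);

console.log({ fullRunTotal, truncatedTotal });
```

In this made-up data the full window sums to 10 and the truncated window to 7, so a dashboard whose time range cuts off the last interval would under-report even though every point was sent.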

meetveera commented 2 years ago

Hi @hassy - As I indicated above, vusers.created is reported correctly every time, but there is a mismatch in vusers.completed, along with http.requests and http.response as well. When I check the logs, everything is displayed correctly. It looks like the metrics are not reported correctly to Datadog when running load tests.

hassy commented 2 years ago

I'm not sure why that's happening, all of those metrics should be reported in the exact same way by Artillery's Datadog integration. 🤔

meetveera commented 1 year ago

Hi @hassy Good Morning:

I am still getting the same problem. Is this related to the problem below?

Warning: multiple batches of metrics for period 1662051730000 2022-09-01T17:02:10.000Z

I see this sometimes with load tests running with a lot of concurrency.

If I understand correctly, it means that messages from some of the Artillery workers don't get processed within a certain time window (20-30 seconds or so, not entirely sure). This has no effect on the number of requests actually sent to the API you're testing, but the metrics from that worker for that 10-second period won't be reflected in Artillery's report or in the Datadog dashboard.
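The behaviour described above can be pictured with a toy model. This is a sketch under stated assumptions, not Artillery's actual aggregation code: the `WorkerBatch` shape, the `aggregateCompleted` helper, and the 30-second cutoff are all hypothetical, chosen only to show how one late worker batch could turn 10 completed vusers into 7 reported.

```typescript
// Hypothetical shape of one worker's metrics for one reporting period.
interface WorkerBatch {
  workerId: number;
  periodMs: number;    // start of the reporting period (epoch ms)
  arrivedAtMs: number; // when the aggregator received the batch
  completed: number;   // vusers completed by this worker in the period
}

// Assumed cutoff; the comment above guesses 20-30 seconds.
const LATE_CUTOFF_MS = 30000;

// Sum "completed" per period, skipping batches that arrive after the cutoff.
function aggregateCompleted(batches: WorkerBatch[], cutoffMs: number): Record<number, number> {
  const totals: Record<number, number> = {};
  for (const b of batches) {
    if (b.arrivedAtMs - b.periodMs > cutoffMs) {
      continue; // late batch: left out of the reported total
    }
    totals[b.periodMs] = (totals[b.periodMs] || 0) + b.completed;
  }
  return totals;
}

// Period taken from the warning quoted above; the batches are made up.
const period = 1662051730000;
const batches: WorkerBatch[] = [
  { workerId: 1, periodMs: period, arrivedAtMs: period + 11000, completed: 4 },
  { workerId: 2, periodMs: period, arrivedAtMs: period + 12000, completed: 3 },
  { workerId: 3, periodMs: period, arrivedAtMs: period + 45000, completed: 3 }, // late
];

// What the workers actually did versus what the model reports.
const actuallyCompleted = batches.reduce((n, b) => n + b.completed, 0);
const reportedCompleted = aggregateCompleted(batches, LATE_CUTOFF_MS)[period] || 0;

console.log({ actuallyCompleted, reportedCompleted });
```

In this model the workers complete all 10 vusers, but the third batch misses the cutoff, so only 7 are reported. That matches the symptom in this thread: the logs look right while the dashboard total comes up short.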

I tried adding an after scenario hook with a think of 60 seconds, and that has minimized the issue, but I guess there has to be some sort of fix to handle such issues. `after: flow: