Open pnomolos opened 7 years ago
Hello @pnomolos, thanks for the suggestion!
The main (and only) downside of this approach is that we lose granularity, since `time` calls are aggregated as a histogram in dogstatsd (avg, median, percentiles), so we would be sending pre-aggregated data to dogstatsd.
Would the `batch` function work for you? It should greatly reduce the number of pushes to statsd while keeping the full data.
What do you think?
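For reference, a self-contained sketch of the batching idea. A stand-in client is used here purely for illustration (the metric name and loop are made up); the real client is `Datadog::Statsd`, whose `batch` buffers metrics and flushes them in as few packets as possible:

```ruby
# Stand-in statsd client, just to illustrate batching semantics; the real
# client is Datadog::Statsd from dogstatsd-ruby.
class StubStatsd
  attr_reader :packets

  def initialize
    @packets = []
    @buffer = nil
  end

  # Outside a batch, every timing is its own push (one packet on the wire).
  def timing(metric, ms)
    line = "#{metric}:#{ms}|ms"
    @buffer ? @buffer << line : @packets << line
  end

  def time(metric)
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    result = yield
    elapsed_ms = ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - start) * 1000).round
    timing(metric, elapsed_ms)
    result
  end

  # Inside a batch, timings are buffered and sent as a single push.
  def batch
    @buffer = []
    yield self
    @packets << @buffer.join("\n")
    @buffer = nil
  end
end

statsd = StubStatsd.new
# Five per-account timings, but only one push to the wire:
statsd.batch do |s|
  5.times { |i| s.time('account.process') { i * 2 } }
end
puts statsd.packets.size  # => 1
```

So the per-iteration data still reaches dogstatsd unaggregated; only the network traffic is collapsed.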
@degemer Unfortunately that isn't viable in my use case. In the example above, `process_accounts(accounts)` is a call out to a third-party library, but the processing time should scale linearly with the number of accounts.
In my case I'm running about 600 of these jobs per day (with varying numbers of accounts per job) and I'm trying to get a baseline for the time it takes to do the jobs, normalized by the number of accounts that are being processed.
If I was able to add a call per-account I definitely would, but that's not the case here :(
I have to time some bulk operations, and it's not viable to push to statsd on every iteration of the loop. What would be nice, however, is something like the following:
Internally this would change `time` to something similar to the following.

I can open a PR if this sounds like a good idea :)
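A minimal stand-alone sketch of such a `time` variant. The `count:` keyword is illustrative, not the library's current API, and `timing` here just records the value instead of writing to the wire as the real client would:

```ruby
# Sketch of a `time` that normalizes elapsed time by a count.
# In dogstatsd-ruby this would live on the Statsd client and call the
# real `timing`; here `timing` only records the value for illustration.
class NormalizingTimer
  attr_reader :reported_ms

  def timing(_metric, ms)
    @reported_ms = ms
  end

  # Time the block once; if count: is given, report elapsed / count so
  # the metric reads as "time per item" rather than total time.
  def time(metric, count: 1)
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    result = yield
    elapsed_ms = (Process.clock_gettime(Process::CLOCK_MONOTONIC) - start) * 1000
    timing(metric, (elapsed_ms / count).round)
    result
  end
end

timer = NormalizingTimer.new
timer.time('account.process', count: 100) { sleep 0.1 }
puts timer.reported_ms  # typically 1 (elapsed ~100ms / 100 accounts)
```

The call site would then be a single `statsd.time('account.process', count: accounts.size) { process_accounts(accounts) }`, giving a per-account baseline without one push per iteration.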