DataDog / datadog-go

go dogstatsd client library for datadog
MIT License

Optimize getContext and getContextAndTags #253

Closed · martin-sucha closed this 2 years ago

martin-sucha commented 2 years ago

It is not necessary to do multiple allocations and copies; a single pass is enough.

(Two profiler screenshots, 2022-02-11)

martin-sucha commented 2 years ago

> I'm curious about your use case and what brought you to optimizing this part of the client.

We switched away from datadog-go several years ago to another statsd library because datadog-go did not have aggregation support at the time and was spending too much time sending packets. Now that datadog-go has aggregation support, I was checking whether we could switch back, since the other library does not support distribution metrics, which I'd like to use in some places. As part of that experiment, I profiled both versions.

As you can see in the profiler image in the original post, datadog-go was taking about 2.2% of CPU time in the staging environment, and getContext/getContextAndTags accounted for the majority of that. At the same time, it was obvious from the flame graph that the function could be optimized fairly easily. After the change, the profiler shows about 1.6% CPU for datadog-go.

I also tried the Prometheus Go client (with counter vectors only) for comparison, and that is around 1.4% CPU, so it's much closer now.

> Would you mind sharing how many points per second you're sending, what type of metrics ...

In the staging environment where I tested this, it's about 42k metrics per second in one pod before aggregation (as shown by the `datadog.dogstatsd.client.metrics_by_type` metric):

| Metric type  | Rate         |
| ------------ | ------------ |
| counter      | 34k / second |
| timing       | 7k / second  |
| histogram    | 800 / second |
| gauge        | 30 / second  |
| set          | 0            |
| distribution | 0            |