Netflix-Skunkworks / spectatord

A high performance metrics daemon
Apache License 2.0
24 stars 5 forks source link

fix monotonic counter implementations #90

Closed copperlight closed 3 months ago

copperlight commented 3 months ago

This change fixes a couple of assumptions that were made in #87 and #88. These changes were made to address a usage issue for monotonic counters in the spectator-go thin client library. For that library, a common use case is to sample uint64 monotonic data sources, so the data type for that meter type in the library was set to uint64. This lead to a need to cascade that change into spectatord.

However, changing the existing data type for C from double to uint64_t caused the monotonic counter usage in atlas-system-agent to start partially failing on parsing numbers. When we made these changes, we overlooked the fact that the double data type is mostly used when monotonic data sources need to be converted into base units for recording values. So, a monotonic data source recording microsecond values would have to be divided by 1000, and this results in a fractional number, which is not parsed by strtoull.

Example error message:

While parsing sys.pressure.some,xatlas.process=atlas-system-agent,id=io:117.094624:
Got 117 parsing value, ignoring chars starting at .094624

The correct way to support all of these use cases, and to preserve backwards compatibility, is to introduce a new meter type MonotonicCounterUint which supports the uint64_t data type and handles rollovers, while reverting the changes to MonotonicCounters (C) and MonotonicSampled (X). In reverting the changes, we kept the original implemenation of disallowing negative value updates, because there is not a use case where that should happen.

We chose not to create a MonotonicSampledUint at this point in time, because this is an experimental meter type and there is not currently any demand for it.

This strategy mirrors the spectator-java implementation:

This change updates the admin server /metrics endpoint, so that the new meter type will be reflected in the output. A stats block is also added to the payload to help make it easier to get a sense of the number of meters that are in use.

copperlight commented 3 months ago

Can you update the readme to include the U type?

That change is covered in this PR.

image