micrometer-metrics / micrometer

An application observability facade for the most popular observability tools. Think SLF4J, but for observability.
https://micrometer.io
Apache License 2.0

Log delta count in addition to throughput in LoggingMeterRegistry #5548

Open fstaudt opened 1 month ago

fstaudt commented 1 month ago

Please describe the feature request.

In LoggingMeterRegistry, instruments such as counter, timer and histogram are logged only with a throughput (e.g. throughput=0.016667/s). I propose adding the count next to the throughput in the logs for those instruments (e.g. throughput=0.016667/s count=1).

Rationale

For instruments such as counter, timer and histogram, throughput is not always the most useful information to log. In some cases, users want to know the count as it is exported by other meter registries (e.g. OTLP).

Deriving the count from the throughput requires knowing the step, which is not always obvious from the logs (especially when logInactive is disabled). Adding the count next to the throughput would make the logs clearer.
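To illustrate the arithmetic involved, here is a minimal sketch in plain Java. The class and method names are hypothetical and not part of Micrometer, and the one-minute step is an assumption (it is what the example value 0.016667/s suggests). The delta count is simply the throughput multiplied by the step length in seconds, so the same logged throughput maps to different counts under different steps.

```java
import java.time.Duration;

class StepCountSketch {

    // Hypothetical helper (not a Micrometer API): recover the per-step
    // (delta) count from a logged throughput, given the registry's step.
    static long deltaCount(double throughputPerSecond, Duration step) {
        double stepSeconds = step.toMillis() / 1000.0;
        // The logged throughput is rounded, so round to the nearest whole count.
        return Math.round(throughputPerSecond * stepSeconds);
    }

    // Inverse direction: the throughput implied by a delta count over one step.
    static double throughputPerSecond(long deltaCount, Duration step) {
        return deltaCount / (step.toMillis() / 1000.0);
    }

    public static void main(String[] args) {
        Duration oneMinute = Duration.ofMinutes(1);   // assumed step
        Duration tenSeconds = Duration.ofSeconds(10); // alternative step, for contrast

        // The value from the issue text, read with an assumed one-minute step.
        System.out.println(deltaCount(0.016667, oneMinute));

        // The same throughput corresponds to different counts per step.
        System.out.println(deltaCount(0.1, tenSeconds));
        System.out.println(deltaCount(0.1, oneMinute));
    }
}
```

Without the step, a reader of the log cannot tell which of these counts a given throughput corresponds to, which is the ambiguity described above.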

Additional context

N/A

I can provide a PR if it is OK for you.

shakuzen commented 1 month ago

> In some cases, users want to know the count as it is exported by other meter registries (e.g. OTLP).

The count exported by other registries depends on what is expected with that registry. Some use a cumulative count, some use a delta count, some report each increment, some might use a step-normalized throughput. Since LoggingMeterRegistry is a StepMeterRegistry its meters are step meters, which for count means delta counts. But if you're trying to compare it with PrometheusMeterRegistry, for example, that is going to have cumulative counts instead. If we made it clear the logged count is a step count (delta count), that might help alleviate confusion, but it would still be hard to meaningfully compare it with other registries that aren't also a normal StepMeterRegistry. Ultimately, I'm not sure using the LoggingMeterRegistry as a means to tell what is exported by other registries is the best approach. Some registries themselves have logging that can be enabled to log the payload sent to the backend, if that's the level of debugging desired. I guess it might be helpful to understand more the intended use case here for the LoggingMeterRegistry. That said, I'm not opposed to showing a count, but we would want to take the above into consideration and not encourage people to use the LoggingMeterRegistry for some purpose it isn't fit for.
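As a rough illustration of the delta versus cumulative distinction, the following plain-Java sketch converts per-step counts into running totals. It does not use Micrometer; the class and method names are hypothetical, and the sample values are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

class CountStylesSketch {

    // Hypothetical helper: turn per-step (delta) counts into the running
    // totals that a cumulative-style registry would report.
    static List<Long> toCumulative(List<Long> deltaCounts) {
        List<Long> totals = new ArrayList<>();
        long running = 0;
        for (long delta : deltaCounts) {
            running += delta;
            totals.add(running);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Illustrative increments observed in three consecutive steps.
        List<Long> deltas = List.of(1L, 0L, 3L);
        System.out.println("delta:      " + deltas);
        System.out.println("cumulative: " + toCumulative(deltas));
    }
}
```

The two series describe the same increments, but their values only match in the first step, which is why a logged step count is hard to compare directly with a cumulative one.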

fstaudt commented 1 month ago

> Ultimately, I'm not sure using the LoggingMeterRegistry as a means to tell what is exported by other registries is the best approach. Some registries themselves have logging that can be enabled to log the payload sent to the backend, if that's the level of debugging desired.

In our use case, we want to use:

We considered logging at the level of the OpenTelemetry collector for DEV environments, but we wanted to avoid extra network costs between the applications and the collector. Moreover, metrics logged in the collector are aggregated across all applications in the cluster, which makes it very hard for developers to find the metric they are interested in.

We don't plan to use LoggingMeterRegistry to compare metrics with other registries; I agree that it is better to have logging capabilities in each registry (if needed). Our goal is only to have similar numbers in both approaches.

> If we made it clear the logged count is a step count (delta count), that might help alleviate confusion

I also agree that using delta_count=1 instead of just count=1 would make it clearer that the count displayed in the logs is not cumulative.

shakuzen commented 1 month ago

Sounds reasonable to me. Would you like to make a pull request for it?

fstaudt commented 1 month ago

> Sounds reasonable to me. Would you like to make a pull request for it?

Yes I will.

Thanks