spring-projects / spring-retry

2.15k stars 514 forks source link

`MetricsRetryListener` Feedback #463

Closed mlichtblau closed 5 days ago

mlichtblau commented 3 weeks ago

Hi, while upgrading to the most recent version of spring-retry I found the new MetricsRetryListener. I think they will be super useful, so I wanted to share some feedback how I think they could be improved. Feel free to let me know if you are interested in a contribution of any of the points. Or if you think my requests are too specific I can also implement my own RetryListener.

Name Prefix

We only allow meters with a certain prefix and filter out the rest. Allowing to customize the metric prefix would make this much easier. Micrometer for example allows specifying the namePrefix like this for many metrics.

Metric Cardinality Concerns

Many observability tools warn about using tags / labels for unbounded values. E.g. Prometheus:

Do not use labels to store dimensions with high cardinality (many different label values), such as user IDs, email addresses, or other unbounded sets of values.

The retry count is an example of such an unbounded value if there is no maxAttempts specfied on the retry template. I can add my own customTagsProvider for unbounded (or very high max attempts) retrytemplates that has tags for retry counts in buckets (1-10, 11-100, 100-1000, >1000). The cardinality is then bounded by the number of buckets. Similar for exceptions, I would like to only tag a set of exceptions and if the exception I would like to simply add an unknown_exception tag and thereby bound the number of exceptions again. However, I don't think it is currently possible to opt-out of the default tags retry.count and exception and only use the custom tags provider. My request would be to provide the existing tag providers, but allow either to opt-in or opt-out.

artembilan commented 3 weeks ago

Thank you for feedback, @mlichtblau !

I'm not very good with metrics, but here are some my naive answers to your concerns:

Micrometer for example allows specifying the namePrefix like this for many metrics.

Doesn't look like that it is consistent with many other metrics: https://github.com/micrometer-metrics/micrometer/blob/main/micrometer-core/src/main/java/io/micrometer/core/instrument/binder/kafka/KafkaMetrics.java#L65 https://github.com/micrometer-metrics/micrometer/blob/main/micrometer-core/src/main/java/io/micrometer/core/instrument/binder/commonspool2/CommonsObjectPool2Metrics.java#L52

Plus all those new based on the Observation, does not come with such a prefix customization: https://github.com/spring-projects/spring-kafka/blob/main/spring-kafka/src/main/java/org/springframework/kafka/support/micrometer/KafkaListenerObservationConvention.java#L38

Probably there is the way to customize names from a central point like MeterRegistry, like the mentioned MeterFilter.

Many observability tools warn about using tags / labels for unbounded values.

If I understood correctly that is about long, while our .and("retry.count", "" + context.getRetryCount()) is maxed to Integer.MAX_VALUE. Not sure, though, how is that practical to have such a forever retry in the target projects...

I would like to only tag a set of exceptions and if the exception I would like to simply add an unknown_exception tag and thereby bound the number of exceptions again.

That is probably again can be done on the MeterRegistry via MeterFilter:

    /**
     * Rename a tag key for every metric beginning with a given prefix.
     * @param meterNamePrefix Apply filter to metrics that begin with this name.
     * @param fromTagKey Rename tags matching this key.
     * @param toTagKey Rename to this key.
     * @return A tag-renaming filter.
     */
    static MeterFilter renameTag(String meterNamePrefix, String fromTagKey, String toTagKey) {

...

    /**
     * Suppress tags with given tag keys.
     * @param tagKeys Keys of tags that should be suppressed.
     * @return A tag-suppressing filter.
     */
    static MeterFilter ignoreTags(String... tagKeys) {

That's my best understanding what is going on.

We can ask @jonatan-ivanov for further insight.

Thanks again.

jonatan-ivanov commented 3 weeks ago

We only allow meters with a certain prefix and filter out the rest. Allowing to customize the metric prefix would make this much easier. Micrometer for example allows specifying the namePrefix like this for many metrics.

As Artem mentioned above, prefix is not a consistently widespread thing in Micrometer instrumentation and we might have more instrumentation without the ability of providing a prefix than with it. I'm not against adding it to here but Micrometer allows modifying the name and the tags of a Meter using a MeterFilter so in this case you can utilize it as well. I assume you are already using it for other metrics that does not let you setting the prefix.

The retry count is an example of such an unbounded value [...]

We discussed this before with Artem, you can detect if this is causing an issue using Micrometer's HighCardinalityTagsDetector and you can remove the tag using a MeterFilter.

Similar for exceptions, I would like to only tag a set of exceptions and if the exception I would like to simply add an unknown_exception tag and thereby bound the number of exceptions again. However, I don't think it is currently possible to opt-out of the default tags retry.count and exception and only use the custom tags provider.

It is totally possible to remove the retry count tag and transform the existing exception tag (leave what you are interested in and map to unknown what you are not) using a MeterFilter, please see the docs: https://docs.micrometer.io/micrometer/reference/concepts/meter-filters.html

artembilan commented 5 days ago

Closed as Won't Fix in favor of a MeterFilter API from Micrometer.