spring-projects / spring-boot

Spring Boot helps you to create Spring-powered, production-grade applications and services with absolute minimum fuss.
https://spring.io/projects/spring-boot
Apache License 2.0
75.08k stars 40.67k forks source link

Add AutoConfiguration to Spring Kafka's Retry Topic Feature #28450

Closed tomazfernandes closed 2 years ago

tomazfernandes commented 3 years ago

This is a proposal for enabling auto configuration for the delayed non-blocking retries using topics feature that we introduced earlier this year on Spring for Apache Kafka 2.7.0.

More details on the feature can be found in the documentation: https://docs.spring.io/spring-kafka/docs/current/reference/html/#retry-topic

I already have a working draft with tests and should open a PR shortly so we can discuss it in more details if necessary.

The minimal configuration would be as simple as adding spring.kafka.retry-topic-enabled=true to the application yaml or properties configurations file, which would apply a default global retry configuration to all topics.

spring:
  kafka:
    retry-topic-enabled: true

With that minimal configuration, the framework would set up topics and consumers with the following defaults:

More specific configurations could also be provided in the configuration files, such as:

spring:
  kafka:
    retry-topic-enabled: true
    retry-topic:
      myGlobalConfiguration:
        attempts: 4
        back-off:
          delay: 300s
          multiplier: 2
          max-delay: 1200s
        topic-suffixing-strategy: suffix-with-index-value
        dlt-strategy: fail-on-error

Optionally, more than one configuration could be provided for more fine-grained control over the configuration:

spring:
  kafka:
    retry-topic-enabled: true
    retry-topic:
      myGlobalConfiguration:
        attempts: 4
        back-off:
          delay: 300s
          multiplier: 2
          max-delay: 1200s
        topic-suffixing-strategy: suffix-with-index-value
        dlt-strategy: fail-on-error
        exclude-topics: my-specific-topic
     mySpecificConfiguration:
        include-topics: my-specific-topic
        attempts: 5
        back-off:
          delay: 5s
        not-retry-on:
          - com.mycompany.myproject.mypackage.exceptions.MyException
          - com.mycompany.myproject.mypackage.exceptions.MyOtherException
        topic-suffixing-strategy: suffix-with-index-value
        dlt-strategy: fail-on-error

Note: If it gets too verbose for many topic configurations the user should be able to provide that in one or more separate configuration files using standard spring boot configuration features such as spring.config.import.

I'd appreciate any feedback you may have on this. Also please let me know if there are any questions or anything I can help with.

@garyrussell and @artembilan, I'd much appreciate your feedback too if and when possible.

Thank you all for the hard work you put into the awesome Spring projects!

garyrussell commented 3 years ago

@tomazfernandes I concur that it is time to consider adding support for auto configuration via Spring Boot; the feature is proving to be quite popular and is gaining traction in the community (based on feedback received).

It might be helpful to the Boot team if you can show what would be a minimal configuration (with sensible defaults) since the above is a bit overwhelming at first sight.

It may be too late to get this into Boot 2.6, but I will defer to the Boot team for that decision.

tomazfernandes commented 3 years ago

@garyrussell thanks for the feedback and support as always. I've updated the description making the minimal configuration clearer - it should be as simple as adding spring.kafka.retry-topic-enabled=true to the application yaml or properties configurations file.

The bigger yaml would work for more specific configurations - I can provide smaller examples too if it would help understand the feature.

I've added the defaults to the description.

artembilan commented 3 years ago

IMHO that YAML is too big and too complex to be considered as a plain conventional config for microservice. The goal of Spring Boot it to give us as much as possible in default shape with slight external configuration for those statically exposed beans. What your code shows is something like YAML-programming which is not so attractive as some Customizer pattern impl in the target project to supply auto-configuration.

It might be OK. but probably without those grouping: let keep it simple and microservice compatible!

Thank you for your great effort anyway, @tomazfernandes !

tomazfernandes commented 3 years ago

Hi @artembilan, thanks a lot for your feedback! I see your point, and it's definitely worth looking into ways to slim the yaml down and provide more OOTB.

The example I provided on the description is a purposely complex configuration to show some of the possibilities we might have. I looked into a few users retry topics' configurations from issues opened in Spring Kafka's Github and Stack Overflow, so we can see something closer to real life examples and how would their particular YAML configurations be using this approach.

From https://github.com/spring-projects/spring-kafka/discussions/1921

spring:
  kafka:
    retry-topic-enabled: true
    retry-topic:
      globalConfiguration:
        attempts: 4
        back-off:
          delay: 300s
          multiplier: 2
          max-delay: 1200s
        topic-suffixing-strategy: suffix-with-index-value
        dlt-strategy: fail-on-error

From https://github.com/spring-projects/spring-kafka/issues/1863

spring:
  kafka:
    retry-topic-enabled: true
    retry-topic:
      globalConfiguration:
        attempts: 2
        back-off:
          delay: 1s
          multiplier: 2
        topic-suffixing-strategy: suffix-with-index-value
        fixed-delay-topic-strategy: single-topic
        listener-container-factory: retryKafkaListenerContainerFactory

From https://stackoverflow.com/questions/69571401/dlt-message-consumption

spring:
  kafka:
    retry-topic-enabled: true
    retry-topic:
      globalConfiguration:
        attempts: 5
        dlt-handler-class: com.mycompany.myproject.mypackage.MyClass
        dlt-handler-method: xyz
        back-off:
          delay: 1s
          multiplier: 2
          max-delay: 3s
        not-retry-on:
          - com.mycompany.myproject.mypackage.exceptions.MyException
          - com.mycompany.myproject.mypackage.exceptions.MyOtherException
        listener-container-factory: kafkaListenerContainerFactory
        topic-suffixing-strategy: suffix-with-index-value

I'm not sure what's an acceptable size of a yaml configuration in boot's criteria - these examples seem reasonable to me at first, and I surely have configuration files larger than this, e.g. for external services or even Spring Kafka itself with ssl, schema registry, topics, consumers, producers, etc. And of course, the more complex and fine-grained the configuration, the bigger the configuration file will be.

What it seems to me is most of the times the users' Retry Topics' configuration should be externalized in a configuration file anyway - either on application.yml files in the project itself or using Spring Cloud Config for example.

So currently users externalizing the configuration have to do all the mapping and instantiations themselves, maybe through @Value annotations on @Configuration classes for RetryTopicConfiguration beans, or for each topic via @RetryableTopic annotations with SpEL. Maybe this might be able to help them out by only having to write the configuration once and not worrying with anything else.

tomazfernandes commented 3 years ago

I've updated the description with (hopefully) less overwhelming examples.

wilkinsona commented 3 years ago

Thanks for raising this, @tomazfernandes.

It may be too late to get this into Boot 2.6, but I will defer to the Boot team for that decision

Yeah, having reached RC, it's too late to add new features, particularly one that's as potentially complex as this one. It feels like something that ought to go in an early milestone so that it can evolve based on user feedback.

Looking at some of the YAML examples above, it appears that what's being proposed would map an arbitrary amount of YAML to an arbitrary number of RetryTopicConfiguration beans. While that's possible, it's something that we generally try to avoid in Boot.

There are some parallels here with the support for auto-configuring OAuth 2 clients. Something that I think is different, though, is that, no matter how many clients are defined in the user's configuration, only a single bean is defined:

https://github.com/spring-projects/spring-boot/blob/9cb5f035e79942baf7efeff0a23686a6ccdc6a61/spring-boot-project/spring-boot-autoconfigure/src/main/java/org/springframework/boot/autoconfigure/security/oauth2/client/servlet/OAuth2ClientRegistrationRepositoryConfiguration.java#L45-L51

If we want to pursue auto-configuration for this, and assuming that it isn't already possible, I think it would be good if retry topics could be configured in Spring Kafka in a way that isn't so bean-heavy.

tomazfernandes commented 3 years ago

Hello @wilkinsona, thanks a lot for the feedback!

Yeah, having reached RC, it's too late to add new features, particularly one that's as potentially complex as this one. It feels like something that ought to go in an early milestone so that it can evolve based on user feedback.

I agree, changing this kind of contract after it's been released is much more complex and we have no need to hurry. Though I'll admit I was a little anxious about this yesterday 😄

Looking at some of the YAML examples above, it appears that what's being proposed would map an arbitrary amount of YAML to an arbitrary number of RetryTopicConfiguration beans. While that's possible, it's something that we generally try to avoid in Boot.

You're right about having each section of the YAML mapped to a RetryTopicConfiguration instance - looks like a simple and effective strategy, coding-wise. Even though @artembilan brought some good points regarding the contract itself that I'm looking into.

And I do see your point, makes total sense. From the framework's point of view, if every feature depends on instantiating an unlimited number of beans it might quickly become a huge mess. I guess this is at least partially enforced by the lack of ways to instantiate a List of beans through the regular @Bean methods.

What I had thought for that is creating an aggregation class for these instances in Spring Kafka, so that we (and the users) could add a number of RetryTopicConfiguration instances to a single aggregating instance, that could be registered through a regular @Bean method. How does that sound?

Also, I don't really expect users creating a huge amount of configurations per app, specially if we're talking about microservices that might not even consume from so many topics in the first place. But if this strategy of letting users define a bean per configuration in Spring Kafka as is today looks like a problem in the long run, we could look into deprecating the current approach and relying only on the single aggregating bean, if that's the way we're going.

wilkinsona commented 3 years ago

What I had thought for that is creating an aggregation class for these instances in Spring Kafka … that could be registered through a regular @Bean method. How does that sound?

It sounds good. That would match the approach taken in Spring Security's OAuth support that I referenced above. The aggregation class would be analogous to InMemoryClientRegistrationRepository which is consumed as a bean.

An alternative approach would be something like Spring MVC's WebMvcConfigurer with a method that's called with some sort of retry topic registry with which the configuration's can be registered. This would be analogous to MVC's ResourceHandlerRegistry, for example.

I'm not sufficiently familiar with Spring Kafka's programming model to know which of these would be more aligned with the rest of its API, but I think either would be preferable to defining multiple RetryTopicConfiguration beans, both from the perspective of Spring Boot's auto-configuration and of users who are configuring things themselves programatically.

tomazfernandes commented 3 years ago

I'm not sufficiently familiar with Spring Kafka's programming model to know which of these would be more aligned with the rest of its API, but I think either would be preferable to defining multiple RetryTopicConfiguration beans, both from the perspective of Spring Boot's auto-configuration and of users who are configuring things themselves programatically.

Spring Kafka uses the Configurer / Registrar / Registry approach for processing the @KafkaListener annotations, and then registering the endpoints and creating the containers. The other way you can create and configure listeners is by manually instantiating either the container or a ListenerContainerFactory.

If I understand correctly, in this pattern the Registry does the actual processing, with the Configurer using a Visitor pattern to apply customizations to the Registry before processing begins.

For the RetryTopic setup, when the KafkaListenerAnnotationBeanPostProcessor is processing an @KafkaListener bean, the RetryTopicConfigurationProvider tries to provide a configuration instance for the topics currently being processed, either from the RTC beans registered in the application context or from a @RetryableTopic annotation in the @KafkaListener annotated method. If a configuration for the given topics is found, it will then be processed by a different class that orchestrates the retry and dlt endpoints registration.

Considering the RT configuration processing itself is already being handled by a separate class that expects to receive the proper RTC instance, it might make more sense to go with the repository approach, having a repository bean that would be looked up by the RTCProvider when attempting to provide the configuration. Makes sense?

Following the pattern in ClientConfigurationRepository, we might have:

public interface RetryTopicConfigurationRepository {

    @Nullable RetryTopicConfiguration findByTopicNames(Collection<String> topics);

}
tomazfernandes commented 3 years ago

Actually, I see there are cases in which the registry seems to have a more active role (KafkaListenerEndpointRegistry, LoggingMeterRegistry) and cases when it has more of a passive role (InterceptorRegistry, ResourceHandlerRegistry). I'm not really familiar with these configuration approaches, I'll keep looking into them.

tomazfernandes commented 2 years ago

Thanks a lot @wilkinsona for the examples, I've drafted how the solution would be with each of them and they both lead to solid designs!

After considering what's been presented, I'd like to propose a different approach for the auto configuration. If we decide to proceed with this, we can extend this approach to users who prefer to do it programmatically in Spring Kafka, which would also help to keep the bulk of the configuration logic there.

As @garyrussell suggested, I'll start with the simplest example and work my way up from there.

The simplest one would still be:

spring:
  kafka:
    retry-topic-enabled: true

This would set up the feature with sensible defaults that would be applied to all topics. We can discuss those defaults later if the current ones are not ideal.

The user would be able to override the defaults with, for example:

spring:
    kafka:
        retry-topic-enabled: true
        attempts: 5
        back-off:
            delay: 1s
            multiplier: 2
            max-delay: 3s

Gone are the groups as @artembilan suggested - these overrides would apply to all topics.

If the user wants to override configurations for specific topics, we could have:

spring:
    kafka:
        retry-topic-enabled: true
        attempts: 5
        back-off:
            delay: 1s
            multiplier: 2
            max-delay: 3s
        topic-overrides:
            my-specific-topic:
                dlt:
                    class: com.mycompany.myproject.mypackage.MyDltHandlerClass
                    method: myDltMethod
            my-specific-topics:
                matching: *-specific-topics-suffix   
                back-off:
                    delay: 2s

This way I think we get a smaller and more intuitive contract.

How do you feel about that, does it look like a good fit for auto configuration?

Thanks!

wilkinsona commented 2 years ago

Yes, I think that looks nice and concise now, particularly for cases where there aren't any topic-specific overrides.

snicoll commented 2 years ago

I personally don't think we should add the "topic-overrides" part of the configuration. Specifying classes and method in configuration properties is something we try to avoid generally. @tomazfernandes would you have the time to provide a PR for the first part of the config?

tomazfernandes commented 2 years ago

Hi @snicoll, thanks for looking into this.

I personally don't think we should add the "topic-overrides" part of the configuration. Specifying classes and method in configuration properties is something we try to avoid generally.

Ok, sure. What I understand from this conversation so far is that Spring Boot's auto configuration is focused on configuring the framework's components and letting the user override some sensible defaults, rather than letting users make specific configurations, such as for a specific topic or group of topics.

For those, users will still need to manually wire any externalized configurations to either @Configuration classes or the @KafkaListener annotations. I initially thought it'd be a good thing to have auto configuration handle all the wiring, but now I see that it might easily become unnecessarily messy.

Is my understanding correct?

If so my question is, for this feature, besides adding a global configuration with the given properties, do we want auto configuration to override the defaults for any other retry topic configurations users might add?

Maybe @garyrussell can chime in on this one?

@tomazfernandes would you have the time to provide a PR for the first part of the config?

Yes, sure, I'll have the time, and will start looking into this right away, thanks. I really appreciate you all taking the time to look into this.

garyrussell commented 2 years ago

I generally concur with @wilkinsona and @snicoll 's comments; let's just go with basic config (enabled, attempts, backoff and maybe, optionally, adding exception type classification - retryable and not).

For more advanced configuration, it would make more sense for the user to configure his/her own builder class to create the configuration, because of all the content assist niceness provided by the IDE. https://docs.spring.io/spring-kafka/docs/current/reference/html/#using-retrytopicconfiguration-beans

Once we have basic configuration available, we can wait for user feedback and consider additions case by case.

tomazfernandes commented 2 years ago

Sure @garyrussell, makes total sense. Thanks a lot for your feedback. With all that's been said I think got a better understanding of what auto configuration is supposed to do, and what it's not. @artembilan's first comment on YAML programming is clearer now too.

Seems like I have enough to start working on this. I should have a PR ready shortly, if anyone has any other concerns or suggestions please let me know.

Thanks!

snicoll commented 2 years ago

Closing in favor of PR #29812