spring-cloud / spring-cloud-gateway

An API Gateway built on Spring Framework and Spring Boot providing routing and more.
http://cloud.spring.io
Apache License 2.0

Spring Cloud Gateway blocking requests for route discovery #1104

Closed lucasoares closed 4 years ago

lucasoares commented 5 years ago

Hello.

I'm using Spring Cloud Gateway from spring-cloud-starter-gateway version 2.1.0.RELEASE and I need to understand why Gateway is blocking requests to perform the DiscoveryClientRouteDefinitionLocator process.

Spring Cloud Version: Greenwich.RELEASE.

I have two environments: staging and production.

In production we have a working gateway with the following latency for an /actuator/health call:

image

I was investigating why those spikes occur on a simple health call, and I found that the gateway sometimes blocks all requests (health checks as well as real microservice calls) while it discovers the routes for all my microservices.

We use Consul as the discovery server, and I tried to reproduce this latency in my staging environment (where Consul has far fewer hardware resources). The impact of this blocking is clear:

image

After increasing the Consul hardware resources the large spikes are gone, but the latency of a health call is still not ideal (there are minor spikes whenever all routes are discovered):

image

I need to ask: why is Spring Cloud Gateway blocking requests even though it has a caching feature? Shouldn't this process run in the background? What am I doing wrong? Is this really an issue with Spring Cloud Gateway?

Thank you.
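
For readers following the question about running discovery in the background, here is a minimal sketch of that idea using only the Java standard library. It is not how Spring Cloud Gateway is implemented. The class name `BackgroundRouteCache`, the `slowLookup` supplier (a stand-in for a blocking discovery call such as a Consul query) and the route names are all hypothetical. Requests read the last snapshot, and only the refresh path touches the slow lookup.

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical sketch: serve routes from a snapshot that is refreshed off the request path. */
public class BackgroundRouteCache {

  // Assumption: stands in for a blocking discovery lookup (for example a Consul query).
  private final Supplier<List<String>> slowLookup;

  // Last successfully loaded route list; starts empty until the first refresh completes.
  private final AtomicReference<List<String>> snapshot = new AtomicReference<>(List.of());

  public BackgroundRouteCache(Supplier<List<String>> slowLookup) {
    this.slowLookup = slowLookup;
  }

  /** Request path: returns the cached snapshot and never calls the discovery server. */
  public List<String> currentRoutes() {
    return snapshot.get();
  }

  /** Refresh path: the only place where the slow lookup is invoked. */
  public void refresh() {
    try {
      snapshot.set(List.copyOf(slowLookup.get()));
    } catch (RuntimeException e) {
      // Keep the previous snapshot so one failed lookup does not empty the route table
      // and does not cancel the scheduled task.
    }
  }

  /** Starts a daemon thread that refreshes the snapshot with a fixed delay between runs. */
  public ScheduledExecutorService startRefreshing(long periodMillis) {
    ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(
            runnable -> {
              Thread thread = new Thread(runnable, "route-refresh");
              thread.setDaemon(true);
              return thread;
            });
    scheduler.scheduleWithFixedDelay(this::refresh, 0, periodMillis, TimeUnit.MILLISECONDS);
    return scheduler;
  }
}
```

The trade-off in this sketch is staleness: a request may see a route list that is up to one refresh period old, in exchange for never waiting on the discovery server.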

ryanjbaxter commented 5 years ago

Yes this is an issue. What we need is a reactive DiscoveryClient implementation and we don't have one right now. There is an issue open for this in Spring Cloud Commons https://github.com/spring-cloud/spring-cloud-commons/issues/557.

We are going to work to provide this in the next Greenwich SR.

ryanjbaxter commented 5 years ago

Looks like we won't be getting this done until Greenwich.SR4 at the earliest.

lucasoares commented 5 years ago

Sad to hear that, but ok!

Then, for me, Spring Cloud Gateway isn't ready for production environments with heavy loads. I will wait for more info about that.

Thank you

TYsewyn commented 5 years ago

Reopening for Greenwich.SR4

lucasoares commented 5 years ago

Hello guys..

What version should I try to check if my Gateway can receive requests without being interrupted by the route definition locator?

Thank you all.

TYsewyn commented 5 years ago

Hi @lucasoares. You should be able to test this out with Hoxton.M3. Please be aware of this issue: spring-cloud/spring-cloud-commons#617. A temporary workaround has been proposed in this comment: https://github.com/spring-cloud/spring-cloud-commons/issues/617#issuecomment-538987899

lucasoares commented 5 years ago

@TYsewyn The reactive discovery client is exactly what I need.

I have an uptime of 70% (calling /health every 10 seconds) because the gateway sometimes takes more than 10 seconds to answer any call (routed or not).

Because of this problem, all my microservices are getting request timeouts caused by the gateway delay. Will this issue solve the problem?

Do you know of any workaround for my specific problem? Or will following the workaround in spring-cloud/spring-cloud-commons#617 (comment) do the job?

ryanjbaxter commented 5 years ago

I am not sure a reactive discovery client is going to solve the slow response time. Have you looked into why the responses are taking so long?

lucasoares commented 5 years ago

If you look at my issue description you will see that service discovery is blocking requests while the discovery process runs. I found this out because upgrading my Consul instance reduced the time of all requests, and my application was logging the service discovery process before routing each request.

Even the /health actuator request triggers the discovery client, and the gateway only routes a request or answers the health call after discovering all services. My cluster has lots of microservices.

There are two options:

I have one modification in my Spring Cloud Gateway that duplicates all route definitions:

All my services are registered in Consul with a version metadata to be used for routing.

If my microservice is named auth-service and the registered version is staging, then my gateway will create two extra route definitions:

The result will be three routes and filters:

This is needed on my use case.

Here is my implementation of this:

  /**
   * This bean overrides the default DiscoveryClientRouteDefinitionLocator to create a route based
   * on the 'version' and 'service_name' metadata of each service, while also keeping the initial
   * routes by adding a 'service_id' argument to the {@link PredicateDefinition}.
   *
   * @param discoveryClient the discovery client.
   * @param discoveryLocatorProperties the discovery locator properties.
   * @return an overriding locator that generates all paths.
   */
  @Bean
  public DiscoveryClientRouteDefinitionLocator clientRouteDefinitionLocator(
      DiscoveryClient discoveryClient, DiscoveryLocatorProperties discoveryLocatorProperties) {

    addPredicates(discoveryLocatorProperties);

    addGatewayFilters(discoveryLocatorProperties);

    return new DiscoveryClientRouteDefinitionLocator(discoveryClient, discoveryLocatorProperties) {
      @Override
      public Flux<RouteDefinition> getRouteDefinitions() {
        return super.getRouteDefinitions()
            .flatMap(
                route -> {
                  // Each route will be translated to two more routes.
                  RouteDefinition fullServiceRoute = duplicate(route, FULL_SERVICE_KEY);
                  RouteDefinition serviceVersionRoute = duplicate(route, SERVICE_VERSION);

                  if (LOGGER.isDebugEnabled()) {
                    LOGGER.debug(
                        "Route registered {}:\n{}\n{}\n{} ",
                        route.getUri(),
                        getRouteDescription(route),
                        getRouteDescription(fullServiceRoute),
                        getRouteDescription(serviceVersionRoute));
                  }

                  return Flux.just(route, fullServiceRoute, serviceVersionRoute);
                });
      }
    };
  }

TYsewyn commented 5 years ago

@ryanjbaxter I can confirm there is/was an issue with the route locator in both Greenwich.SR3 and Hoxton.M2. The blocking service discovery is being used in a reactive wrapper, but the call is still blocking. This would have been fixed with Hoxton.M3 using the reactive discovery client if we didn't have that issue with the composite client. This still needs to be fixed for Greenwich.SR4. @lucasoares Like I already said, try this with Hoxton.M3 but use the temporary workaround as mentioned in the comment I referred to.
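
To illustrate the point about a blocking call inside a reactive wrapper, here is a small sketch that uses `CompletableFuture` from the Java standard library as a stand-in for a reactive type. It is an analogy, not the gateway's code. The class `BlockingWrapperSketch`, the `blockingLookup` method, the service names and the `discovery-worker` thread name are all hypothetical. The first variant returns an asynchronous type, yet the lookup has already run on the caller's thread. The second variant hands the lookup to a separate executor.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch: an async return type does not make a blocking call non-blocking. */
public class BlockingWrapperSketch {

  // Records which thread executed the most recent lookup.
  static volatile String lookupThread;

  /** Assumption: stands in for a blocking discovery lookup against a discovery server. */
  static List<String> blockingLookup() {
    lookupThread = Thread.currentThread().getName();
    try {
      Thread.sleep(50); // simulated network round trip
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return List.of("auth-service", "billing-service");
  }

  /** Looks asynchronous, but the lookup runs on the caller's thread before the future exists. */
  static CompletableFuture<List<String>> wrappedButBlocking() {
    return CompletableFuture.completedFuture(blockingLookup());
  }

  /** The lookup runs on the supplied executor; the caller gets a future back immediately. */
  static CompletableFuture<List<String>> offloaded(ExecutorService executor) {
    return CompletableFuture.supplyAsync(BlockingWrapperSketch::blockingLookup, executor);
  }

  /** Returns the name of the thread that executed the lookup in each variant. */
  static String[] demo() {
    ExecutorService executor =
        Executors.newSingleThreadExecutor(runnable -> new Thread(runnable, "discovery-worker"));
    try {
      wrappedButBlocking().join();
      String wrappedThread = lookupThread;
      offloaded(executor).join();
      String offloadedThread = lookupThread;
      return new String[] {wrappedThread, offloadedThread};
    } finally {
      executor.shutdown();
    }
  }

  public static void main(String[] args) {
    String[] threads = demo();
    System.out.println("wrapped lookup ran on:   " + threads[0]);
    System.out.println("offloaded lookup ran on: " + threads[1]);
  }
}
```

In a server with a small number of event-loop threads, the first variant ties up one of those threads for the whole duration of the lookup. That matches the symptom described in this issue, where even a health call waits for discovery to finish.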