line / armeria

Your go-to microservice framework for any situation, from the creator of Netty et al. You can build any type of microservice leveraging your favorite technologies, including gRPC, Thrift, Kotlin, Retrofit, Reactive Streams, Spring Boot and Dropwizard.
https://armeria.dev
Apache License 2.0
4.83k stars 919 forks source link

RequestTimeout is 503 with ThrottlingStrategy #5637

Open dlvenable opened 6 months ago

dlvenable commented 6 months ago

This is a follow-on from #5572. I have found that adding a ThrottlingService decorator is causing the 408 to be replaced by a 503.

You can see this happening in my updated sample. In particular this line is what is resulting in a 503.

.decorator(ThrottlingService.newDecorator(ThrottlingStrategy.rateLimiting(10.0), new CustomThrottlingRejectHandler()))

Interestingly, adding an arbitrary Decorator doesn't have this same result. For example, adding the following line (without the ThrottlingService) still has the desired 408.

.decorator("/test", SimpleDecorator.newDecorator())

One other observation I have is that there is a correspondence between getting exception logging and the 503.

When I have the ThrottlingStrategy and get the 503, I see these logs:

[armeria-common-worker-kqueue-3-2] WARN com.linecorp.armeria.server.DefaultUnhandledExceptionsReporter - Observed 1 exception(s) that didn't reach a LoggingService in the last 10000ms(10000000000ns). Please consider adding a LoggingService as the outermost decorator to get detailed error logs. One of the thrown exceptions:
com.linecorp.armeria.server.RequestTimeoutException
    at com.linecorp.armeria.server.RequestTimeoutException.get(RequestTimeoutException.java:36)
    at com.linecorp.armeria.internal.common.DefaultCancellationScheduler.invokeTask(DefaultCancellationScheduler.java:512)
    at com.linecorp.armeria.internal.common.DefaultCancellationScheduler.lambda$start$2(DefaultCancellationScheduler.java:148)

When I don't have the ThrottlingStrategy I do not see those logs.

minwoox commented 6 months ago

If you enable this line, which sets the server error handler, https://github.com/dlvenable/armeria-sample/blob/2bf911126d2610ab4284f0adbf177999e3ea59b3/src/main/java/io/venable/samples/armeria/request_timeout/RequestTimeoutExperiment.java#L60 408 is returned. Could you check that again, please? 🤔

dlvenable commented 6 months ago

@minwoox ,

Yes, I enabled that line and also added logging. I can see that it reaches the condition and returns the 408.

Ideally we would not be using a custom error handler to resolve this. It seems that the default error handler is not being used when the throttling strategy is enabled. Is that something you could do?

If we do have to enable a custom error handler, is there a way to get the default one so that we don't have to duplicate the conditions? It appears that the default error handler is package protected.

minwoox commented 6 months ago

Ideally we would not be using a custom error handler to resolve this.

That is correct. 👍

It seems that the default error handler is not being used when the throttling strategy is enabled.

I'm not 100% sure but it seems like it's being used. We haven't implemented returning 408 yet when the client doesn't set the request fully. https://github.com/line/armeria/blob/94151152f27df5c8ad61d7d3b840fc47ca2c0161/core/src/main/java/com/linecorp/armeria/server/DefaultServerErrorHandler.java#L88-L90

So I think the server returns 503 if you don't apply your custom error handler. Let me ping you when #5579 is done so that you don't have to apply your own error handler. 😉

dlvenable commented 5 months ago

@minwoox , I want to follow up to see when you think #5579 will be completed. This will help us resolve some issues on our end.

minwoox commented 5 months ago

@dlvenable Hi! It has already been addressed and included in 1.29.0. 😉 https://github.com/line/armeria/pull/5680

dlvenable commented 5 months ago

@minwoox , Thank you for working this. I have verified that this fix works with my sample for reproducing the bug.