resilience4j / resilience4j

Resilience4j is a fault tolerance library designed for Java8 and functional programming
Apache License 2.0
9.68k stars 1.33k forks source link

Rate limiting implementation for reactor is broken #311

Closed thekalinga closed 5 years ago

thekalinga commented 5 years ago

Please ignore this completey, Just look at below comment as it makes more sense than this

Rate limiter is supposed to protect a resource against unexpected number of calls like lets say number of times an external API can called/sec

Meaning, we guard the actual calls, not the results from external call

In our current Reactor implementation (I assume the same is true for RxJava2 aswell), we are rate limiting the downstream results, not the actual upstream subscription (Refer to MonoCallable#subscribe)

Rate limiter should limit the number of time a method is called, so, for Mono.fromCallable the upstream subscription should be delayed (using delaySubscription) so that the actual method under rate limit never gets executed till a rate limit token is available

In the current implementation

~~ResilienceBaseSubscriber#onSubscribe - is a blind request to upstream without any rate limit RateLimiterSubscriber#hookOnNext - we are rate limiting downstream elements aswell, which is wrong completely, as we should never care about how many elements upstream is producing, we are interested only in rate limiting the upstream calls~~

For rate limiting downstream elements within a Flux, we have Flux.limitRate & Flux.limitRequest

So the current limiter is completely broken

thekalinga commented 5 years ago
  1. The current implementation of RateLimiterSubscriber#isCallPermitted is a blocking operation. So all the calls(+ threads) if any of the threads use the same rate limiter will be blocked (blocking thread is not a good idea to begin with) when rate limiter does not have enough tokens to issue
  2. Also, it ties downstream events rate limit with upstream subscription tho user never opted-in for both & all usecases in synchronous world are about just limiting upstream subscription, so, its odd that the same limiter attempts to limit both upstream subscription & downstream elements simultaneously

There are a total of three aspects to rate limit in reactive streams in general

  1. Rate limiting upstream subscription (applicable both to Flux & Mono) [so that actual callable upstream will not be invoked till token is available. This is the flow non reactive invocations cover]
  2. Rate limiting upstream requests (applicable only to Flux) [so that we dont overwhelm upstream if downstream subscriber is not a good citizen of reactive world. But is this even required as we already have Flux.limitRequest & Flux.limitRate?. The only usecase I can think of is sharing a single ratelimiter in multiple places, but I'm not sure how useful this will be]
  3. Rate limiting downstream elements (applicable only to Flux) [so that we dont overwhelm downstream with too many elements. The only usecase I can think of is sharing a single ratelimiter in multiple places, but I'm not sure how useful this will be. This looks like something that probably no one needs as a well behaving subscriber is expected to ask only the amount of data he can handle]

And each of them should be opt-in. Currently (2) is not at all considered as part of rate limiting, (1) & (3) come in a single package (they are not opt-in, but combined), that too by blocking thread

So, the current implementation handles none of the three scenarios correctly

storozhukBM commented 5 years ago

Hi @thekalinga, Thank you for bringing up this conversation. I think we should wait for @madgnome for comment. If I get your ideas right it should be possible to construct your solution on top of Metrics.getAvailablePermissions and RateLimiter. reservePermission

Any how if you want to create PR and demonstrate benefits of your solution, it is more than welcome.

thekalinga commented 5 years ago

@storozhukBM I'm attempting to modify exiting source to see if I can make the changes. But I've couple of hurdles

  1. I'm not quite certain about why some hooks always check for wasPermitGranted
  2. There is a lot of code is common across rate limiter, bulkhead & circuit breaker

I'm attempting to modify it now. Will share an update if I make any progress

storozhukBM commented 5 years ago

Sure, thanks.

madgnome commented 5 years ago

It seems that you are mixing 2 different concerns in the same issue and labelling everything broken.

Blocking threads

As you have mentioned RateLimiterSubscriber#isCallPermitted is a blocking operation.

This is not the most efficient solution but this is not wrong and it is compliant with the reactive stream specification since publishers are not blocked (thanks to publishOn() usage).

When I implemented the reactor API, non blocking API for rate limiter was not available. Now that RateLimiter#reservePermission is available we may optimize the current solution. I will have a look.

API

The current reactor implementation follows rxjava implementation, it was implemented this way at the time for consistency with the existing API.

As you have said rate limiting can happen in multiple places: 1) subscription (flux and mono) 2) upstream requests 3) elements flowing downstream (for flux)

The current implementation does 1 and 3.

callRemoteService(request)
   .compose(RateLimiterOperator.of(rateLimiter))

callRemoteService will only be subscribed to if permits are available and elements will only flow if permits are available.

You can achieve something similar to 2 by modifying the chain of operators:

Flux.just(request1, request2...)
    .compose(RateLimiterOperator.of(rateLimiter))
    .flatMap(r -> callRemoteService(r))

Here callRemoteService will be rate limited.

The current operator makes more sense with Mono than it does on Flux since we would only check permit once on subscription.

For Flux we could change the implementation (or introduce a new operator) to instead rate limit the requests.

Do you have example of usecases where you want to ratelimit? that would help design a better api.

RobWin commented 5 years ago

How do we proceed?

madgnome commented 5 years ago

I think we should modify rateLimit to rate limit on the subscription and change the implementation to be non blocking using reservePermission

I can raise a PR for it if we agree on the plan.

RobWin commented 5 years ago

Yes, a PR would be awesome

RobWin commented 5 years ago

I totally agree to what @thekalinga said.

I fixed the Reactor and RxJava implementation. Will be solved in 0.15.0

The new RateLimiterOperator will only limit upstream subscription and not downstream events.