google / guava

Google core libraries for Java
Apache License 2.0
50.18k stars 10.9k forks source link

RateLimiter.create called with zero warmup fails with real clock #2730

Open greggdonovan opened 7 years ago

greggdonovan commented 7 years ago

We recently hit an issue where calling RateLimiter.create(16.0, 0, TimeUnit.SECONDS) failed to rate limit after an upgrade from Guava 17 to 21. I tried to reproduce this using the FakeStopwatch but could only get it to reproduce with a real clock.

I'll be happy to submit this as PR if the change below is useful.

Bisecting tells me this changed in this commit.

bisect.sh

#!/usr/bin/env bash

function back() {
  cd ~/development/guava
  git co .
}
trap back EXIT

JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home

# I can't get guava-gwt to compile :-(
sed -i '' 's_<module>guava-gwt</module>__' pom.xml

mvn clean install -Dmaven.test.skip=true || exit 125
cd guava-tests
{ mvn test -Dtest=com.google.common.util.concurrent.RateLimiterZeroWarmupTest && exit 0; } || exit 127

guava-tests/test/com/google/common/util/concurrent/RateLimiterZeroWarmupTest.java

package com.google.common.util.concurrent;

import java.util.concurrent.TimeUnit;
import junit.framework.TestCase;

public class RateLimiterZeroWarmupTest extends TestCase {

  public void testWarmUpWithZeroWarmupAndRealClock() {
    RateLimiter limiter = RateLimiter.create(16.0, 0, TimeUnit.SECONDS);
    int totalPermits = 1000;
    double totalSleep = 0;
    for (int i = 0; i < totalPermits; i++) {
      totalSleep += limiter.acquire();
    }
    assertFalse("totalSleep should be more than 60s, not 0", totalSleep == 0.0d); // <-- this fails
  }
}

Using the FakeStopwatch fails to reproduce:

diff --git a/guava-tests/test/com/google/common/util/concurrent/RateLimiterTest.java b/guava-tests/test/com/google/common/util/concurrent/RateLimiterTest.java
index b143bd94a..0c4de6c86 100644
--- a/guava-tests/test/com/google/common/util/concurrent/RateLimiterTest.java
+++ b/guava-tests/test/com/google/common/util/concurrent/RateLimiterTest.java
@@ -312,6 +312,14 @@ public class RateLimiterTest extends TestCase {
         "R0.20, R0.10, R0.10, R0.10"); // #7 (cont.), note, this matches #5
   }

+  public void testWarmUpWithZeroWarmup() {
+    RateLimiter rateLimiter = RateLimiter.create(stopwatch, 5.0, 0, TimeUnit.SECONDS, 3.0);
+    for (int i = 0; i < 5; i++) {
+      rateLimiter.acquire(); // #1
+      }
+      assertEvents("R0.00, R0.20, R0.20, R0.20, R0.20" /* #1 */); // <-- this works
+  }
+
   public void testBurstyAndUpdate() {
     RateLimiter rateLimiter = RateLimiter.create(stopwatch, 1.0);
     rateLimiter.acquire(1); // no wait

Guava workaround to use SmoothBursty when warmupSeconds == 0:

diff --git a/guava/src/com/google/common/util/concurrent/RateLimiter.java b/guava/src/com/google/common/util/concurrent/RateLimiter.java
index 945e2d168..cc99bbcbe 100644
--- a/guava/src/com/google/common/util/concurrent/RateLimiter.java
+++ b/guava/src/com/google/common/util/concurrent/RateLimiter.java
@@ -159,6 +159,9 @@ public abstract class RateLimiter {
    */
   public static RateLimiter create(double permitsPerSecond, long warmupPeriod, TimeUnit unit) {
     checkArgument(warmupPeriod >= 0, "warmupPeriod must not be negative: %s", warmupPeriod);
+    if (warmupPeriod == 0) {
+      return create(permitsPerSecond);
+    }
     return create(
         SleepingStopwatch.createFromSystemTimer(), permitsPerSecond, warmupPeriod, unit, 3.0);
   }
liufuyang commented 4 years ago

We encountered this same thing by setting warmupPeriod = 0, and the bad thing was we thought it was safe to change warmupPeriod = 5 to warmupPeriod = 0. But it actually make the rate limiter not limit any rate anymore, and created a huge CPU and memory cost in production. Would be nice to fix this in one way or another https://github.com/google/guava/pull/3724 @cpovirk @cpovirk

BirdHowl commented 4 years ago

I've observed this issue in the latest version of guava as well.

Running this snippet:

import com.google.common.util.concurrent.RateLimiter;
import java.util.Duration;
public class App {
    public static void main(String[] args) {
        final RateLimiter rateLimiter = RateLimiter.create(1.0, Duration.ZERO);
        while (true) {
            System.out.println("Loop start: " + System.currentTimeMillis());
            rateLimiter.acquire();
        }
    }
}

Gives me this type of output (the loop repeating several times within a single millisecond):

Loop start: 1585178916022
Loop start: 1585178916022
Loop start: 1585178916022
Loop start: 1585178916022
Loop start: 1585178916022
Loop start: 1585178916022
Loop start: 1585178916022
Loop start: 1585178916022

Which seems ludicrous and has surprised everybody I've shown this to.


I see two reasonable resolutions in this case:

I think most people who would expect the first behavior, but if you consider the fact that the smoothing rate limiter also uses the warm up period to cool down, it seems undefined how the rate limiter should behave when it can freely warm up and cool down after no time - is it always warm or always cold? Halfway in-between?

CRogers commented 2 years ago

(Apologies for the linked PR spam on this issue, it came from some of our updating infra when someone added this issue to a changelog for one of our core libraries - I've now fixed the changelog to not link here in a way github understands)

natdempk commented 2 months ago

Just adding to this, would love to see this issue fixed as this is basically just a straight up bug in Guava's RateLimiter functionality.

If you write straightfoward code like:

    RateLimiter rateLimiter = RateLimiter.create(
      5.0, // permitsPerSecond
      0, // warmupPeriod
      TimeUnit.SECONDS // unit
    );

    for (int i = 0; i < 10; i++) {
      double acquire = rateLimiter.acquire(5);
      LOG.info("Acquired 5 permits in {} seconds", i, acquire);
    }

You get a RateLimiter that just allows everything through.

2024-08-14 11:43:56.529 [main] INFO  Scratch - Acquired 5 permits 0 in 0.0 seconds
2024-08-14 11:43:56.531 [main] INFO  Scratch - Acquired 5 permits 1 in 0.0 seconds
2024-08-14 11:43:56.531 [main] INFO  Scratch - Acquired 5 permits 2 in 0.0 seconds
2024-08-14 11:43:56.531 [main] INFO  Scratch - Acquired 5 permits 3 in 0.0 seconds
2024-08-14 11:43:56.531 [main] INFO  Scratch - Acquired 5 permits 4 in 0.0 seconds
2024-08-14 11:43:56.531 [main] INFO  Scratch - Acquired 5 permits 5 in 0.0 seconds
2024-08-14 11:43:56.531 [main] INFO  Scratch - Acquired 5 permits 6 in 0.0 seconds
2024-08-14 11:43:56.531 [main] INFO  Scratch - Acquired 5 permits 7 in 0.0 seconds
2024-08-14 11:43:56.531 [main] INFO  Scratch - Acquired 5 permits 8 in 0.0 seconds
2024-08-14 11:43:56.531 [main] INFO  Scratch - Acquired 5 permits 9 in 0.0 seconds

There's no reason to expect this given that 0 is explicitly allowed as input for warmupPeriod and the behavior is unreasonable and unexpected as it just completely violates the passed in rate forever. Setting this to a value like 1 works correctly. Can this please get fixed so that 0 ignores warmup and sets max rate immediately correctly?

This is kind of similar to https://github.com/google/guava/issues/6475 as well.

natdempk commented 2 months ago

@cpovirk saw you are assigned -- curious if you plan to tackle this at some point, and also if this should get a type=defect given this is a straightforward bug?

cpovirk commented 2 months ago

Yes, this does sound like a bug. (I don't think we've made a consistent effort to apply labels for anything except priorities.)

RateLimiter is something that we've never really had time for after its initial development. If there were just one issue, we might prioritize it, but there may be multiple problems (though the other warmup one might well be related), and then implementation is complicated enough that someone would need to take the time to get up to speed. And that's all on top of the feature requests that have piled up.

We've discussed steering users toward a different library for rate limiting. The two that came up in https://github.com/google/guava/issues/5612 and in a separate internal discussion were resilience4j and Failsafe. (I think I heard that the latter is a bit less framework-y, for better or for worse.) But we haven't come up to speed on that, either :(

I don't see much change in the future, sadly.

natdempk commented 2 months ago

No worries that makes sense. Thanks @cpovirk appreciate the context and prompt reply.