getsentry / sentry-java

A Sentry SDK for Java, Android and other JVM languages.
https://docs.sentry.io/
MIT License
1.16k stars 435 forks source link

Unnecessary Blocking in UncaughtExceptionHandlerIntegration When RateLimiter Is Active #3857

Closed kkrawczy closed 14 hours ago

kkrawczy commented 2 weeks ago

Integration

sentry

Java Version

19

Version

7.16.0

Steps to Reproduce

  1. Configure your DSN with a low limit, 1.
  2. Run a script that triggers many uncaughtException and observe how slow the handling gets. This line in the console can confirm slowdown: Timed out waiting to flush event to disk before crashing. Event: %s Simple script:

    public class SentryTest {
    public static void main(String[] args) throws InterruptedException {
        Sentry.init(options -> {
            options.setDsn("XYZ");
            options.setEnvironment("XYZ");
            options.setDebug(true);
        });
        for(int i = 0; i < 100; i++) {
            System.out.println("Iteration: " + i);
            Thread workerThread = new Thread(() -> {
                Thread thread = Thread.currentThread();
                thread.getUncaughtExceptionHandler().uncaughtException(thread, new Exception("This is a test exception"));
            });
    
            long startTime = System.nanoTime();
            workerThread.start();
            workerThread.join();
    
            long durationInMillis = (System.nanoTime() - startTime) / 1_000_000;
    
            if (durationInMillis > 1000) {
                throw new IllegalStateException("Sentry blocked the thread");
            }
        }
    }
    }

Expected Result

Don't block the thread, which could potentially slow down different parts of the user app.

Most probably, the latch used by BlockingFlushHint should be properly released in RateLimiter like you already do for other hints (markHintWhenSendingFailed)

Actual Result

The thread blocks for 15 seconds, blocking other parts of the user application.

The workaround could be to reduce flushTimeoutMillis.

adinauer commented 1 week ago

Thanks for opening this issue @kkrawczy, we'll look into it and update here once we did, can't say when we'll get to it yet.

adinauer commented 1 week ago

Thanks for the repro and your investigations @kkrawczy ! I've just opened a PR to fix this which gets rid of the blocking. The first captured event takes a bit of time due to initializing the hostname cache. If you don't need the hostname and want to make it faster, you could set options.setAttachServerName(false); in your Sentry.init block.

adinauer commented 14 hours ago

We have just released 8.0.0-beta.3 which includes a fix for this issue. If you decide to give it a try, feedback would be great.