mizosoft / methanol

⚗️ Lightweight HTTP extensions for Java
https://mizosoft.github.io/methanol
MIT License
241 stars 12 forks source link

headersTimeout and connectTimeout together create a resource leak #59

Closed rymsha closed 2 years ago

rymsha commented 2 years ago

When connectTimeout headersTimeout are used together

        final HttpRequest request = HttpRequest.newBuilder().uri( URI.create( "https://example.com/" ) ).build();
        final var client = Methanol.newBuilder()
            .connectTimeout( Duration.of( 20000, ChronoUnit.MILLIS ) )
            .headersTimeout( Duration.of( 25000, ChronoUnit.MILLIS ) )
            .build();
        client.send( request, HttpResponse.BodyHandlers.discarding() );

        for ( int i = 0; i < 10; i++ ) {
            System.gc();
            Thread.sleep( 10000 );
        }

HttpClient is never garbage collected: image

I guess a simple workaround is not to use connectTimeout when headersTimeout is set. But I still curious if it can be improved.

Tested on JDK 11 latest.

rymsha commented 2 years ago

BTW, do you think this is a correct diagram?

|---connectTimeout---|---requestTimeout---|
|---------------headersTimeout------------|

Because it looks like (according to this issue https://bugs.openjdk.org/browse/JDK-8258397) requestTimeout actually canceled when headers are received.

mizosoft commented 2 years ago

Hi @rymsha. Does the leak happen when headersTimeout is used without connectTimeout? Or they must be used together? From the screenshot you've provided it appears that HeadersTimeoutInterceptor retains the chain somehow (which has a strong reference to the client). A first guess would be that the system-wide scheduler used for timeouts doesn't lose references to the timeout task when it's cancelled, and the timeout task has an indirect reference to the client.

BTW, do you think this is a correct diagram?
|---connectTimeout---|---requestTimeout---|
|---------------headersTimeout------------|

Because it looks like (according to this issue https://bugs.openjdk.org/browse/JDK-8258397) requestTimeout actually canceled when headers are received.

Hmm, I actually always expected that request timeouts cover receiving the response body. If that's not the case, then there's no functional difference between request & headers timeouts. That'd make the diagram look like this:

|---connectTimeout---|
|---------------requestTimeout-------------|
|---------------headersTimeout------------|

I can verify this is actually the case (headersTimeout & requestTimeout are equivalent). I believe it's somewhat counter-intuitive. There's also an issue to make request timeouts cover receiving response bodies.

rymsha commented 2 years ago

Only when used together. Comment out either connectTimeout or headersTimeout (or both 😀) and the leak disappears.

mizosoft commented 2 years ago

After investing this further, I believe there's no memory leak. However, headers timeout tasks may keep references to HTTP client resources until the timeout is evaluated (regardless of whether the response completed or not). This is because it uses the system-wide scheduler maintained by CompletableFuture, which doesn't make it possible to remove references to scheduled tasks when they're cancelled. If you want to avoid this, you can use a custom scheduler:

var scheduler = new ScheduledThreadPoolExecutor(1);
scheduler.setRemoveOnCancelPolicy(true); // Lose references to timeout tasks when they're cancelled.

var client = Methanol.newBuilder()
    .headersTimeout(Duration.ofSeconds(25), scheduler)
    .build();

BTW the screenshot you've provided shows the references path: ConnectTimerEvent ~> BodyHandler -> InterceptorChain -> HttpClient. The headers timeout interceptor used to reference the chain inside its BodyHandler's lambda to access the original BodyHandler (this is no more the case due to an unrelated change. ConnectTimerEvent is cleared by the HTTP client when either the socket connects or the timeout evaluates, so there's not memory leak here.