spring-cloud / spring-cloud-netflix

Integration with Netflix OSS components
http://cloud.spring.io/spring-cloud-netflix/
Apache License 2.0
4.88k stars 2.44k forks source link

RestTemplateEurekaHttpClient not closing HttpClient on shutdown #4103

Open BenEfrati opened 2 years ago

BenEfrati commented 2 years ago

Spring Cloud 2021.0.0 Spring Boot 2.6.8

Bug Description: RestTemplateEurekaHttpClient is not closing HttpClient on shutdown. This leads to TCP CLOSE_WAIT connections to eureka server.

EurekaClient will shutdown when an exception occurs on an http request, but not shutdown HttpClient.

This bug is related to https://github.com/spring-cloud/spring-cloud-netflix/blob/27ac3379f6b391ccd605d212512f15c439825ecb/spring-cloud-netflix-eureka-client/src/main/java/org/springframework/cloud/netflix/eureka/http/RestTemplateEurekaHttpClient.java#L203

In case of exception here: https://github.com/Netflix/eureka/blob/ed0da19ca1c049c87e3dbf75b6015c1861d5c2d0/eureka-client/src/main/java/com/netflix/discovery/shared/transport/decorator/RedirectingEurekaHttpClient.java#L96 new HttpClient will be created without closing the existing one - this causes CLOSE_WAIT connections

This supplier creates new CloseableHttpClient for every call to https://github.com/spring-cloud/spring-cloud-netflix/blob/27ac3379f6b391ccd605d212512f15c439825ecb/spring-cloud-netflix-eureka-client/src/main/java/org/springframework/cloud/netflix/eureka/http/RestTemplateTransportClientFactory.java#L103

public class DefaultEurekaClientHttpRequestFactorySupplier implements EurekaClientHttpRequestFactorySupplier {

    @Override
    public ClientHttpRequestFactory get(SSLContext sslContext, @Nullable HostnameVerifier hostnameVerifier) {
        HttpClientBuilder httpClientBuilder = HttpClients.custom();
        if (sslContext != null) {
            httpClientBuilder = httpClientBuilder.setSSLContext(sslContext);
        }
        if (hostnameVerifier != null) {
            httpClientBuilder = httpClientBuilder.setSSLHostnameVerifier(hostnameVerifier);
        }
        CloseableHttpClient httpClient = httpClientBuilder.build();
        HttpComponentsClientHttpRequestFactory requestFactory = new HttpComponentsClientHttpRequestFactory();
        requestFactory.setHttpClient(httpClient);
        return requestFactory;
    }

}

so in case of shutdown, currentEurekaClient shutdown don't closes connections: https://github.com/spring-cloud/spring-cloud-netflix/blob/27ac3379f6b391ccd605d212512f15c439825ecb/spring-cloud-netflix-eureka-client/src/main/java/org/springframework/cloud/netflix/eureka/http/RestTemplateEurekaHttpClient.java#L203

possible solution will be trying to close the HttpClient: shutdown could be

public void shutdown() {
        Optional.of(unwrapRequestFactoryIfNecessary(restTemplate.getRequestFactory()))
                .filter(HttpComponentsClientHttpRequestFactory.class::isInstance)
                .map(HttpComponentsClientHttpRequestFactory.class::cast)
                .ifPresent(requestFactory-> {
                    try {
                        requestFactory.destroy();
                    } catch (Exception e) {
                    }
                });
    }

unwrapRequestFactoryIfNecessary https://github.com/spring-projects/spring-boot/blob/47516b50c39bd6ea924a1f6720ce6d4a71088651/spring-boot-project/spring-boot/src/main/java/org/springframework/boot/web/client/RestTemplateBuilder.java#L746

spring-projects/spring-boot#31075

As you can see process open files increases over time until GC occurred image

image

We can also create EurekaClientHttpRequestFactorySupplier which return the same ClientHttpRequestFactory, but it not a stable solution since we can't control eureka client code, maybe in case of exceptions connections not closes (hence not returns to PoolingHttpClientConnectionManager) - this can lead to no available connections in pool. In that case, restart is required

Original Issue: https://github.com/spring-cloud/spring-cloud-netflix/issues/4062

BenEfrati commented 2 years ago

https://github.com/spring-projects/spring-boot/issues/31075#issuecomment-1128766582

the piece code instantiating the request factory should be in charge of closing resources properly

OlgaMaciaszek commented 2 years ago

Hello, @BenEfrati , thanks for reporting it. Looks like a bug.

sdzx3783 commented 2 years ago

Recently, I encountered this problem in the production environment. I found that the HttpClient was created many times. The HttpClient connection pool did not enable the thread for clearing expired connections. The problem was solved by sharing one HttpClient and enabling the IdleConnectionEvaluator thread. The code is as follows:


@Configuration
public class EurekaConfig {

    private final AtomicReference<CloseableHttpClient> ref = new AtomicReference<>();

    @Bean
    public EurekaClientHttpRequestFactorySupplier defaultEurekaClientHttpRequestFactorySupplier() {
        return (sslContext, hostnameVerifier) -> {
            HttpComponentsClientHttpRequestFactory requestFactory = new HttpComponentsClientHttpRequestFactory();
            HttpClientBuilder httpClientBuilder = HttpClients
                    .custom()
                    .evictExpiredConnections()
                    .evictIdleConnections(30L, TimeUnit.SECONDS);
            if (sslContext != null) {
                httpClientBuilder = httpClientBuilder.setSSLContext(sslContext);
            }
            if (hostnameVerifier != null) {
                httpClientBuilder = httpClientBuilder.setSSLHostnameVerifier(hostnameVerifier);
            }
            if (ref.get() == null) {
                ref.compareAndSet(null, httpClientBuilder.build());
            }
            requestFactory.setHttpClient(ref.get());
            return requestFactory;
        };
    }
}
`
OlgaMaciaszek commented 2 years ago

Hello @sdzx3783 - thanks for your comment. Would you like to submit a PR with a fix?

sdzx3783 commented 2 years ago

I provide a temporary solution without changing the source code

dangkhoaphung commented 1 year ago

Hi @OlgaMaciaszek , may I know which version will include the fix?

OlgaMaciaszek commented 1 year ago

@dangkhoaphung if the issue is added to an active backlog, it will be assigned a milestone. We're now prioritising tasks for the upcoming release and will circle back to non-blocking issues later on. However, feel free to create a PR, and we will definitely review it.

OlgaMaciaszek commented 7 months ago

Initial fix reverted due to https://github.com/spring-cloud/spring-cloud-netflix/issues/4275. Reopening.