jetty / jetty.project

Eclipse Jetty® - Web Container & Clients - supports HTTP/2, HTTP/1.1, HTTP/1.0, websocket, servlets, and more
https://eclipse.dev/jetty
Other
3.84k stars 1.91k forks source link

jetty http2 secure client not initiating TCP connection towards a specific destination #11477

Open souravgupta6nov1992 opened 7 months ago

souravgupta6nov1992 commented 7 months ago

Jetty version(s) 11.0.11 Java version/vendor (use: java -version) 17.0.8 Spring boot version(s) 2.7.15 OS type/version Linux

we are observing that for one of the target fqdn "www.abc.com",when we are trying to send a https request, jetty is not initiating the TCP connection. whereas on the same application, jetty http2 secure client is able to establish TCP connection and send message. Following jetty logs are captured for this scenario.

{"instant":{"epochSecond":1709292837,"nanoOfSecond":282116172},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.client.HttpDestination","message":"Queued HttpRequest[POST /path1/v1/path2/path3 HTTP/1.1]@187676ab for HttpDestination[Origin@f7f2aa63[https://www.abc.com,tag=null,protocol=Protocol@198cd[proto=[h2],nego=false]]]@74f6b02e,queue=1,pool=MultiplexConnectionPool@24120c40[c=1/1/4,a=0,i=0,q=1]","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":466,"threadPriority":5,"messageId":"msgid1","messageTimestamp":"2024-03-01T11:33:57.282+0000","processId":"1"}
{"instant":{"epochSecond":1709292837,"nanoOfSecond":282152018},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.client.AbstractConnectionPool","message":"Acquiring create=true on MultiplexConnectionPool@24120c40[c=1/1/4,a=0,i=0,q=1]","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":466,"threadPriority":5,"messageId":"msgid1","messageTimestamp":"2024-03-01T11:33:57.282+0000","processId":"1"}
{"instant":{"epochSecond":1709292837,"nanoOfSecond":282175784},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.client.AbstractConnectionPool","message":"Try creating connection 1/4 with 1 pending","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":466,"threadPriority":5,"messageId":"msgid1","messageTimestamp":"2024-03-01T11:33:57.282+0000","processId":"1"}
{"instant":{"epochSecond":1709292837,"nanoOfSecond":282213052},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.client.AbstractConnectionPool","message":"Try creating(true) connection, pending/demand/supply: 1/2/5000, result=false","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":466,"threadPriority":5,"messageId":"msgid1","messageTimestamp":"2024-03-01T11:33:57.282+0000","processId":"1"}

we have also collected the jetty client dump for that fqdn

HttpDestination[Origin@f7f2aa63[https://www.abc.com,tag=null,protocol=Protocol@198cd[proto=[h2],nego=false]]]@74f6b02e,queue=1,pool=MultiplexConnectionPool@24120c40[c=1/1/4,a=0,i=0,q=1] - STARTED
|  += MultiplexConnectionPool@24120c40[c=1/1/4,a=0,i=0,q=1] - STARTED
|  |  +- @722977b5[inUse=0,size=1,max=4,closed=false]
|  |     +> entries size=1
|  |        +> MultiEntry@55afb15b{PENDING,usage=0,multiplex=0,pooled=null}
|  +> exchanges size=1
|     +> HttpExchange@4227825f{req=HttpRequest[POST /path1/v1/path2/path3/delete HTTP/1.1]@3288d69b[PENDING/null] res=HttpResponse[null 0 null]@5344717c[PENDING/null]}
sbordet commented 7 months ago

@souravgupta6nov1992 sorry but it's not clear, and you have reported only parts of what's really necessary to understand the issue.

Can you please report the full client dump at least?

The connection has been initiated by Jetty's client, as reported by the logs.

Likely you have network issues connecting to that destination. Have you tried other means of connecting (e.g. telnet or curl)? Do they succeed?

souravgupta6nov1992 commented 7 months ago

we have checked making direct curl to the destination and it connects on another instance of the same application, we are able to connect to that destination. we obeserver the below jetty logs in the successful and unsuccessful instance of the applications:

successful attempt on a different instance of same application:

{"instant":{"epochSecond":1709560126,"nanoOfSecond":589461610},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.client.AbstractConnectionPool","message":"Try creating(true) connection, pending/demand/supply: 0/2/0, result=true","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":287,"threadPriority":5}
{"instant":{"epochSecond":1709560126,"nanoOfSecond":589522450},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.util.Pool","message":"@2c4eae2e[inUse=0,size=1,max=4,closed=false] returning new reserved entry MultiEntry@1f81c114{PENDING,usage=0,multiplex=0,pooled=null}","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":287,"threadPriority":5}

unsuccessful attempt: {"instant":{"epochSecond":1709292837,"nanoOfSecond":282213052},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.client.AbstractConnectionPool","message":"Try creating(true) connection, pending/demand/supply: 1/2/5000, result=false","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":466,"threadPriority":5}

As we could see here, the result is false for unsuccessful and true for successful. On looking into the jetty lib code at the below snippet: AbstractConnectionPool.tryCreate(boolean)

            int pending = this.pending.get();
            int supply = pending * multiplexed;
            int demand = destination.getQueuedRequestCount() + (create ? 1 : 0);

            boolean tryCreate = isMaximizeConnections() || supply < demand;

            if (LOG.isDebugEnabled())
                LOG.debug("Try creating({}) connection, pending/demand/supply: {}/{}/{}, result={}", create, pending, demand, supply, tryCreate);

            if (!tryCreate)
                return;

we can see that here our supply > demand in the unsuccessful case. also we are assuming that isMaximizeConnections() returns false pending/demand/supply: 1/2/5000 Can you please indicate what could lead to a high supply like this.

Below is our code snippet to create jetty client with https:

    SslContextFactory sslContextFactory = new SslContextFactory.Client(true);
        ClientConnector clientConnector = new ClientConnector() {
            protected void configure(SelectableChannel selectable) throws IOException {
                super.configure(selectable);
                if (selectable instanceof NetworkChannel) {
                    NetworkChannel channel = (NetworkChannel)selectable;
                    channel.setOption(java.net.StandardSocketOptions.SO_KEEPALIVE,
                            true);
                    // Set keepalive parameters only if it is enabled

                        channel.setOption(jdk.net.ExtendedSocketOptions.TCP_KEEPIDLE,
                                180);
                        channel.setOption(jdk.net.ExtendedSocketOptions.TCP_KEEPINTERVAL,
                                Integer.parseInt(1);
                        channel.setOption(jdk.net.ExtendedSocketOptions.TCP_KEEPCOUNT,
                                9);

                }
            }
        };
        clientConnector.setSslContextFactory((SslContextFactory.Client) sslContextFactory);
        sslContextFactory.setIncludeCipherSuites(getCiphers());
        sslContextFactory.setEndpointIdentificationAlgorithm(null);
        try {
            sslContextFactory.setSslContext(getSSLContext());
        } catch (Exception e) {

        }
        HTTP2Client http2Client = new HTTP2Client(clientConnector);
        // HTTP2Client http2Client = new HTTP2Client();
        http2Client.setMaxConcurrentPushedStreams(1000);
        org.eclipse.jetty.http2.client.http.HttpClientTransportOverHTTP2 transport = new org.eclipse.jetty.http2.client.http.HttpClientTransportOverHTTP2(
                http2Client);
        transport.setUseALPN(true);
        org.eclipse.jetty.client.HttpClient httpClient = new org.eclipse.jetty.client.HttpClient(
                transport) {
            @Override
            protected void doStart() throws Exception {
                super.doStart();
            }
        };

        // Adding listener for jetty client metrics
        httpClient.getRequestListeners().add(new JettySSLClientRequestMetrics());
        httpClient.setIdleTimeout(720000);
        httpClient.setMaxRequestsQueuedPerDestination(5000);
        httpClient.setMaxConnectionsPerDestination(4);
        httpClient.setUserAgentField(null);
        httpClient.setRemoveIdleDestinations(true);
        setHttpsclient(httpClient);
        return httpClient;
sbordet commented 7 months ago

Again you have only reported a small fraction of the logs necessary to understand what's going on. Please report the full logs.

You are concentrating on the wrong part of the code, supply=5000 is irrelevant. Did you configure the server with 5000 concurrent streams? If so, that's typically a bad idea because the number is too large for a single connection.

Playing with the KEEP_ALIVE settings is typically a bad idea, I'd remove that part. An idle timeout of 720 seconds is typically way too large, I would use the default.

If you cannot connect (but there is no evidence about that), you likely have network issues.

Please post the full DEBUG logs and we'll be able to help you more.

souravgupta6nov1992 commented 7 months ago

Hi @sbordet ,

Please find below all the logs from jetty i could collect for this request:

{"instant":{"epochSecond":1709292837,"nanoOfSecond":281853018},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.reactive.client.internal.AbstractSinglePublisher","message":"PublisherContent@2b41e724 subscription from PublisherRequestContent@3923ced4","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":466,"threadPriority":5}
{"instant":{"epochSecond":1709292837,"nanoOfSecond":281888850},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.client.util.AsyncRequestContent","message":"Content offer succeeded, producing false for AsyncRequestContent@6a2f3088[demand=0,stalled=false,chunks=1]","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":466,"threadPriority":5}
{"instant":{"epochSecond":1709292837,"nanoOfSecond":281935494},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.reactive.client.internal.AbstractSinglePublisher","message":"ResponseListenerProcessor@3caacedd[Reactive[HttpRequest[POST /path1/v1/path2 HTTP/1.1]@187676ab]] subscription from reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain@3f845e1f","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":466,"threadPriority":5}
{"instant":{"epochSecond":1709292837,"nanoOfSecond":281967912},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.reactive.client.internal.ResponseListenerProcessor","message":"sending request Reactive[HttpRequest[POST /path1/v1/path2 HTTP/1.1]@187676ab] from ResponseListenerProcessor@3caacedd[Reactive[HttpRequest[POST /path1/v1/path2 HTTP/1.1]@187676ab]]","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":466,"threadPriority":5}
{"instant":{"epochSecond":1709292837,"nanoOfSecond":282016192},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.client.HttpClient","message":"Resolved HttpDestination[Origin@f7f2aa63[https://www.abc.com,tag=null,protocol=Protocol@198cd[proto=[h2],nego=false]]]@74f6b02e,queue=0,pool=MultiplexConnectionPool@24120c40[c=1/1/4,a=0,i=0,q=0] for HttpRequest[POST /path1/v1/path2 HTTP/1.1]@187676ab","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":466,"threadPriority":5}
{"instant":{"epochSecond":1709292837,"nanoOfSecond":282067856},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.client.HttpConversation","message":"Exchanges in conversation 1, override=null, listeners=[org.eclipse.jetty.client.HttpRequest$8@468a7d6b, org.eclipse.jetty.client.HttpRequest$8@6d62983f, org.eclipse.jetty.client.HttpRequest$10@700b472a, org.eclipse.jetty.client.HttpRequest$13@38ef5f06, org.eclipse.jetty.client.HttpRequest$14@2d9b0c55, org.eclipse.jetty.client.HttpRequest$15@5441dab9, org.eclipse.jetty.client.HttpRequest$16@7e59e50c, ResponseListenerProcessor@3caacedd[Reactive[HttpRequest[POST /path1/v1/path2 HTTP/1.1]@187676ab]]]","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":466,"threadPriority":5}

{"instant":{"epochSecond":1709292837,"nanoOfSecond":282116172},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.client.HttpDestination","message":"Queued HttpRequest[POST /thread-1/v1/path2/path3 HTTP/1.1]@187676ab for HttpDestination[Origin@f7f2aa63[https://www.abc.com,tag=null,protocol=Protocol@198cd[proto=[h2],nego=false]]]@74f6b02e,queue=1,pool=MultiplexConnectionPool@24120c40[c=1/1/4,a=0,i=0,q=1]","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":466,"threadPriority":5,"messageId":"msgid1","messageTimestamp":"2024-03-01T11:33:57.282+0000","processId":"1"}
{"instant":{"epochSecond":1709292837,"nanoOfSecond":282152018},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.client.AbstractConnectionPool","message":"Acquiring create=true on MultiplexConnectionPool@24120c40[c=1/1/4,a=0,i=0,q=1]","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":466,"threadPriority":5,"messageId":"msgid1","messageTimestamp":"2024-03-01T11:33:57.282+0000","processId":"1"}
{"instant":{"epochSecond":1709292837,"nanoOfSecond":282175784},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.client.AbstractConnectionPool","message":"Try creating connection 1/4 with 1 pending","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":466,"threadPriority":5,"messageId":"msgid1","messageTimestamp":"2024-03-01T11:33:57.282+0000","processId":"1"}
{"instant":{"epochSecond":1709292837,"nanoOfSecond":282213052},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.client.AbstractConnectionPool","message":"Try creating(true) connection, pending/demand/supply: 1/2/5000, result=false","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":466,"threadPriority":5,"messageId":"msgid1","messageTimestamp":"2024-03-01T11:33:57.282+0000","processId":"1"}

{"instant":{"epochSecond":1709292840,"nanoOfSecond":281950384},"thread":"parallel-6","level":"DEBUG","loggerName":"org.eclipse.jetty.client.HttpExchange","message":"Failed HttpExchange@6b63f822{req=HttpRequest[POST /path1/v1/path2 HTTP/1.1]@187676ab[COMPLETED/java.util.concurrent.CancellationException] res=HttpResponse[null 0 null]@2c7dbb2a[COMPLETED/java.util.concurrent.CancellationException]}: req=true/rsp=true java.util.concurrent.CancellationException","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":344,"threadPriority":5}

{"instant":{"epochSecond":1709292840,"nanoOfSecond":282381895},"thread":"parallel-6","level":"DEBUG","loggerName":"org.eclipse.jetty.client.HttpExchange","message":"Aborting while queued HttpExchange@6b63f822{req=HttpRequest[POST /path1/v1/path2 HTTP/1.1]@187676ab[COMPLETED/java.util.concurrent.CancellationException] res=HttpResponse[null 0 null]@2c7dbb2a[COMPLETED/java.util.concurrent.CancellationException]}: java.util.concurrent.CancellationException","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":344}
sbordet commented 7 months ago

@souravgupta6nov1992 the 12 lines of logs you posted are not even close to full DEBUG logs. Sorry, but without information we cannot help you.

souravgupta6nov1992 commented 7 months ago

i am running it in kubernetes env. i am enabling debug logs for "org.eclipse.jetty" package with loglevel as DEBUG and for success case i am getting many logs but for the reported failure case i am getting only this much logs.(i am collecting logs based on the same threadid where jetty http2secureclient had send the message to this destination). Do i need to collect logs of other threads also? is there any other log level(trace?) we can set for this package? do i need to set it for any other package. please advise how i can get more logs in this scenario? sample success case log with the same jetty http2 secure client instance on the same application is pasted below. But for failure case i am getting only that 12 lines of log.


{"instant":{"epochSecond":1709624352,"nanoOfSecond":672875867},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.reactive.client.internal.AbstractSinglePublisher","message":"PublisherContent@69daa35b subscription from PublisherRequestContent@40a4c36d","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":275,}
{"instant":{"epochSecond":1709624352,"nanoOfSecond":672970391},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.client.util.AsyncRequestContent","message":"Content offer succeeded, producing false for AsyncRequestContent@7918f67f[demand=0,stalled=false,chunks=1]","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":275}
{"instant":{"epochSecond":1709624352,"nanoOfSecond":673109399},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.reactive.client.internal.AbstractSinglePublisher","message":"ResponseListenerProcessor@589b1315[Reactive[HttpRequest[PUT /path1/path2/path3 HTTP/1.1]@64e7b326]] subscription from reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain@2beab7c","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":275}
{"instant":{"epochSecond":1709624352,"nanoOfSecond":673192405},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.reactive.client.internal.ResponseListenerProcessor","message":"sending request Reactive[HttpRequest[PUT /path1/path2/path3 HTTP/1.1]@64e7b326] from ResponseListenerProcessor@589b1315[Reactive[HttpRequest[PUT /path1/path2/path3 HTTP/1.1]@64e7b326]]","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":275}
{"instant":{"epochSecond":1709624352,"nanoOfSecond":673311115},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.util.component.ContainerLifeCycle","message":"HttpClient@49a26d19{STARTED} added {HttpDestination[Origin@69a2af03[https://www.def.com:8443,tag=null,protocol=Protocol@198cd[proto=[h2],nego=false]]]@3cab1e5a,queue=0,pool=null,MANAGED}","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":275}
{"instant":{"epochSecond":1709624352,"nanoOfSecond":673436687},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.util.component.AbstractLifeCycle","message":"STARTING HttpDestination[Origin@69a2af03[https://www.def.com:8443,tag=null,protocol=Protocol@198cd[proto=[h2],nego=false]]]@3cab1e5a,queue=0,pool=null","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":275}
{"instant":{"epochSecond":1709624352,"nanoOfSecond":673521170},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.util.component.ContainerLifeCycle","message":"MultiplexConnectionPool@3ca8a161[c=0/0/4,a=0,i=0,q=0] added {@658f5b51[inUse=0,size=0,max=4,closed=false],POJO}","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":275}
{"instant":{"epochSecond":1709624352,"nanoOfSecond":673617885},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.util.component.ContainerLifeCycle","message":"HttpDestination[Origin@69a2af03[https://www.def.com:8443,tag=null,protocol=Protocol@198cd[proto=[h2],nego=false]]]@3cab1e5a,queue=0,pool=MultiplexConnectionPool@3ca8a161[c=0/0/4,a=0,i=0,q=0] added {MultiplexConnectionPool@3ca8a161[c=0/0/4,a=0,i=0,q=0],MANAGED}","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":275}
{"instant":{"epochSecond":1709624352,"nanoOfSecond":673686952},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.util.component.AbstractLifeCycle","message":"STARTING MultiplexConnectionPool@3ca8a161[c=0/0/4,a=0,i=0,q=0]","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":275}
{"instant":{"epochSecond":1709624352,"nanoOfSecond":673744366},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.util.component.AbstractLifeCycle","message":"STARTED @689833ms MultiplexConnectionPool@3ca8a161[c=0/0/4,a=0,i=0,q=0]","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":275}
{"instant":{"epochSecond":1709624352,"nanoOfSecond":673790306},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.util.component.AbstractLifeCycle","message":"STARTED @689833ms HttpDestination[Origin@69a2af03[https://www.def.com:8443,tag=null,protocol=Protocol@198cd[proto=[h2],nego=false]]]@3cab1e5a,queue=0,pool=MultiplexConnectionPool@3ca8a161[c=0/0/4,a=0,i=0,q=0]","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":275}
{"instant":{"epochSecond":1709624352,"nanoOfSecond":673845388},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.client.HttpClient","message":"Created HttpDestination[Origin@69a2af03[https://www.def.com:8443,tag=null,protocol=Protocol@198cd[proto=[h2],nego=false]]]@3cab1e5a,queue=0,pool=MultiplexConnectionPool@3ca8a161[c=0/0/4,a=0,i=0,q=0]","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":275}
{"instant":{"epochSecond":1709624352,"nanoOfSecond":673896021},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.client.HttpClient","message":"Resolved HttpDestination[Origin@69a2af03[https://www.def.com:8443,tag=null,protocol=Protocol@198cd[proto=[h2],nego=false]]]@3cab1e5a,queue=0,pool=MultiplexConnectionPool@3ca8a161[c=0/0/4,a=0,i=0,q=0] for HttpRequest[PUT /path1/path2/path3 HTTP/1.1]@64e7b326","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":275}
{"instant":{"epochSecond":1709624352,"nanoOfSecond":673977625},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.client.HttpConversation","message":"Exchanges in conversation 1, override=null, listeners=[org.eclipse.jetty.client.HttpRequest$8@76206c00, org.eclipse.jetty.client.HttpRequest$8@ffa680c, org.eclipse.jetty.client.HttpRequest$10@10b0ef71, org.eclipse.jetty.client.HttpRequest$13@6d9e7ff1, org.eclipse.jetty.client.HttpRequest$14@18ab4e2e, org.eclipse.jetty.client.HttpRequest$15@44cdef3f, org.eclipse.jetty.client.HttpRequest$16@53914bb0, ResponseListenerProcessor@589b1315[Reactive[HttpRequest[PUT /path1/path2/path3 HTTP/1.1]@64e7b326]]]","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":275}
{"instant":{"epochSecond":1709624352,"nanoOfSecond":674046349},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.client.HttpDestination","message":"Queued HttpRequest[PUT /path1/path2/path3 HTTP/1.1]@64e7b326 for HttpDestination[Origin@69a2af03[https://www.def.com:8443,tag=null,protocol=Protocol@198cd[proto=[h2],nego=false]]]@3cab1e5a,queue=1,pool=MultiplexConnectionPool@3ca8a161[c=0/0/4,a=0,i=0,q=1]","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":275}
{"instant":{"epochSecond":1709624352,"nanoOfSecond":674155797},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.client.AbstractConnectionPool","message":"Acquiring create=true on MultiplexConnectionPool@3ca8a161[c=0/0/4,a=0,i=0,q=1]","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":275}
{"instant":{"epochSecond":1709624352,"nanoOfSecond":674217251},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.client.AbstractConnectionPool","message":"Try creating connection 0/4 with 0 pending","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":275}
{"instant":{"epochSecond":1709624352,"nanoOfSecond":674272594},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.client.AbstractConnectionPool","message":"Try creating(true) connection, pending/demand/supply: 0/2/0, result=true","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":275}
{"instant":{"epochSecond":1709624352,"nanoOfSecond":674319248},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.client.AbstractConnectionPool","message":"Creating connection 0/4 at MultiEntry@4deec7c4{PENDING,usage=0,multiplex=0,pooled=null}","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":275}
{"instant":{"epochSecond":1709624352,"nanoOfSecond":674381568},"thread":"thread-1","level":"DEBUG","loggerName":"org.eclipse.jetty.util.thread.QueuedThreadPool","message":"queue org.eclipse.jetty.util.SocketAddressResolver$Async$$Lambda$1727/0x00007effa4c69120@ff63a69 startThread=0","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":275}
{"instant":{"epochSecond":1709624352,"nanoOfSecond":702773809},"thread":"@49a26d19-310","level":"DEBUG","loggerName":"org.eclipse.jetty.client.HttpSender","message":"Request headers HttpRequest[PUT /path1/path2/path3 HTTP/2.0]","threadId":310}
{"instant":{"epochSecond":1709624352,"nanoOfSecond":707593522},"thread":"@49a26d19-310","level":"DEBUG","loggerName":"org.eclipse.jetty.http2.hpack.HpackContext","message":"HdrTbl[63268899] added {D,10,x-request-id: 892f1ef5-f9c9-425a-b2db-643f32676d79,36df603d}","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":310}
{"instant":{"epochSecond":1709624352,"nanoOfSecond":707666655},"thread":"@49a26d19-310","level":"DEBUG","loggerName":"org.eclipse.jetty.http2.hpack.HpackEncoder","message":"encode LitHuffNHuffVIdx:'x-request-id: 892f1ef5-f9c9-425a-b2db-643f32676d79' to ","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":310}
sbordet commented 7 months ago

In the logs of your last comment, the destination to https://www.def.com:8443 is using HTTP/2 with no negotiation.

The DNS resolution for that destination is started, but then there are not enough log lines to tell whether it succeeded. Basically what's happening is either the DNS resolution is not working, or it is taking a long time. Apparently typical k8s problems.

There is nothing that we can do: Jetty is working fine, but your network is not. Or, you have not given us enough log lines to understand more.

You should not restrict the logs to only one thread-id, just give us all the logs, would you? DEBUG level is what's needed.

souravgupta6nov1992 commented 7 months ago

@sbordet PFA [ jettyliblogs.log ] our full logs containing all 'org.eclipse.jetty' logs 'app1.ns-1.domain.clustername' is the fqdn for unsuccessful attempts and '/app1-app1path1/v1/app1path2' is the uri path.

We collected TCP dumps for the same but we could not see DNS happening as well as no TCP SYN packet. We tried a curl inside the pod with the same fqdn and it is able to resolve the ip and successfully establsih a connection That is why we are thinking that jetty is not attempting to establish the connection.

We also tried making the request in our app directly with ip instead of fqdn but observed the same behaviour. PFA the logs for IP based request instead of fqdn. ipbasedjetty.log

souravgupta6nov1992 commented 7 months ago

@sbordet today i tried to simulate this scenario in my setup with following steps: blocked TCP SYN packet in the remote http/2 TLS server. this is to ensure TCP connection from my pod(jetty http2 tls client) to remote server always fails. with above setup i tried to send an https message to the remote server and the connection establish ment failed. first message processed by jetty http/2 client, went to establish TCP connection Try creating connection 0/4 with 0 pending 2nd message got queued and the following log came "Try creating connection 1/4 with 1 pending" Try creating(true) connection, pending/demand/supply: 1/3/5000

The above 2 logs are repeated for the 3rd and 4th message

this condition got cleared after 10 min since the idletimeout of jetty client is set to 10 min Post this, when i again sent few messages the above condition occured again

I am wondering if this condition never gets cleared then there will be no connection attempts made for this destination. Ideally it should be cleared due to some errors(timeouts or some other exceptions) but looks like in the original issue which i reported this condition never got cleared.

One condition i can think of is if idletimeout is very long(or not set), in that case the connection might get stuck in the pending state. Is there a way the idletimeout can be disabled?

Could you please provide input on this observation.

joakime commented 7 months ago

I am wondering if this condition never gets cleared then there will be no connection attempts made for this destination. Ideally it should be cleared due to some errors(timeouts or some other exceptions) but looks like in the original issue which i reported this condition never got cleared.

DNS lookup in the JVM is a blocking call that we cannot interrupt or cancel. There have been reports of the DNS lookup never returning on badly configured systems.

kdtech1001 commented 7 months ago

@joakime @sbordet In the original reported issue, the same jetty client is working fine for various different destinations. And no sign of stuck thread, hence, I am not very sure that it is due to a bad DNS config.

But even if that is the case, or some other case like SYN packet timeout, shouldn't there be a way in Jetty to come out of this situation and retry for a new connection. Like, using an async DNS Resolver with a timeout (if not already used) ?

In the above testing, the connection which is in pending state , comes out of this pending state after idleTimeout expiry. We will test by setting connectionTimeout as well.

Few more questions :

  1. Is there a way to disable idletimeout in jetty http/2 client?
  2. Moreover, the jetty http2 non TLS client we use invokes start(), but the same is not there in https client, do you suggest to invoke start() method ?
joakime commented 7 months ago

But even if that is the case, or some other case like SYN packet timeout, shouldn't there be a way in Jetty to come out of this situation and retry for a new connection.

Per the observation of the logs by @sbordet at https://github.com/jetty/jetty.project/issues/11477#issuecomment-1979675926 The connection does not appear to even have been attempted, but DNS was started, and no evidence in the abbreviated logs that the DNS succeeded. If there was no connection attempt, then there would obviously be no SYN, no idle timeout, none of the post-connection features you are trying to rely on. Without DNS we have no IP to attempt to connect to. Retry has no meaning if DNS fails.

Like, using an async DNS Resolver with a timeout (if not already used) ?

There is no Async DNS Resolver in Java.

While there have been several attempts at 3rd party libs to do DNS in various ways, none have been successful.

To be clear, I'm not saying DNS is your issue, it's just that the information we have so far is pointing that way. And a DNS based issue would explain the behavior you are seeing easily. Also, our own build infrastructure is based on k8s, and we have DNS issues with our k8s nodes about once a month, we have seen this kind of DNS issue ourselves.

kdtech1001 commented 7 months ago

@joakime We have seen the same issue with IP address as well (where address is already resolved).

While browsing the code I came across the following class in Jetty which seems to be trying to do DNS resolution with a timeout ? where this is used ?

@ManagedObject("The asynchronous address resolver") public static class Async implements SocketAddressResolver {

I am seeing a call of it from httpClient.java class in start() function if (resolver == null) setSocketAddressResolver(new SocketAddressResolver.Async(getExecutor(), getScheduler(), getAddressResolutionTimeout()));

sbordet commented 6 months ago

Your logs show that there is a problem with DNS: there are attempts to create connections, but they never return or otherwise succeed, nor fail.

You can try SocketAddressResolver.Async and specify a timeout, which would at least report the timeout to your application (while the thread doing DNS will still be stuck).