phax / phase4

phase4 - AS4 client and server for integration into existing systems. Specific support for Peppol and CEF eDelivery built-in.
Apache License 2.0
154 stars 48 forks source link

Problem with OCSP certificate revocation check in Java 17u4 and later #124

Closed Florianisme closed 1 month ago

Florianisme commented 1 year ago

Hi Philip, I'm currently encountering an issue while testing with the new 2.1.0 release. I'm using Spring Boot 3.

When I start the Access Point and send a message to customer A, the transmission is successful. While the application is still running, I send a message to customer B. The transmission fails with the following error:

com.helger.phase4.peppol.Phase4PeppolException: The configured receiver AP certificate is not valid (at 2023-05-05T09:28:22.546313412+02:00) and cannot be used for sending. Aborting. Reason: certificate is revoked at com.helger.phase4.peppol.Phase4PeppolSender._checkReceiverAPCert(Phase4PeppolSender.java:280 undefined) ~[phase4-peppol-client-2.1.0.jar!/:2.1.0] at com.helger.phase4.peppol.Phase4PeppolSender$AbstractPeppolUserMessageBuilder.finishFields(Phase4PeppolSender.java:652 undefined) ~[phase4-peppol-client-2.1.0.jar!/:2.1.0] at com.helger.phase4.peppol.Phase4PeppolSender$Builder.finishFields(Phase4PeppolSender.java:1061 undefined) ~[phase4-peppol-client-2.1.0.jar!/:2.1.0] at com.helger.phase4.sender.AbstractAS4MessageBuilder.sendMessage(AbstractAS4MessageBuilder.java:605 undefined) ~[phase4-lib-2.1.0.jar!/:2.1.0]

If however I restart the application and send a message to customer B first, it succeeds. When I try to send a message to customer A then, it fails with the same exception.

The transmission keeps failing until I restart the application. Let me know if you need additional logs. I skimmed through the code of phase4 and peppol-commons but did not find any obvious error yet.

phax commented 1 year ago

@Florianisme this is indeed weird. Can you please turn on debug logging and post what happens before? Or do you see anything weird regarding to certificate validation on the first message sending?

Florianisme commented 1 year ago

I'm running with Debug enabled all the time in our test instance. I have attached a log from the complete SMP lokup. AP_certificate_revoked.txt Note that I'm running in Test mode, so it's using the SMK.

I sadly can't see anything wrong with the initial validation. I looks like it could be an issue witht the revocation cache, did anything significantly change since the last 1.4.x release?

phax commented 1 year ago

I see, I need to improve debug logging when it comes to certificate checking. There were no significant changes in my code from 1.4.x to 2.x, but due to the Java version change from 1.8 to 11 there might have been a change there.

Anyhow, this seems to be the debug log of the second request that fails - most likely because the cached result is returned.

Can you please provide a similar debug log for the first transmission request? Thanks

phax commented 1 year ago

Existing debug log messages should already be emitted. Search e.g. for Performing certificate revocation check on certificate

Florianisme commented 1 year ago

Will add a log, which contains the initial fail

I does print that in the successful log AP_certificate_ok.txt. See that I'm querying the same Peppol Receiver ID.. :)

I actually had to switch to Java 17 because Spring 3.0 requires it. I'm also running behind a proxy if that is of any interest. Though I have set the properties http.proxyHost, http.proxy.host (for ports and https respectively)

phax commented 1 year ago

Yes, that looks pretty okay.

As a workaround please call PeppolCertificateChecker.setCacheOCSPResults(false) to disable OCSP result caching in general

Florianisme commented 1 year ago

That does seem to work for now. Though I could not reproduce the strange behavior I described initally before I made the change. I'll do some further testing next week. Thanks!

Florianisme commented 1 year ago

I have enabled the java.security.debug=certpath and javax.net.debug=all properties and this is what I found in the log: certpath: connecting to OCSP service at: http://pki-ocsp.symauth.com certpath: RevocationChecker.check() Unable to determine revocation status due to network error

So I'm guessing nothing you can do :) Sorry to have bothered you!

phax commented 1 year ago

Thanks for digging deep into this issue. So it really seems to be an issue with the outbound proxy on your side....

https://bugs.openjdk.org/browse/JDK-8132011 may be a good hint to see what is necessary to enable a Proxy in OCSP - it seems like this can only be achieved via the global system properties.

Florianisme commented 1 year ago

I am doing that already, though I might have to dig even deeper now. Did your library use CRL instead of OCSP before by any means?

I don't get why it was working perfectly fine before..

phax commented 1 year ago

I am currently only using OCSP and not CRL. Maybe some inhouse changes???

Florianisme commented 1 year ago

I'm guessing it's related to our network. Some OCSP requests succeed so my implementaion can't be that wrong.. :)

phax commented 1 year ago

I can confirm, that the OSCP lookup has some issues when using with Java 17 - it really seems to be quite indeterministic when it works and when it doesn't work.

Florianisme commented 1 year ago

That's interesting. I was fully blaming our infrastructure for the intermittent issues. I dug through the Java implementation of the OCSP check but could not find a reason why it would fail. I sadly also didn't find anything while googling yet.

The thing I noticed was that when it fails, it fails quickly (like tens of milliseconds quickly).

phax commented 1 year ago

I enabled the system property -Djava.security.debug=certpath and in case of error, this is the debug trace I see:

certpath: ---checking signature...
certpath: signature verified.
certpath: BasicChecker.updateState issuer: CN=PEPPOL ACCESS POINT TEST CA - G2, OU=FOR TEST ONLY, O=OpenPEPPOL AISBL, C=BE; subject: CN=PJP000572, OU=PEPPOL TEST AP, O="Mirai Communication Network, Inc.", C=JP; serial#: 86370016705632901727484386645577705200
certpath: RevocationChecker.check: checking cert
  SN:     40fa4298 ff6ca832 2c974987 bdb626f0
  Subject: CN=PJP000572, OU=PEPPOL TEST AP, O="Mirai Communication Network, Inc.", C=JP
  Issuer: CN=PEPPOL ACCESS POINT TEST CA - G2, OU=FOR TEST ONLY, O=OpenPEPPOL AISBL, C=BE
certpath: connecting to OCSP service at: http://pki-ocsp.symauth.com
certpath: RevocationChecker.check() Unable to determine revocation status due to network error
certpath: RevocationChecker.check() preparing to failover

A similar positive case looks like this:

certpath: -Using checker7 ... [sun.security.provider.certpath.RevocationChecker]
certpath: RevocationChecker.check: checking cert
  SN:     39f4fc3d 66ab97be 24e10f47 520ee7e7
  Subject: CN=POP000270, OU=PEPPOL PRODUCTION AP, O=BRZ GmbH, C=AT
  Issuer: CN=PEPPOL ACCESS POINT CA - G2, O=OpenPEPPOL AISBL, C=BE
certpath: connecting to OCSP service at: http://pki-ocsp.symauth.com
certpath: OCSP response status: SUCCESSFUL
certpath: OCSP response type: basic
certpath: Responder ID: byKey: 7283244128DE19362F3F88EBD4F0EAAEB366A0AF
certpath: OCSP response produced at: Tue May 16 12:30:17 UTC 2023
certpath: OCSP number of SingleResponses: 1
certpath: thisUpdate: Tue May 16 12:30:17 UTC 2023
certpath: nextUpdate: Tue May 23 12:30:17 UTC 2023
certpath: OCSP response cert #1: CN=PEPPOL ACCESS POINT CA - G2 OCSP Responder 2023-03-06T17:41:29Z, O=OpenPEPPOL AISBL, C=BE

on my local machine (Java 17.0.4.101-hotspot Win64) I can boil the error down to this exception:

certpath: connecting to OCSP service at: http://pki-ocsp.symauth.com
java.security.cert.CertPathValidatorException: Unable to determine revocation status due to network error
    at java.base/sun.security.provider.certpath.OCSP.check(OCSP.java:202)
    at java.base/sun.security.provider.certpath.RevocationChecker.checkOCSP(RevocationChecker.java:785)
    at java.base/sun.security.provider.certpath.RevocationChecker.check(RevocationChecker.java:369)
    at java.base/sun.security.provider.certpath.RevocationChecker.check(RevocationChecker.java:343)
    at java.base/sun.security.provider.certpath.SunCertPathBuilder.depthFirstSearchForward(SunCertPathBuilder.java:426)
    at java.base/sun.security.provider.certpath.SunCertPathBuilder.depthFirstSearchForward(SunCertPathBuilder.java:528)
    at java.base/sun.security.provider.certpath.SunCertPathBuilder.buildForward(SunCertPathBuilder.java:225)
    at java.base/sun.security.provider.certpath.SunCertPathBuilder.buildCertPath(SunCertPathBuilder.java:160)
    at java.base/sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:131)
    at java.base/sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:126)
    at java.base/java.security.cert.CertPathBuilder.build(CertPathBuilder.java:297)
    at com.helger.peppol.utils.CertificateRevocationChecker$AbstractRevocationCheckBuilder.lambda$build$1(CertificateRevocationChecker.java:659)
    at com.helger.peppol.utils.CertificateRevocationChecker$AbstractRevocationCheckBuilder.build(CertificateRevocationChecker.java:683)
    at com.helger.peppol.utils.PeppolCertificateChecker.checkCertificate(PeppolCertificateChecker.java:443)
    at com.helger.peppol.utils.PeppolCertificateChecker.checkPeppolAPCertificate(PeppolCertificateChecker.java:474)
    at com.helger.peppol.pub.PagePublicToolsParticipantInformation._queryParticipant(PagePublicToolsParticipantInformation.java:1003)
    at com.helger.peppol.pub.PagePublicToolsParticipantInformation.fillContent(PagePublicToolsParticipantInformation.java:1280)
    at com.helger.peppol.pub.PagePublicToolsParticipantInformation.fillContent(PagePublicToolsParticipantInformation.java:1)
    at com.helger.photon.uicore.page.AbstractWebPage.getContent(AbstractWebPage.java:162)
    at com.helger.photon.bootstrap4.uictrls.ext.BootstrapPageRenderer.getPageContent(BootstrapPageRenderer.java:133)
    at com.helger.photon.bootstrap4.uictrls.ext.BootstrapPageRenderer.getPageContent(BootstrapPageRenderer.java:160)
    at com.helger.peppol.pub.LayoutAreaContentProviderPublic.getPageContent(LayoutAreaContentProviderPublic.java:236)
    at com.helger.peppol.pub.LayoutAreaContentProviderPublic.getContent(LayoutAreaContentProviderPublic.java:299)
    at com.helger.peppol.ui.AppLayoutHTMLProvider.fillBody(AppLayoutHTMLProvider.java:68)
    at com.helger.photon.core.html.AbstractSWECHTMLProvider.fillHeadAndBody(AbstractSWECHTMLProvider.java:106)
    at com.helger.photon.core.html.AbstractHTMLProvider.createHTML(AbstractHTMLProvider.java:164)
    at com.helger.photon.app.html.PhotonHTMLHelper.createHTMLResponse(PhotonHTMLHelper.java:117)
    at com.helger.photon.core.servlet.AbstractApplicationXServletHandler.handleRequest(AbstractApplicationXServletHandler.java:102)
    at com.helger.xservlet.handler.simple.XServletHandlerToSimpleHandler.onRequest(XServletHandlerToSimpleHandler.java:238)
    at com.helger.xservlet.AbstractXServlet._invokeHandler(AbstractXServlet.java:354)
    at com.helger.xservlet.AbstractXServlet.service(AbstractXServlet.java:539)
    at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:587)
    at com.helger.xservlet.AbstractXServlet.service(AbstractXServlet.java:595)
    at org.eclipse.jetty.servlet.ServletHolder$NotAsync.service(ServletHolder.java:1410)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:764)
    at org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1665)
    at com.helger.servlet.filter.CharacterEncodingFilter.doHttpFilter(CharacterEncodingFilter.java:184)
    at com.helger.servlet.filter.AbstractHttpServletFilter.doFilter(AbstractHttpServletFilter.java:66)
    at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:202)
    at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1635)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:527)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:131)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:578)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:223)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1570)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:221)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1380)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:176)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:484)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1543)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:174)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1302)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:129)
    at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:51)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
    at org.eclipse.jetty.server.Server.handle(Server.java:563)
    at org.eclipse.jetty.server.HttpChannel.lambda$handle$0(HttpChannel.java:505)
    at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:762)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:497)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:282)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)
    at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:416)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:385)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:272)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.lambda$new$0(AdaptiveExecutionStrategy.java:140)
    at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:411)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:934)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1078)
    at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.io.EOFException
    at java.base/sun.security.util.IOUtils.readExactlyNBytes(IOUtils.java:61)
    at java.base/sun.security.provider.certpath.OCSP.getOCSPBytes(OCSP.java:282)
    at java.base/sun.security.provider.certpath.OCSP.check(OCSP.java:195)
    ... 71 more

So eventually we do need to take into consideration, that the OSCP Service at http://pki-ocsp.symauth.com has issues????

phax commented 1 year ago

Ha, here we have the source of the error: the expected length of 0x7fffffff is a bit too much: grafik The actual data read has 2277 bytes.

The expected length is the "content length" which is missing and therefore set to Integer.MAX_VALUE.

grafik

Indeed, no Content-Length header is returned:

grafik

phax commented 1 year ago

Comparing it to a successful response, the Content-Length header is contained and has the value of 2277:

grafik

So it is indeed an error at the OSCP service of DigiCert, that they are not consistently returning the Content-Length header would now be my conclusion.

phax commented 1 year ago

The reading code in Java 11.0.16 looks like this:

grafik

It deals with the "no Content-Length" situation totally different. I think I file a JDK bug

phax commented 1 year ago

The change was introduced from 17b3 to 17b4:

https://github.com/openjdk/jdk/blob/jdk-17%2B3/src/java.base/share/classes/sun/security/provider/certpath/OCSP.java#L262-L274

https://github.com/openjdk/jdk/blob/jdk-17%2B4/src/java.base/share/classes/sun/security/provider/certpath/OCSP.java#L271-L277

phax commented 1 year ago

The commit that changed it was https://github.com/openjdk/jdk/commit/f5ee356540d7aa4a7663c0d5d74f5fdb0726b426 as a result for https://bugs.openjdk.org/browse/JDK-8179503

phax commented 1 year ago

I filed a bug in the Java bug database. Internal ID 9075275 - under review.

Florianisme commented 1 year ago

Wow, nice investigation! I don't know why but my stacktraces were cut off before the interesting stuff.. Let's see what will come out of this.

phax commented 1 year ago

(I created the Stack traces from within the debugger with "evaluate expression")

nbredenbals commented 1 year ago

The server behaves correctly IMHO, as there is either a Content-Length or a Transfer-Encoding header (presumeably "chunked"), so I guess we have to wait for a JDK update here. Thanks for the analysis, followed with interest :)

phax commented 1 year ago

The bug has been accepted and you can follow under https://bugs.java.com/bugdatabase/view_bug?bug_id=JDK-8308255

Florianisme commented 1 year ago

PR for the fix is already opened https://github.com/openjdk/jdk17u-dev/pull/1361

phax commented 1 year ago

The fix is supposed to be in Java 17.0.8 which is scheduled to be released 2023-07-18 according to https://www.java.com/releases/

Florianisme commented 1 year ago

Do you know when the Docker images will be updated? I did not find a schedule for any of the more popular JRE images

phax commented 1 year ago

No idea - sorry. I think that is vendor dependent but will happen asap (I hope ;-) )

phax commented 1 year ago

So the Adoptium update 17.0.8 is out. I am closing this issue now.

phax commented 1 year ago

To me it looks like the issue is still persistent (in a way) in 17.0.8 - need to investigate

Florianisme commented 1 year ago

I'm still waiting for release of the Semeru Images so I could not proceed with my tests yet. How did you find out it's still an issue?

phax commented 1 year ago

I've update the Java version on peppol.helger.com and still getting the same errors.

Lennert-vw commented 1 year ago

Any updates on this or way around it? I'm currently using .checkReceiverAPCertificate(false) but want to use the certificateConsumer in order to grab the receiving certificateCN (will need it next year for peppol reporting).

phax commented 1 year ago

Not really, unfortunately, It seems like the Digicert OCSP responder as used by Peppol has some availabiltity problems (that we however cannot prove properly) - see also #155

Regarding the problem, that the certificate callback is not invoked when .checkReceiverAPCertificate(false) is set, I will file a new issue

phax commented 1 year ago

I spent some time debugging the issue, but I don't get any closer. To me it seems like the issue is somehow system specific. When enabling debug logging via -Djava.security.debug=certpath this is what I get in case of error:

certpath: ---checking signature...
certpath: signature verified.
certpath: BasicChecker.updateState issuer: CN=PEPPOL ACCESS POINT CA - G2, O=OpenPEPPOL AISBL, C=BE; subject: CN=PBE000076, OU=PEPPOL PRODUCTION AP, O=Billit BVBA, C=BE; serial#: 71369940398021102737504673733320946005
certpath: RevocationChecker.check: checking cert
  SN:     35b159bc 9b742bea 656f4b4d a9b4b155
  Subject: CN=PBE000076, OU=PEPPOL PRODUCTION AP, O=Billit BVBA, C=BE
  Issuer: CN=PEPPOL ACCESS POINT CA - G2, O=OpenPEPPOL AISBL, C=BE
certpath: connecting to OCSP service at: http://pki-ocsp.symauth.com
certpath: RevocationChecker.check() Unable to determine revocation status due to network error
certpath: RevocationChecker.check() preparing to failover
certpath: RevocationChecker.checkCRLs() ---checking revocation status ...
certpath: RevocationChecker.checkCRLs() possible crls.size() = 0
certpath: RevocationChecker.checkCRLs() approved crls.size() = 0

On Linux this happens quite often, on my Windows dev machine this is really hard to reproduce. Both run Java 17.0.8 Linux:

java version "17.0.8" 2023-07-18 LTS
Java(TM) SE Runtime Environment (build 17.0.8+9-LTS-211)
Java HotSpot(TM) 64-Bit Server VM (build 17.0.8+9-LTS-211, mixed mode, sharing)

Windows:

openjdk version "17.0.8" 2023-07-18
OpenJDK Runtime Environment Temurin-17.0.8+7 (build 17.0.8+7)
OpenJDK 64-Bit Server VM Temurin-17.0.8+7 (build 17.0.8+7, mixed mode, sharing)

The resulting Java exception on Linux looks like this:

sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
    at sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:146) ~[?:?]
    at sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:127) ~[?:?]
    at java.security.cert.CertPathBuilder.build(CertPathBuilder.java:297) ~[?:?]
    at com.helger.peppol.utils.CertificateRevocationChecker$AbstractRevocationCheckBuilder.lambda$build$1(CertificateRevocationChecker.java:679) ~[peppol-commons-9.0.7.jar:9.0.7]
    at com.helger.peppol.utils.CertificateRevocationChecker$AbstractRevocationCheckBuilder.build(CertificateRevocationChecker.java:703) ~[peppol-commons-9.0.7.jar:9.0.7]
    at com.helger.peppol.utils.PeppolCertificateChecker.checkCertificate(PeppolCertificateChecker.java:477) ~[peppol-commons-9.0.7.jar:9.0.7]
    at com.helger.peppol.utils.PeppolCertificateChecker.checkPeppolAPCertificate(PeppolCertificateChecker.java:515) ~[peppol-commons-9.0.7.jar:9.0.7]

The respective method in OCSP.java that creates the "unexpected network errror" is this one: grafik

phax commented 1 year ago

I really digged deep into this issue, with -Djava.security.debug=certpath.ocsp and a custom SystemLogger to debug the HTTP connection to the OCSP server, However, the version deployed on my server behaves in these detailed aspects very different compared to my local version and I can't explain it, or get the tracing data I need to figure out the HTTP error.

So I don't think this is a general problem, but a problem of my specific setup :(

Florianisme commented 1 year ago

I've built a minimal example project using your Revocation checker to try to reproduce the issue but I can't get it to fail locally at all.. I'm basically running the OCSP check in a loop it it always succeeds. I'm trying on Windows and on Linux with Java 17.0.8 (the patched version) and 17.0.6 (unpatched).

Florianisme commented 1 year ago

This was the only reason it gave me when I was able to reproduce it:

Caused by: java.security.cert.CertPathValidatorException: Unable to determine revocation status due to network error
    at java.base/sun.security.provider.certpath.PKIXMasterCertPathValidator.validate(PKIXMasterCertPathValidator.java:135)
    at java.base/sun.security.provider.certpath.PKIXCertPathValidator.validate(PKIXCertPathValidator.java:224)
    at java.base/sun.security.provider.certpath.PKIXCertPathValidator.validate(PKIXCertPathValidator.java:144)
    at java.base/sun.security.provider.certpath.PKIXCertPathValidator.engineValidate(PKIXCertPathValidator.java:83)
    at java.base/java.security.cert.CertPathValidator.validate(CertPathValidator.java:309)
    at Main2$1.run(Main2.java:54)
    ... 1 more
Caused by: java.net.ConnectException: Connection timed out: connect
    at java.base/sun.nio.ch.Net.connect0(Native Method)
    at java.base/sun.nio.ch.Net.connect(Net.java:579)
    at java.base/sun.nio.ch.Net.connect(Net.java:568)
    at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:593)
    at java.base/java.net.Socket.connect(Socket.java:633)
    at java.base/java.net.Socket.connect(Socket.java:583)
    at java.base/sun.net.NetworkClient.doConnect(NetworkClient.java:183)
    at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:533)
    at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:638)
    at java.base/sun.net.www.http.HttpClient.<init>(HttpClient.java:281)
    at java.base/sun.net.www.http.HttpClient.New(HttpClient.java:386)
    at java.base/sun.net.www.http.HttpClient.New(HttpClient.java:408)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1309)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1242)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1128)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:1057)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1665)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1589)
    at java.base/java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:529)
    at java.base/sun.security.provider.certpath.OCSP.getOCSPBytes(OCSP.java:273)
    at java.base/sun.security.provider.certpath.OCSP.check(OCSP.java:195)
    at java.base/sun.security.provider.certpath.RevocationChecker.checkOCSP(RevocationChecker.java:785)
    at java.base/sun.security.provider.certpath.RevocationChecker.check(RevocationChecker.java:369)
    at java.base/sun.security.provider.certpath.RevocationChecker.check(RevocationChecker.java:343)
    at java.base/sun.security.provider.certpath.PKIXMasterCertPathValidator.validate(PKIXMasterCertPathValidator.java:125)
    ... 6 more

Note that for my tests it was using the top branch in in OCSP:getOcspBytes of this if statement where it neither set's a connection timeout nor a read timeout: image

phax commented 1 year ago

Great thank you. Yes, the upper branch just seems to be right, as the request is only 80-90 bytes or so. Thanks.

Florianisme commented 1 year ago

~~What do you think? Is this Bug-worthy? I mean, it's an inconsistency which could (and apparently does) lead to issues when the request is smaller than the specified threshold.~~

I get why the read timeout might be set differently based on how large the request is but that should not affect the connect timeout.

Update: I have opened a bug ticket, looks like this was also introduced in https://github.com/openjdk/jdk/commit/f5ee356540d7aa4a7663c0d5d74f5fdb0726b426#diff-905a61d1dae19533a61cb5dd302071479775870df89d3a18e6db2537b007c803

Florianisme commented 1 year ago

Has been accepted https://bugs.java.com/bugdatabase/view_bug?bug_id=JDK-8315979

phax commented 1 year ago

Good catch. I never thought that the introduction of GET may also be cause of the problems. Seems like they also do trial and error in a way ;-) So going back to 17.0.3 could resolve the error.

Direct link to the bug tracker: https://bugs.openjdk.org/browse/JDK-8315979

Florianisme commented 1 year ago

Issue has been closed as a duplicate...

Florianisme commented 1 year ago

Are you able to update to Java 21 with your applications and give it another try? Apparently the discrepancy in timeout handling has been fixed there

phax commented 1 year ago

No, unfortunately not. I am more in favour of reverting to 17.0.3 for now. Or even Java 11 :D - let see if Java 17 gets some change in the next release 17.0.9 ...

stale[bot] commented 10 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

phax commented 10 months ago

The initial PR https://github.com/openjdk/jdk17u-dev/pull/1361 was never merged. Therefore the problem still exists in Java 17.0.8, 17.0.9, 17.0.10, 21.0.0 and 21.0.1 as well. Java 11.0.21 does NOT seem to be affected.

phax commented 10 months ago

Java 17b10: https://github.com/openjdk/jdk/blob/jdk-17%2B10/src/java.base/share/classes/sun/security/provider/certpath/OCSP.java#L271-L277 Java 21b2: https://github.com/openjdk/jdk/blob/jdk-21%2B2/src/java.base/share/classes/sun/security/provider/certpath/OCSP.java#L216-L222

Proposed fix in the PR: https://github.com/openjdk/jdk17u-dev/pull/1361/files#diff-905a61d1dae19533a61cb5dd302071479775870df89d3a18e6db2537b007c803R284-R286

phax commented 1 month ago

As Lets Encrypt and others are phasing out OCSP in favour of CRL, I will not pursue this issue any further (for now): https://letsencrypt.org/2024/07/23/replacing-ocsp-with-crls.html