eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.
Other
3.27k stars 721 forks source link

jdk_security3_0_FAILED javax/net/ssl/ServerName/SSLEngineExplorerMatchedSNI.java - SSLException: Input record too big / Tag mismatch! #15862

Open JasonFengJ9 opened 2 years ago

JasonFengJ9 commented 2 years ago

Failure link

From an internal build(ubu22s390x-rt1-1):

openjdk version "1.8.0_352"
IBM Semeru Runtime Open Edition (build 1.8.0_352-b05)
Eclipse OpenJ9 VM (build v0.35.0-release-7dd982ba7, JRE 1.8.0 Linux s390x-64-Bit Compressed References 20220908_443 (JIT enabled, AOT enabled)
OpenJ9   - 7dd982ba7
OMR      - 938f0686f
JCL      - d0e3108595 based on jdk8u352-b05)

Rerun in Grinder - Change TARGET to run only the failed test targets.

Optional info

Failure output (captured from console output)

[2022-09-09T00:22:52.627Z] variation: Mode150
[2022-09-09T00:22:52.627Z] JVM_OPTIONS:  -XX:+UseCompressedOops 

[2022-09-09T00:24:21.176Z] TEST: javax/net/ssl/ServerName/SSLEngineExplorerMatchedSNI.java

[2022-09-09T00:24:21.185Z] STDERR:
[2022-09-09T00:24:21.185Z] javax.net.ssl.SSLProtocolException: Input record too big: max = 16709 len = 22945
[2022-09-09T00:24:21.185Z]  at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:508)
[2022-09-09T00:24:21.185Z]  at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:411)
[2022-09-09T00:24:21.185Z]  at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:390)
[2022-09-09T00:24:21.185Z]  at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:626)
[2022-09-09T00:24:21.185Z]  at SSLEngineService.receive(SSLEngineService.java:167)
[2022-09-09T00:24:21.185Z]  at SSLEngineExplorerMatchedSNI.doServerSide(SSLEngineExplorerMatchedSNI.java:160)
[2022-09-09T00:24:21.185Z]  at SSLEngineExplorerMatchedSNI.startServer(SSLEngineExplorerMatchedSNI.java:369)
[2022-09-09T00:24:21.185Z]  at SSLEngineExplorerMatchedSNI.<init>(SSLEngineExplorerMatchedSNI.java:319)
[2022-09-09T00:24:21.185Z]  at SSLEngineExplorerMatchedSNI.main(SSLEngineExplorerMatchedSNI.java:300)
[2022-09-09T00:24:21.185Z]  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[2022-09-09T00:24:21.185Z]  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[2022-09-09T00:24:21.185Z]  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2022-09-09T00:24:21.185Z]  at java.lang.reflect.Method.invoke(Method.java:498)
[2022-09-09T00:24:21.185Z]  at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
[2022-09-09T00:24:21.185Z]  at java.lang.Thread.run(Thread.java:826)
[2022-09-09T00:24:21.185Z] 
[2022-09-09T00:24:21.185Z] JavaTest Message: Test threw exception: javax.net.ssl.SSLProtocolException: Input record too big: max = 16709 len = 22945

[2022-09-09T01:18:46.725Z] Test results: passed: 601; failed: 1
[2022-09-09T01:18:46.725Z] Report written to /home/jenkins/workspace/Test_openjdk8_j9_extended.openjdk_s390x_linux/aqa-tests/TKG/output_16626783491180/jdk_security3_0/report/html/report.html
[2022-09-09T01:18:46.725Z] Results written to /home/jenkins/workspace/Test_openjdk8_j9_extended.openjdk_s390x_linux/aqa-tests/TKG/output_16626783491180/jdk_security3_0/work
[2022-09-09T01:18:46.725Z] Error: Some tests failed or other problems occurred.
[2022-09-09T01:18:46.725Z] 
[2022-09-09T01:18:46.725Z] jdk_security3_0_FAILED

50x internal grinder - failed 7 / 150, failing machines: ubu22s390x-rt5-1, rhel7s390x-4-1

pshipton commented 2 years ago

Try a grinder (100x) on 0.33.1 - passed Try more grinders (200x) on 0.33.1 - passed Try grinder (200x) on 0.35 with -Djdk.nativeCrypto=false - passed Try Temurin grinder (60x) - passed on machines where it ran

Another Grinder (200x) on 0.35 with -Djdk.nativeCrypto=false - failed 4 times on rhel9s390xrt-1 (one is a "Tag mismatch!" failure) Another Grinder (200x) on 0.35 - 15 failures, on: rhel7s390x-3-5, rhel7s390x-3-6, rhel7s390x-3-9, rhel7s390x-4-2, rhel7s390x-4-3, rhel7s390x-4-6, ubu22s390x-rt1-1

In the last two grinders there was also a different failure on the same test (rhel7s390x-4-3, rhel7s390x-4-5, rhel7s390x-4-8, rhel9s390xrt-1).

11:18:16  STDOUT:
11:18:16  resize destination buffer upto 16709 bytes for BUFFER_OVERFLOW
11:18:16  Record version: TLSv1.2
11:18:16  Hello version: TLSv1.2
11:18:16  resize source buffer upto 16709 bytes for BUFFER_UNDERFLOW
11:18:16  resize destination buffer upto 16709 bytes for BUFFER_OVERFLOW
11:18:16  resize destination buffer upto 16709 bytes for BUFFER_OVERFLOW
11:18:16  resize source buffer upto 16709 bytes for BUFFER_UNDERFLOW
11:18:16  STDERR:
11:18:16  javax.net.ssl.SSLException: Tag mismatch!
11:18:16    at sun.security.ssl.Alert.createSSLException(Alert.java:133)
11:18:16    at sun.security.ssl.TransportContext.fatal(TransportContext.java:324)
11:18:16    at sun.security.ssl.TransportContext.fatal(TransportContext.java:267)
11:18:16    at sun.security.ssl.TransportContext.fatal(TransportContext.java:262)
11:18:16    at sun.security.ssl.SSLTransport.decode(SSLTransport.java:119)
11:18:16    at sun.security.ssl.SSLEngineImpl.decode(SSLEngineImpl.java:588)
11:18:16    at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:544)
11:18:16    at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:411)
11:18:16    at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:390)
11:18:16    at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:626)
11:18:16    at SSLEngineService.receive(SSLEngineService.java:167)
11:18:16    at SSLEngineExplorerMatchedSNI.doServerSide(SSLEngineExplorerMatchedSNI.java:160)
11:18:16    at SSLEngineExplorerMatchedSNI.startServer(SSLEngineExplorerMatchedSNI.java:369)
11:18:16    at SSLEngineExplorerMatchedSNI.<init>(SSLEngineExplorerMatchedSNI.java:319)
11:18:16    at SSLEngineExplorerMatchedSNI.main(SSLEngineExplorerMatchedSNI.java:300)
11:18:16    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
11:18:16    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
11:18:16    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
11:18:16    at java.lang.reflect.Method.invoke(Method.java:498)
11:18:16    at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
11:18:16    at java.lang.Thread.run(Thread.java:826)
11:18:16  Caused by: javax.crypto.AEADBadTagException: Tag mismatch!
11:18:16    at com.sun.crypto.provider.NativeGaloisCounterMode.decryptFinal(NativeGaloisCounterMode.java:575)
11:18:16    at com.sun.crypto.provider.CipherCore.finalNoPadding(CipherCore.java:1173)
11:18:16    at com.sun.crypto.provider.CipherCore.fillOutputBuffer(CipherCore.java:1110)
11:18:16    at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:998)
11:18:16    at com.sun.crypto.provider.AESCipher.engineDoFinal(AESCipher.java:491)
11:18:16    at javax.crypto.CipherSpi.bufferCrypt(CipherSpi.java:779)
11:18:16    at javax.crypto.CipherSpi.engineDoFinal(CipherSpi.java:730)
11:18:16    at javax.crypto.Cipher.doFinal(Cipher.java:2463)
11:18:16    at sun.security.ssl.SSLCipher$T13GcmReadCipherGenerator$GcmReadCipher.decrypt(SSLCipher.java:1880)
11:18:16    at sun.security.ssl.SSLEngineInputRecord.decodeInputRecord(SSLEngineInputRecord.java:240)
11:18:16    at sun.security.ssl.SSLEngineInputRecord.decode(SSLEngineInputRecord.java:197)
11:18:16    at sun.security.ssl.SSLEngineInputRecord.decode(SSLEngineInputRecord.java:160)
11:18:16    at sun.security.ssl.SSLTransport.decode(SSLTransport.java:109)
11:18:16    ... 16 more
pshipton commented 2 years ago

Subset of OpenJDK changes (e361c66..d0e3108):

11cd91f951d 8292688: Support Security properties in security.testlibrary.Proc
96bed26602e 8049228: Improve multithreaded scalability of InetAddress cache 7186258: InetAddress$Cache should replace currentTimeMillis with nanoTime for more precise and accurate
a87e397aae6 8232950: SUNPKCS11 Provider incorrectly check key length for PSS Signatures.
78e6dc24439 8039955: [TESTBUG] jdk/lambda/LambdaTranslationTest1 - java.lang.AssertionError: expected [d:1234.000000] but found [d:1234,000000]
9436b68a1e7 8201793: (ref) Reference object should not support cloning
21f65dd1289 8183107: PKCS11 regression regarding checkKeySize
9376803a7c0 8175797: (ref) Reference::enqueue method should clear the reference object before enqueuing 8178832: (ref) jdk.lang.ref.disableClearBeforeEnqueue property is ignored 8193780: (ref) Remove the undocumented "jdk.lang.ref.disableClearBeforeEnqueue" system property
76ed082a40b 8214427: probable bug in logic of ConcurrentHashMap.addCount()
96f0ca66d50 8288763: Pack200 extraction failure with invalid size
3642cf4da59 8245263: Enable TLSv1.3 by default on JDK 8u for Client roles
b4fa0fcc2ad 8071507: (ref) Clear phantom reference as soft and weak references do 8143847: Remove REF_CLEANER reference category
4db73b9fc11 8287132: Retire Runtime.runFinalizersOnExit so that it always throws UOE
bd987716b92 8194873: right ALT key hotkeys no longer work in Swing components 8155742: [Windows] robot.keyPress(KeyEvent.VK_ALT_GRAPH) throws java.lang.IllegalArgumentException in windows
612208ee2ba 8282538: PKCS11 tests fail on CentOS Stream 9
11cf113b007 8136354: [TEST_BUG] Test  java/awt/image/RescaleOp/RescaleAlphaTest.java with Bad action for script
9caefe574eb 8087283: Add support for the XML Signature here() function to the JDK XPath implementation
bb724a66da3 8260589: Crash in JfrTraceIdLoadBarrier::load(_jclass*)

grinder (200x) on xlinux 0.35 - passed another grinder (200x) on xlinux 0.35 - 1 failure (cent7x64-1-6), plus some BindException (infra) failures grinder (200x) on plinux 0.35 - passed Another Termurin grinder (200x) - passed Another 0.33.1 grinder (200x) - passed

pshipton commented 2 years ago

@joransiu this seems like a JIT issue introduced since the last release.

joransiu commented 2 years ago

@r30shah has kindly offered to look into this.

pshipton commented 2 years ago

I was able to get the same failure on xlinux, so this is not zlinux specific.

r30shah commented 2 years ago

Thanks a lot @pshipton for checking if the issue is Z specific or not. While I am checking what causes the failure (I tried running this manually on Ubuntu - 20.04, but can not reproduce in 5 runs), do you still suspect it is JIT related issue or any OpenJDK change listed in https://github.com/eclipse-openj9/openj9/issues/15862#issuecomment-1244201626 could have caused it?

pshipton commented 2 years ago

I'm not familiar with the OpenJDK code to know if one of the changes could be causing this. Maybe @alon-sh would be able to make an assessment. It feels like a JIT issue, but I could be wrong. I'm still running grinders, so far I haven't been able to reproduce outside of OpenJ9 0.35. The "same" OpenJDK code is used for Temurin, and I haven't reproduced it there yet.

pshipton commented 2 years ago

Grinders are finished (enough), no new failures on Temurin or 0.33. I'm done running grinders unless there is a suggestion for other options to try.

r30shah commented 1 year ago

I was able to extract out the test to run easily through Java command line. Uploading the test here for reference. reprduceFailure.zip

Looking at the unit test code, I do not see anything specific that stands out. Exception is thrown at [1] where the packet size exceeds the max buffer size fixed for SSLRecord.

I am able to get it reproduced with JIT disabled as well.

while [ $count -lt 100 ]; do ./jdk8u352-b05/bin/java -Xshareclasses:destroyAll -version ; ./jdk8u352-b05/bin/java  -Djdk.nativeCrypto=false -ea -esa -Xmx512m -XX:+UseCompressedOops -Xint SSLEngineExplorerMatchedSNI example.com "(.*\.)*example\.(com|org)"; count=$((count+1)); done
...
Attempting to destroy all caches in cacheDir /home/jenkins/javasharedresources/

JVMSHRC806I Compressed references persistent shared cache "sharedcc_jenkins" has been destroyed. Use option -Xnocompressedrefs if you want to destroy a non-compressed references cache.
resize destination buffer upto 16709 bytes for BUFFER_OVERFLOW
resize source buffer upto 16709 bytes for BUFFER_UNDERFLOW
Record version: TLSv1.2
Hello version: TLSv1.2
resize destination buffer upto 16709 bytes for BUFFER_OVERFLOW
resize destination buffer upto 16709 bytes for BUFFER_OVERFLOW
resize destination buffer upto 16676 bytes for BUFFER_OVERFLOW
resize source buffer upto 16709 bytes for BUFFER_UNDERFLOW
received 16367 bytes client application data
    comsumed 16405 byes network data
received 16367 bytes client application data
    comsumed 16405 byes network data
received 16367 bytes client application data
    comsumed 16405 byes network data
resize source buffer upto 16709 bytes for BUFFER_UNDERFLOW
received 16367 bytes client application data
    comsumed 16405 byes network data
received 1236 bytes client application data
    comsumed 1274 byes network data
resize destination buffer upto 16709 bytes for BUFFER_OVERFLOW
resize destination buffer upto 16676 bytes for BUFFER_OVERFLOW
received 16367 bytes client application data
    comsumed 16405 byes network data
received 16367 bytes client application data
    comsumed 16405 byes network data
received 16367 bytes client application data
    comsumed 16405 byes network data
received 16367 bytes client application data
    comsumed 16405 byes network data
received 1236 bytes client application data
    comsumed 1274 byes network data

Attempting to destroy all caches in cacheDir /home/jenkins/javasharedresources/

JVMSHRC806I Compressed references persistent shared cache "sharedcc_jenkins" has been destroyed. Use option -Xnocompressedrefs if you want to destroy a non-compressed references cache.
resize destination buffer upto 16709 bytes for BUFFER_OVERFLOW
resize source buffer upto 16709 bytes for BUFFER_UNDERFLOW
Record version: TLSv1.2
Hello version: TLSv1.2
resize destination buffer upto 16709 bytes for BUFFER_OVERFLOW
resize destination buffer upto 16709 bytes for BUFFER_OVERFLOW
resize source buffer upto 16709 bytes for BUFFER_UNDERFLOW
Exception in thread "main" javax.net.ssl.SSLProtocolException: Input record too big: max = 16709 len = 30968
    at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:508)
    at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:411)
    at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:390)
    at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:626)
    at SSLEngineService.receive(SSLEngineService.java:167)
    at SSLEngineExplorerMatchedSNI.doServerSide(SSLEngineExplorerMatchedSNI.java:160)
    at SSLEngineExplorerMatchedSNI.startServer(SSLEngineExplorerMatchedSNI.java:369)
    at SSLEngineExplorerMatchedSNI.<init>(SSLEngineExplorerMatchedSNI.java:319)
    at SSLEngineExplorerMatchedSNI.main(SSLEngineExplorerMatchedSNI.java:300)

As @pshipton noted in last comment, it does not fail with Temurin or 0.33, it might be OpenJ9 specific. Checking is it passes with 0.33 with the unit test.

[1]. https://github.com/ibmruntimes/openj9-openjdk-jdk8/blob/8e8d4d3ae262f8d653e6068f886fdae18b17e97e/jdk/src/share/classes/sun/security/ssl/SSLEngineImpl.java#L508-L510

alon-sh commented 1 year ago

I would say try -Djdk.nativeCrypto=false but that was already tried and still fails so its not the OpenSSL code.

At this point I don't have any further to add except to say that the TLS engine for both OpenJ9 and Temurin is identical.

pshipton commented 1 year ago

Both failures reproduce with -Xint -Xshareclasses:none -Xgcpolicy:optthruput or gencon.

I haven't been able to reproduce it with 0.33.1

I've created new JVMs as follows. Preliminary results indicate something changed in the openjdk code to cause these problems with OpenJ9.

I'll continue looking for suspect OpenJDK changes to try and isolate which change is causing the failures.

pshipton commented 1 year ago

https://github.com/ibmruntimes/openj9-openjdk-jdk8/compare/ca82562a8f7...fec4a2512db

Trying builds without these changes.

pshipton commented 1 year ago
pshipton commented 1 year ago

I created another build where the only change is reverting the change to sun/security/ssl/SSLContextImpl.java from 8245263: Enable TLSv1.3 by default on JDK 8u for Client roles (i.e. https://github.com/ibmruntimes/openj9-openjdk-jdk8/commit/cc2781c5a50) and this seems to fix the problem. Not sure what's wrong with our TLSv1.3 support. I wonder if something we changed to support OpenSSL could have broken it? Backing out the FIPS changes didn't fix it, so those changes are not the issue. I'll see if I can back out OpenSSL support changes to resolve the problem. Otherwise I'm not sure where to go from here. @alon-sh in case you have any ideas.

pshipton commented 1 year ago

I can duplicate "Input record too big" and "Tag mismatch!" failures on xlinux Temurin. This is not an OpenJ9 issue.

openjdk version "1.8.0_352-beta"
OpenJDK Runtime Environment (Temurin)(build 1.8.0_352-beta-202209121816-b05)
OpenJDK 64-Bit Server VM (Temurin)(build 25.352-b05, mixed mode)
resize destination buffer upto 16709 bytes for BUFFER_OVERFLOW
resize source buffer upto 16709 bytes for BUFFER_UNDERFLOW
Record version: TLSv1.2
Hello version: TLSv1.2
resize destination buffer upto 16709 bytes for BUFFER_OVERFLOW
resize destination buffer upto 16709 bytes for BUFFER_OVERFLOW
resize source buffer upto 16709 bytes for BUFFER_UNDERFLOW
Exception in thread "main" javax.net.ssl.SSLProtocolException: Input record too big: max = 16709 len = 57785
    at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:508)
    at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:411)
    at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:390)
    at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:626)
    at SSLEngineService.receive(SSLEngineService.java:167)
    at SSLEngineExplorerMatchedSNI.doServerSide(SSLEngineExplorerMatchedSNI.java:160)
    at SSLEngineExplorerMatchedSNI.startServer(SSLEngineExplorerMatchedSNI.java:369)
    at SSLEngineExplorerMatchedSNI.<init>(SSLEngineExplorerMatchedSNI.java:319)
    at SSLEngineExplorerMatchedSNI.main(SSLEngineExplorerMatchedSNI.java:300)
resize destination buffer upto 16709 bytes for BUFFER_OVERFLOW
resize source buffer upto 16709 bytes for BUFFER_UNDERFLOW
Record version: TLSv1.2
Hello version: TLSv1.2
resize destination buffer upto 16709 bytes for BUFFER_OVERFLOW
resize destination buffer upto 16709 bytes for BUFFER_OVERFLOW
resize source buffer upto 16709 bytes for BUFFER_UNDERFLOW
Exception in thread "main" javax.net.ssl.SSLException: Tag mismatch!
    at sun.security.ssl.Alert.createSSLException(Alert.java:133)
    at sun.security.ssl.TransportContext.fatal(TransportContext.java:324)
    at sun.security.ssl.TransportContext.fatal(TransportContext.java:267)
    at sun.security.ssl.TransportContext.fatal(TransportContext.java:262)
    at sun.security.ssl.SSLTransport.decode(SSLTransport.java:119)
    at sun.security.ssl.SSLEngineImpl.decode(SSLEngineImpl.java:588)
    at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:544)
    at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:411)
    at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:390)
    at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:626)
    at SSLEngineService.receive(SSLEngineService.java:167)
    at SSLEngineExplorerMatchedSNI.doServerSide(SSLEngineExplorerMatchedSNI.java:160)
    at SSLEngineExplorerMatchedSNI.startServer(SSLEngineExplorerMatchedSNI.java:369)
    at SSLEngineExplorerMatchedSNI.<init>(SSLEngineExplorerMatchedSNI.java:319)
    at SSLEngineExplorerMatchedSNI.main(SSLEngineExplorerMatchedSNI.java:300)
Caused by: javax.crypto.AEADBadTagException: Tag mismatch!
    at com.sun.crypto.provider.GaloisCounterMode.decryptFinal(GaloisCounterMode.java:620)
    at com.sun.crypto.provider.CipherCore.finalNoPadding(CipherCore.java:1116)
    at com.sun.crypto.provider.CipherCore.fillOutputBuffer(CipherCore.java:1053)
    at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:941)
    at com.sun.crypto.provider.AESCipher.engineDoFinal(AESCipher.java:491)
    at javax.crypto.CipherSpi.bufferCrypt(CipherSpi.java:779)
    at javax.crypto.CipherSpi.engineDoFinal(CipherSpi.java:730)
    at javax.crypto.Cipher.doFinal(Cipher.java:2463)
    at sun.security.ssl.SSLCipher$T13GcmReadCipherGenerator$GcmReadCipher.decrypt(SSLCipher.java:1880)
    at sun.security.ssl.SSLEngineInputRecord.decodeInputRecord(SSLEngineInputRecord.java:240)
    at sun.security.ssl.SSLEngineInputRecord.decode(SSLEngineInputRecord.java:197)
    at sun.security.ssl.SSLEngineInputRecord.decode(SSLEngineInputRecord.java:160)
    at sun.security.ssl.SSLTransport.decode(SSLTransport.java:109)
    ... 10 more