Closed tam512 closed 1 year ago
@tam512 are both of these issues intermittent?
I can recreate them every time
Does the Create checkpoints on AMD Ubuntu 20.04, restore the image on Intel RHEL9, it fails with the following errors in restore.log
issue occur if both machines are intel?
Yes, the issue occur when both vms are AMD. The following is restore.log on AMD RHEL
# cat /logs/checkpoint/restore.log
Warn (criu/kerndat.c:1103): $XDG_RUNTIME_DIR not set. Cannot find location for kerndat file
Warn (criu/kerndat.c:1103): $XDG_RUNTIME_DIR not set. Cannot find location for kerndat file
126: Error (criu/sockets.c:461): sockets: Can't set 1:72 (len 4): Protocol not available
126: Error (criu/files.c:1213): Unable to open fd=15 id=0x49
Error (criu/cr-restore.c:2531): Restoring FAILED.
@tajila I'm sorry I meant to say that the issue occur when both VMs are AMD (we have only 1 Intel VM) when checkpoint on Ubuntu and restore on RHEL
There are 2 scenarios that I see checkpoints/restore working so far are:
@tam512 were you able to capture any diagnostic files? like the /opt/ol/wlp/output/defaultServer/core.20221115.004813.126.0002.dmp
?
@tajila yes I have the core files corefiles.zip
To work around the crash I recommend trying:
-Djdk.nativeDigest=false
If that is not successful
-Djdk.nativePBE=false
then -Djdk.nativeCrypto=false
Okay looking at the corefile, I believe -Djdk.nativeDigest=false
should work. Please let me know the results @tam512
Update:
We are seeing crashes in Digest, PBE and RSA
3XMTHREADINFO3 Java callstack:
4XESTACKTRACE at jdk/crypto/jniprovider/NativeCrypto.DigestComputeAndReset(Native Method)
4XESTACKTRACE at sun/security/provider/NativeDigest.engineDigest(NativeDigest.java:169)
5XESTACKTRACE (entered lock: sun/security/provider/NativeSHA@0x00000000FF6A3970, entry count: 1)
4XESTACKTRACE at sun/security/provider/NativeDigest.engineDigest(NativeDigest.java:147)
4XESTACKTRACE at java/security/MessageDigest$Delegate.engineDigest(MessageDigest.java:678(Compiled Code))
4XESTACKTRACE at java/security/MessageDigest.digest(MessageDigest.java:389(Compiled Code))
4XESTACKTRACE at java/security/MessageDigest.digest(MessageDigest.java:435)
4XESTACKTRACE at com/ibm/ws/crypto/ltpakeyutil/KeyEncryptor.<init>(KeyEncryptor.java:33)
4XESTACKTRACE at com/ibm/ws/security/token/ltpa/LTPAKeyInfoManager.prepareLTPAKeyInfo(LTPAKeyInfoManager.java:130)
5XESTACKTRACE (entered lock: com/ibm/ws/security/token/ltpa/LTPAKeyInfoManager@0x00000000FF63B140, entry count: 1)
4XESTACKTRACE at com/ibm/ws/security/token/ltpa/internal/LTPAKeyCreateTask.getPreparedLtpaKeyInfoManager(LTPAKeyCreateTask.java:53)
4XESTACKTRACE at com/ibm/ws/security/token/ltpa/internal/LTPAKeyCreateTask.createRequiredCollaborators(LTPAKeyCreateTask.java:89)
4XESTACKTRACE at com/ibm/ws/security/token/ltpa/internal/LTPAKeyCreateTask.run(LTPAKeyCreateTask.java:99)
5XESTACKTRACE (entered lock: com/ibm/ws/security/token/ltpa/internal/LTPAKeyCreateTask@0x0000000089572788, entry count: 1)
4XESTACKTRACE at com/ibm/ws/threading/internal/ExecutorServiceImpl$RunnableWrapper.run(ExecutorServiceImpl.java:245)
4XESTACKTRACE at java/util/concurrent/ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136(Compiled Code))
4XESTACKTRACE at java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
4XESTACKTRACE at java/lang/Thread.run(Thread.java:857)
3XMHEAPALLOC Heap bytes allocated since last GC cycle=2916928 (0x2C8240)
3XMTHREADINFO3 Java callstack:
4XESTACKTRACE at jdk/crypto/jniprovider/NativeCrypto.PBEDerive(Native Method)
4XESTACKTRACE at com/sun/crypto/provider/PKCS12PBECipherCore.derive(PKCS12PBECipherCore.java:122)
4XESTACKTRACE at com/sun/crypto/provider/HmacPKCS12PBECore.engineInit(HmacPKCS12PBECore.java:179)
4XESTACKTRACE at javax/crypto/Mac.chooseProvider(Mac.java:365)
5XESTACKTRACE (entered lock: java/lang/Object@0x00000000FF77AD70, entry count: 1)
4XESTACKTRACE at javax/crypto/Mac.init(Mac.java:464)
4XESTACKTRACE at sun/security/pkcs12/PKCS12KeyStore.lambda$engineLoad$2(PKCS12KeyStore.java:2205)
4XESTACKTRACE at sun/security/pkcs12/PKCS12KeyStore$$Lambda$212/0x0000000000000000.tryOnce(Bytecode PC:25)
4XESTACKTRACE at sun/security/pkcs12/PKCS12KeyStore$RetryWithZero.run(PKCS12KeyStore.java:257)
5XESTACKTRACE (entered lock: sun/security/pkcs12/PKCS12KeyStore@0x00000000FF50AE50, entry count: 1)
4XESTACKTRACE at sun/security/pkcs12/PKCS12KeyStore.engineLoad(PKCS12KeyStore.java:2202)
4XESTACKTRACE at sun/security/util/KeyStoreDelegator.engineLoad(KeyStoreDelegator.java:221)
4XESTACKTRACE at java/security/KeyStore.load(KeyStore.java:1473)
4XESTACKTRACE at com/ibm/ws/ssl/config/WSKeyStore.defaultFileExists(WSKeyStore.java:369)
4XESTACKTRACE at com/ibm/ws/ssl/config/WSKeyStore.<init>(WSKeyStore.java:266)
5XESTACKTRACE (entered lock: com/ibm/ws/ssl/internal/KeystoreConfig@0x00000000FF2F5028, entry count: 1)
4XESTACKTRACE at com/ibm/ws/ssl/internal/KeystoreConfig.updateKeystoreConfig(KeystoreConfig.java:93)
4XESTACKTRACE at com/ibm/ws/ssl/internal/KeystoreConfigurationFactory.updated(KeystoreConfigurationFactory.java:113)
4XESTACKTRACE at com/ibm/ws/config/admin/internal/ManagedServiceFactoryTracker$2.run(ManagedServiceFactoryTracker.java:267)
4XESTACKTRACE at java/util/concurrent/Executors$RunnableAdapter.call(Executors.java:539(Compiled Code))
4XESTACKTRACE at java/util/concurrent/FutureTask.run(FutureTask.java:264(Compiled Code))
4XESTACKTRACE at com/ibm/ws/config/admin/internal/UpdateQueue$Queue.run(UpdateQueue.java:66)
4XESTACKTRACE at java/util/concurrent/Executors$RunnableAdapter.call(Executors.java:539(Compiled Code))
4XESTACKTRACE at java/util/concurrent/FutureTask.run(FutureTask.java:264(Compiled Code))
4XESTACKTRACE at java/util/concurrent/ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
4XESTACKTRACE at java/util/concurrent/ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136(Compiled Code))
4XESTACKTRACE at java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
4XESTACKTRACE at java/lang/Thread.run(Thread.java:857)
3XMTHREADINFO3 Native callstack:
4XENATIVESTACK protectedBacktrace+0x12 (0x00007FEE63ABB0C2 [libj9prt29.so+0x260c2])
4XENATIVESTACK
3XMTHREADINFO3 Java callstack:
4XESTACKTRACE at jdk/crypto/jniprovider/NativeCrypto.RSAEP(Native Method)
4XESTACKTRACE at sun/security/rsa/NativeRSACore.crypt_Native(NativeRSACore.java:110)
4XESTACKTRACE at sun/security/rsa/NativeRSACore.rsa(NativeRSACore.java:81)
4XESTACKTRACE at sun/security/rsa/RSACore.rsa(RSACore.java:121)
4XESTACKTRACE at sun/security/rsa/RSASignature.engineVerify(RSASignature.java:216)
4XESTACKTRACE at java/security/Signature$Delegate.engineVerify(Signature.java:1435)
4XESTACKTRACE at java/security/Signature.verify(Signature.java:789)
4XESTACKTRACE at sun/security/x509/X509CertImpl.verify(X509CertImpl.java:443)
5XESTACKTRACE (entered lock: sun/security/x509/X509CertImpl@0x00000000FFA09EC0, entry count: 1)
4XESTACKTRACE at sun/security/x509/X509CertImpl.verify(X509CertImpl.java:382)
4XESTACKTRACE at sun/security/tools/KeyStoreUtil.signedBy(KeyStoreUtil.java:80)
4XESTACKTRACE at sun/security/tools/KeyStoreUtil.isSelfSigned(KeyStoreUtil.java:72)
4XESTACKTRACE at sun/security/pkcs12/PKCS12KeyStore.engineLoad(PKCS12KeyStore.java:2261)
5XESTACKTRACE (entered lock: sun/security/pkcs12/PKCS12KeyStore@0x00000000FF7720B0, entry count: 1)
4XESTACKTRACE at sun/security/util/KeyStoreDelegator.engineLoad(KeyStoreDelegator.java:221)
4XESTACKTRACE at java/security/KeyStore.load(KeyStore.java:1473)
4XESTACKTRACE at com/ibm/ws/ssl/config/WSKeyStore.defaultFileExists(WSKeyStore.java:369)
4XESTACKTRACE at com/ibm/ws/ssl/config/WSKeyStore.<init>(WSKeyStore.java:266)
4XESTACKTRACE at com/ibm/ws/ssl/internal/KeystoreConfig.updateKeystoreConfig(KeystoreConfig.java:93)
5XESTACKTRACE (entered lock: com/ibm/ws/ssl/internal/KeystoreConfig@0x00000000FF63D408, entry count: 1)
4XESTACKTRACE at com/ibm/ws/ssl/internal/KeystoreConfigurationFactory.updated(KeystoreConfigurationFactory.java:113)
4XESTACKTRACE at com/ibm/ws/config/admin/internal/ManagedServiceFactoryTracker$2.run(ManagedServiceFactoryTracker.java:267)
4XESTACKTRACE at java/util/concurrent/Executors$RunnableAdapter.call(Executors.java:539(Compiled Code))
4XESTACKTRACE at java/util/concurrent/FutureTask.run(FutureTask.java:264(Compiled Code))
4XESTACKTRACE at com/ibm/ws/config/admin/internal/UpdateQueue$Queue.run(UpdateQueue.java:66)
4XESTACKTRACE at java/util/concurrent/Executors$RunnableAdapter.call(Executors.java:539(Compiled Code))
4XESTACKTRACE at java/util/concurrent/FutureTask.run(FutureTask.java:264(Compiled Code))
4XESTACKTRACE at java/util/concurrent/ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
4XESTACKTRACE at java/util/concurrent/ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136(Compiled Code))
4XESTACKTRACE at java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
4XESTACKTRACE at java/lang/Thread.run(Thread.java:857)
3XMTHREADINFO3 Native callstack:
4XENATIVESTACK protectedBacktrace+0x12 (0x00007F1EA64F20C2 [libj9prt29.so+0x260c2])
4XENATIVESTACK omrsig_protect+0x2b1 (0x00007F1EA64F67B1 [libj9prt29.so+0x2a7b1])
4XENATIVESTACK omr
It does not work with -Djdk.nativeDigest=false
. Here is the core file corefiles-disable-nativeDigest.zip
# podman logs microwebapp-restore-app-nonativedigest
[AUDIT ] CWWKZ0001I: Application microwebapp started in 0.125 seconds.
[AUDIT ] CWWKC0452I: The Liberty server process resumed operation from a checkpoint in 0.163 seconds.
[AUDIT ] CWWKF0012I: The server installed the following features: [appSecurity-4.0, cdi-3.0, checkpoint-1.0, concurrent-2.0, distributedMap-1.0, expressionLanguage-4.0, jndi-1.0, json-1.0, jsonb-2.0, jsonp-2.0, jwt-1.0, microProfile-5.0, monitor-1.0, mpConfig-3.0, mpFaultTolerance-4.0, mpHealth-4.0, mpJwt-2.0, mpMetrics-4.0, mpOpenAPI-3.0, mpOpenTracing-3.0, mpRestClient-3.0, pages-3.0, restfulWS-3.0, restfulWSClient-3.0, servlet-5.0, ssl-1.0, transportSecurity-1.0].
[AUDIT ] CWWKF0011I: The defaultServer server is ready to run a smarter planet. The defaultServer server started in 0.179 seconds.
Unhandled exception
Type=Illegal instruction vmState=0x00040000
J9Generic_Signal_Number=00000048 Signal_Number=00000004 Error_Value=00000000 Signal_Code=00000002
Handler1=00007FEE624E63F0 Handler2=00007FEE63ABEA80
RDI=00007FEDCC006B30 RSI=00007FEDCC0CEDB0 RAX=0000000000000000 RBX=00007FEDCC006B30
RCX=00007FEE21DF2D00 RDX=0000000000000001 R8=000000000000005A R9=00000000078BFFFF
R10=00000000FFF83203 R11=00000000209C01AB R12=00007FEDCC0CEDB0 R13=00000000FF794FD8
R14=0000000000000040 R15=0000000000000020
RIP=00007FEE21DF2FAB GS=0000 FS=0000 RSP=00007FEE21505328
EFlags=0000000000010206 CS=0033 RBP=0000000000000040 ERR=0000000000000000
TRAPNO=0000000000000006 OLDMASK=0000000000000000 CR2=0000000000000000
xmm0 743a4794458d329b (f: 1166881408.000000, d: 7.526191e+251)
xmm1 510e527f9b05688c (f: 2600823040.000000, d: 2.876275e+82)
xmm2 1f83d9ab5be0cd19 (f: 1541459200.000000, d: 7.229008e-157)
xmm3 0303030303030303 (f: 50529028.000000, d: 3.720974e-294)
xmm4 0303030303030303 (f: 50529028.000000, d: 3.720974e-294)
xmm5 0303030303030303 (f: 50529028.000000, d: 3.720974e-294)
xmm6 0303030303030303 (f: 50529028.000000, d: 3.720974e-294)
xmm7 0405060700010203 (f: 66051.000000, d: 2.696622e-289)
xmm8 0405060700010203 (f: 66051.000000, d: 2.696622e-289)
xmm9 090e00571a94e06f (f: 445964384.000000, d: 4.652157e-265)
xmm10 1f83d9ab5be0cd19 (f: 1541459200.000000, d: 7.229008e-157)
xmm11 73d8fc363c28039d (f: 1009255296.000000, d: 1.118045e+250)
xmm12 787ae9e36472412a (f: 1685209344.000000, d: 2.274931e+272)
xmm13 45f485bbc3e243bc (f: 3286385664.000000, d: 1.016220e+29)
xmm14 fb01797e74df54d7 (f: 1960793344.000000, d: -3.248126e+284)
xmm15 94a3ace2fa590627 (f: 4200138240.000000, d: -2.992370e-209)
Module=/lib64/libcrypto.so.1.1
Module_base_address=00007FEE21C16000
Target=2_90_20221018_61 (Linux 5.14.0-70.26.1.el9_0.x86_64)
CPU=amd64 (4 logical CPUs) (0x1e04a0000 RAM)
----------- Stack Backtrace -----------
(0x00007FEE21DF2FAB [libcrypto.so.1.1+0x1dcfab])
SHA256_Update+0x12d (0x00007FEE21DF669D [libcrypto.so.1.1+0x1e069d])
PKCS12_key_gen_uni+0x231 (0x00007FEE21DB9691 [libcrypto.so.1.1+0x1a3691])
Java_jdk_crypto_jniprovider_NativeCrypto_PBEDerive+0xd4 (0x00007FEE40036C84 [libjncrypto.so+0x6c84])
ffi_call_unix64+0x52 (0x00007FEE6266759A [libj9vm29.so+0x1e659a])
ffi_call_int+0x1a2 (0x00007FEE62666732 [libj9vm29.so+0x1e5732])
_ZN32VM_BytecodeInterpreterCompressed3runEP10J9VMThread+0x142b8 (0x00007FEE62552878 [libj9vm29.so+0xd1878])
bytecodeLoopCompressed+0xad (0x00007FEE6253E5AD [libj9vm29.so+0xbd5ad])
(0x00007FEE625F5852 [libj9vm29.so+0x174852])
---------------------------------------
JVMDUMP039I Processing dump event "gpf", detail "" at 2022/11/15 19:16:50 - please wait.
JVMDUMP032I JVM requested System dump using '/opt/ol/wlp/output/defaultServer/core.20221115.191650.126.0001.dmp' in response to an event
JVMPORT030W /proc/sys/kernel/core_pattern setting "|/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h" specifies that the core dump is to be piped to an external program. Attempting to rename either core or core.42.
JVMDUMP012E Error in System dump: The core file created by child process with pid = 42 was not found. Expected to find core file with name "/opt/ol/wlp/output/defaultServer/core.42"
JVMDUMP027W The requested heapdump has not been produced because another component is holding the VM exclusive lock.
JVMDUMP032I JVM requested Java dump using '/opt/ol/wlp/output/defaultServer/javacore.20221115.191650.126.0003.txt' in response to an event
JVMDUMP010I Java dump written to /opt/ol/wlp/output/defaultServer/javacore.20221115.191650.126.0003.txt
JVMDUMP032I JVM requested Snap dump using '/opt/ol/wlp/output/defaultServer/Snap.20221115.191650.126.0004.trc' in response to an event
It also does not work with -Djdk.nativeDigest=false
and-Djdk.nativePBE=false
. Here is the core files corefiles-disable-nativeDigestandnativePBE.zip
# podman logs microwebapp-restore-app-nonativedigestpbe
[AUDIT ] CWWKZ0001I: Application microwebapp started in 0.127 seconds.
[AUDIT ] CWWKC0452I: The Liberty server process resumed operation from a checkpoint in 0.206 seconds.
[AUDIT ] CWWKF0012I: The server installed the following features: [appSecurity-4.0, cdi-3.0, checkpoint-1.0, concurrent-2.0, distributedMap-1.0, expressionLanguage-4.0, jndi-1.0, json-1.0, jsonb-2.0, jsonp-2.0, jwt-1.0, microProfile-5.0, monitor-1.0, mpConfig-3.0, mpFaultTolerance-4.0, mpHealth-4.0, mpJwt-2.0, mpMetrics-4.0, mpOpenAPI-3.0, mpOpenTracing-3.0, mpRestClient-3.0, pages-3.0, restfulWS-3.0, restfulWSClient-3.0, servlet-5.0, ssl-1.0, transportSecurity-1.0].
[AUDIT ] CWWKF0011I: The defaultServer server is ready to run a smarter planet. The defaultServer server started in 0.232 seconds.
Unhandled exception
Type=Illegal instruction vmState=0x00040000
J9Generic_Signal_Number=00000048 Signal_Number=00000004 Error_Value=00000000 Signal_Code=00000002
Handler1=00007F1EA4F1D3F0 Handler2=00007F1EA64F5A80
RDI=00007F1E10022938 RSI=00007F1E100D1320 RAX=00007F1E5CEF9208 RBX=00007F1E5CEF90E0
RCX=00007F1E100D0AD0 RDX=A9BD33CC5EA07563 R8=38DEAB6E4ECD27BD R9=A9BD33CC5EA07563
R10=00007F1E10022A30 R11=0000000000000000 R12=00007F1E100072F0 R13=00007F1E100158C0
R14=00007F1E10016CE8 R15=00007F1E10016CE0
RIP=00007F1E7C749F8F GS=0000 FS=0000 RSP=00007F1E5CEF9080
EFlags=0000000000010202 CS=0033 RBP=00007F1E5CEF9080 ERR=0000000000000000
TRAPNO=0000000000000006 OLDMASK=0000000000000000 CR2=0000000000000000
xmm0 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm1 b0f230a563bcbacb (f: 1673313024.000000, d: -6.434493e-73)
xmm2 6f99d8e509b14954 (f: 162613584.000000, d: 3.918781e+229)
xmm3 c7918ae87494a723 (f: 1955899136.000000, d: -5.829532e+36)
xmm4 5c405644a531c206 (f: 2771501568.000000, d: 2.374871e+136)
xmm5 700293629f1670ab (f: 2669048064.000000, d: 3.604893e+231)
xmm6 a3fa3c01fba43e1d (f: 4221844992.000000, d: -2.255865e-135)
xmm7 2d49865fd796ac8c (f: 3616976128.000000, d: 1.566301e-90)
xmm8 adb4c6a36df9f5cf (f: 1845097984.000000, d: -1.631856e-88)
xmm9 bddfea1be4ad449a (f: 3836560640.000000, d: -1.161042e-10)
xmm10 70e78a7da989f381 (f: 2844390400.000000, d: 7.484989e+235)
xmm11 94ceec2dd814ad1e (f: 3625233664.000000, d: -1.881173e-208)
xmm12 f36170a64893cccf (f: 1217645824.000000, d: -6.096967e+247)
xmm13 d048eb13a06d80a9 (f: 2691531008.000000, d: -5.770677e+78)
xmm14 45b4c5d850291f63 (f: 1344872320.000000, d: 6.428880e+27)
xmm15 70e78a7da989f381 (f: 2844390400.000000, d: 7.484989e+235)
Module=/lib64/libcrypto.so.1.1
Module_base_address=00007F1E7C67F000
Target=2_90_20221018_61 (Linux 5.14.0-70.26.1.el9_0.x86_64)
CPU=amd64 (4 logical CPUs) (0x1e04a0000 RAM)
----------- Stack Backtrace -----------
(0x00007F1E7C749F8F [libcrypto.so.1.1+0xcaf8f])
(0x00007F1E7C73E053 [libcrypto.so.1.1+0xbf053])
BN_mod_exp_mont+0x165 (0x00007F1E7C739475 [libcrypto.so.1.1+0xba475])
(0x00007F1E7C83E5F3 [libcrypto.so.1.1+0x1bf5f3])
Java_jdk_crypto_jniprovider_NativeCrypto_RSAEP+0x6b (0x00007F1E84006BFB [libjncrypto.so+0x4bfb])
ffi_call_unix64+0x52 (0x00007F1EA509E59A [libj9vm29.so+0x1e659a])
ffi_call_int+0x1a2 (0x00007F1EA509D732 [libj9vm29.so+0x1e5732])
_ZN32VM_BytecodeInterpreterCompressed3runEP10J9VMThread+0x142b8 (0x00007F1EA4F89878 [libj9vm29.so+0xd1878])
bytecodeLoopCompressed+0xad (0x00007F1EA4F755AD [libj9vm29.so+0xbd5ad])
(0x00007F1EA502C852 [libj9vm29.so+0x174852])
---------------------------------------
JVMDUMP039I Processing dump event "gpf", detail "" at 2022/11/15 19:53:56 - please wait.
JVMDUMP032I JVM requested System dump using '/opt/ol/wlp/output/defaultServer/core.20221115.195356.126.0001.dmp' in response to an event
JVMPORT030W /proc/sys/kernel/core_pattern setting "|/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h" specifies that the core dump is to be piped to an external program. Attempting to rename either core or core.45.
JVMDUMP012E Error in System dump: The core file created by child process with pid = 45 was not found. Expected to find core file with name "/opt/ol/wlp/output/defaultServer/core.45"
JVMDUMP027W The requested heapdump has not been produced because another component is holding the VM exclusive lock.
JVMDUMP032I JVM requested Java dump using '/opt/ol/wlp/output/defaultServer/javacore.20221115.195356.126.0003.txt' in response to an event
JVMDUMP010I Java dump written to /opt/ol/wlp/output/defaultServer/javacore.20221115.195356.126.0003.txt
JVMDUMP032I JVM requested Snap dump using '/opt/ol/wlp/output/defaultServer/Snap.20221115.195356.126.0004.trc' in response to an event
It works with -Djdk.nativeCrypto=false
# podman logs microwebapp-restore-app-nonativecrypto
[AUDIT ] CWWKZ0001I: Application microwebapp started in 0.112 seconds. [AUDIT ] CWWKC0452I: The Liberty server process resumed operation from a checkpoint in 0.162 seconds. [AUDIT ] CWWKF0012I: The server installed the following features: [appSecurity-4.0, cdi-3.0, checkpoint-1.0, concurrent-2.0, distributedMap-1.0, expressionLanguage-4.0, jndi-1.0, json-1.0, jsonb-2.0, jsonp-2.0, jwt-1.0, microProfile-5.0, monitor-1.0, mpConfig-3.0, mpFaultTolerance-4.0, mpHealth-4.0, mpJwt-2.0, mpMetrics-4.0, mpOpenAPI-3.0, mpOpenTracing-3.0, mpRestClient-3.0, pages-3.0, restfulWS-3.0, restfulWSClient-3.0, servlet-5.0, ssl-1.0, transportSecurity-1.0]. [AUDIT ] CWWKF0011I: The defaultServer server is ready to run a smarter planet. The defaultServer server started in 0.184 seconds.
Update:
To recreate this scenario, we deployed an application from the base image icr.io/appcafe/open-liberty:beta-instanton
with:
[root@vm902140 Beta2]# cat jvm.options
-verbose:gc
-Xdump:heap
-Xverbosegclog:/logs/verbosegc.log
[root@vm902140 Beta2]#
We are seeing some of the same "Permission denied" errors in restore.log
, but the application is functioning as expected.
bash-4.4$ cat restore.log
Error (criu/libnetlink.c:84): Can't send request message: Permission denied
Error (criu/libnetlink.c:84): Can't send request message: Permission denied
Error (criu/libnetlink.c:84): Can't send request message: Permission denied
Error (criu/libnetlink.c:84): Can't send request message: Permission denied
Error (criu/libnetlink.c:84): Can't send request message: Permission denied
Error (criu/libnetlink.c:84): Can't send request message: Permission denied
bash-4.4$ exit
This does not happen with the current beta and will not happen in the next beta. But it is not really fixed yet. The JVM backed out the problematic code change that caused this. But we still are waiting for the right fix from OpenJ9 here.
I also verified that I don't see this problem using InstantOn image cp.stg.icr.io/cp/olc/open-liberty@sha256:52e3cd77a365280a5fdcfa127140b45fc6afea016655c3dc98bc53f0ca46f281
and java version (I will verify again once we have a real fix)
# podman exec -it microwebapp-restore-app java --version
java 17.0.6-ea 2023-01-17
IBM Semeru Runtime Open Edition 17.0.6.0-december_22-preview_1 (build 17.0.6-ea+6)
Eclipse OpenJ9 VM 17.0.6.0-december_22-preview_1 (build criu-preview-december_22-release-6c898853679, JRE 17 Linux amd64-64-Bit Compressed References 20230117_95 (JIT enabled, AOT enabled)
OpenJ9 - 6c898853679
OMR - 6124dbc41
JCL - 220b5cb074a based on jdk-17.0.6+6)
Should this issue be closed now? I don't think the original reported issue can be reproduced on the latest releases.
I don't see the problem on latest releases. Close issue
Using InstantOn beta2 image with Java17
podman logs microwebapp-restore-app-amd-rhel9
[AUDIT ] CWWKZ0001I: Application microwebapp started in 0.133 seconds. Unhandled exception Type=Illegal instruction vmState=0x00040000 J9Generic_Signal_Number=00000048 Signal_Number=00000004 Error_Value=00000000 Signal_Code=00000002 Handler1=00007FEE7DAFC3F0 Handler2=00007FEE7F0D4A80 RDI=00007FEE24007350 RSI=00007FEE2400736C RAX=2800000000000000 RBX=00007FEE24007350 RCX=0000000000000020 RDX=0000000000000000 R8=00007FEE240073AC R9=00000000078BFFFF R10=00000000209C01AB R11=00007FEE5C0089E0 R12=00007FEE2400736C R13=00000000FF6F53D8 R14=0000000000B37F98 R15=00007FEE353FC788 RIP=00007FEE3D42AE04 GS=0000 FS=0000 RSP=00007FEE353FC3E8 EFlags=0000000000010246 CS=0033 RBP=00000000FF6F53D8 ERR=0000000000000000 TRAPNO=0000000000000006 OLDMASK=0000000000000000 CR2=0000000000000000 xmm0 98badcfe10325476 (f: 271733888.000000, d: -1.507306e-189) xmm1 0000000000000000 (f: 0.000000, d: 0.000000e+00) xmm2 00007fee7dcaa330 (f: 2110432000.000000, d: 6.949640e-310) xmm3 08090a0b0c0d0e0f (f: 202182160.000000, d: 5.924543e-270) xmm4 0000000000000000 (f: 0.000000, d: 0.000000e+00) xmm5 0000000000000000 (f: 0.000000, d: 0.000000e+00) xmm6 0000000000000000 (f: 0.000000, d: 0.000000e+00) xmm7 0000000000000028 (f: 40.000000, d: 1.976263e-322) xmm8 98badcfe10325476 (f: 271733888.000000, d: -1.507306e-189) xmm9 0000000000000000 (f: 0.000000, d: 0.000000e+00) xmm10 000000ff00000000 (f: 0.000000, d: 5.411089e-312) xmm11 0000000000000000 (f: 0.000000, d: 0.000000e+00) xmm12 0000000000000000 (f: 0.000000, d: 0.000000e+00) xmm13 0000000000000000 (f: 0.000000, d: 0.000000e+00) xmm14 0000000000000000 (f: 0.000000, d: 0.000000e+00) xmm15 0000000000000000 (f: 0.000000, d: 0.000000e+00) Module=/lib64/libcrypto.so.1.1 Module_base_address=00007FEE3D25B000 Target=2_90_20221018_61 (Linux 5.14.0-70.26.1.el9_0.x86_64) CPU=amd64 (4 logical CPUs) (0x1e04a0000 RAM) ----------- Stack Backtrace ----------- (0x00007FEE3D42AE04 [libcrypto.so.1.1+0x1cfe04]) SHA1_Final+0x70 (0x00007FEE3D42E950 [libcrypto.so.1.1+0x1d3950]) EVP_DigestFinal_ex+0x28 (0x00007FEE3D3B9808 [libcrypto.so.1.1+0x15e808]) Java_jdk_crypto_jniprovider_NativeCrypto_DigestComputeAndReset+0xc5 (0x00007FEE5C008AA5 [libjncrypto.so+0x3aa5]) ffi_call_unix64+0x52 (0x00007FEE7DC7D59A [libj9vm29.so+0x1e659a]) ffi_call_int+0x1a2 (0x00007FEE7DC7C732 [libj9vm29.so+0x1e5732]) _ZN32VM_BytecodeInterpreterCompressed3runEP10J9VMThread+0x142b8 (0x00007FEE7DB68878 [libj9vm29.so+0xd1878]) bytecodeLoopCompressed+0xad (0x00007FEE7DB545AD [libj9vm29.so+0xbd5ad]) (0x00007FEE7DC0B852 [libj9vm29.so+0x174852])
JVMDUMP039I Processing dump event "gpf", detail "" at 2022/11/15 00:48:13 - please wait. JVMDUMP032I JVM requested System dump using '/opt/ol/wlp/output/defaultServer/core.20221115.004813.126.0001.dmp' in response to an event [AUDIT ] CWWKC0452I: The Liberty server process resumed operation from a checkpoint in 0.214 seconds. [AUDIT ] CWWKF0012I: The server installed the following features: [appSecurity-4.0, cdi-3.0, checkpoint-1.0, concurrent-2.0, distributedMap-1.0, expressionLanguage-4.0, jndi-1.0, json-1.0, jsonb-2.0, jsonp-2.0, jwt-1.0, microProfile-5.0, monitor-1.0, mpConfig-3.0, mpFaultTolerance-4.0, mpHealth-4.0, mpJwt-2.0, mpMetrics-4.0, mpOpenAPI-3.0, mpOpenTracing-3.0, mpRestClient-3.0, pages-3.0, restfulWS-3.0, restfulWSClient-3.0, servlet-5.0, ssl-1.0, transportSecurity-1.0]. [AUDIT ] CWWKF0011I: The defaultServer server is ready to run a smarter planet. The defaultServer server started in 0.267 seconds. Unhandled exception Type=Illegal instruction vmState=0x00040000 J9Generic_Signal_Number=00000048 Signal_Number=00000004 Error_Value=00000000 Signal_Code=00000002 Handler1=00007FEE7DAFC3F0 Handler2=00007FEE7F0D4A80 RDI=00007FEDE8020BF0 RSI=00000000FF9BF148 RAX=0000000000000000 RBX=00007FEDE8020BF0 RCX=00007FEE3D437D00 RDX=0000000000000001 R8=00000000006FA000 R9=00000000078BFFFF R10=00000000FFF83203 R11=00000000209C01AB R12=00000000FF9BF148 R13=0000000000700DD8 R14=0000000000000040 R15=00007FEDE80074F0 RIP=00007FEE3D437FAB GS=0000 FS=0000 RSP=00007FEE3C749408 EFlags=0000000000010206 CS=0033 RBP=0000000000000040 ERR=0000000000000000 TRAPNO=0000000000000006 OLDMASK=0000000000000000 CR2=0000000000000000 xmm0 b3b69710b6cc9616 (f: 3066861056.000000, d: -1.405785e-59) xmm1 510e527f9b05688c (f: 2600823040.000000, d: 2.876275e+82) xmm2 1f83d9ab5be0cd19 (f: 1541459200.000000, d: 7.229008e-157) xmm3 427f527f7442667e (f: 1950508672.000000, d: 2.152449e+12) xmm4 72587944601d5f05 (f: 1612537600.000000, d: 6.527639e+242) xmm5 0445415f6f620263 (f: 1868694144.000000, d: 4.362171e-288) xmm6 3636363636363636 (f: 909522496.000000, d: 1.519791e-47) xmm7 0405060700010203 (f: 66051.000000, d: 2.696622e-289) xmm8 0405060700010203 (f: 66051.000000, d: 2.696622e-289) xmm9 0000000000000000 (f: 0.000000, d: 0.000000e+00) xmm10 1f83d9ab5be0cd19 (f: 1541459200.000000, d: 7.229008e-157) xmm11 0000000000000000 (f: 0.000000, d: 0.000000e+00) xmm12 0000000000000000 (f: 0.000000, d: 0.000000e+00) xmm13 0000000000000000 (f: 0.000000, d: 0.000000e+00) xmm14 0000000000000000 (f: 0.000000, d: 0.000000e+00) xmm15 0000000000000000 (f: 0.000000, d: 0.000000e+00) Module=/lib64/libcrypto.so.1.1 Module_base_address=00007FEE3D25B000 Target=2_90_20221018_61 (Linux 5.14.0-70.26.1.el9_0.x86_64) CPU=amd64 (4 logical CPUs) (0x1e04a0000 RAM) ----------- Stack Backtrace ----------- (0x00007FEE3D437FAB [libcrypto.so.1.1+0x1dcfab]) SHA256_Update+0x12d (0x00007FEE3D43B69D [libcrypto.so.1.1+0x1e069d]) Java_jdk_crypto_jniprovider_NativeCrypto_DigestUpdate+0x58 (0x00007FEE5C008988 [libjncrypto.so+0x3988]) ffi_call_unix64+0x52 (0x00007FEE7DC7D59A [libj9vm29.so+0x1e659a]) ffi_call_int+0x1a2 (0x00007FEE7DC7C732 [libj9vm29.so+0x1e5732]) _ZN32VM_BytecodeInterpreterCompressed3runEP10J9VMThread+0x142b8 (0x00007FEE7DB68878 [libj9vm29.so+0xd1878]) bytecodeLoopCompressed+0xad (0x00007FEE7DB545AD [libj9vm29.so+0xbd5ad]) (0x00007FEE7DC0B852 [libj9vm29.so+0x174852])
JVMDUMP039I Processing dump event "gpf", detail "" at 2022/11/15 00:48:13 - please wait. JVMPORT030W /proc/sys/kernel/core_pattern setting "|/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h" specifies that the core dump is to be piped to an external program. Attempting to rename either core or core.39.
JVMDUMP012E Error in System dump: The core file created by child process with pid = 39 was not found. Expected to find core file with name "/opt/ol/wlp/output/defaultServer/core.39" JVMDUMP027W The requested heapdump has not been produced because another component is holding the VM exclusive lock. JVMDUMP032I JVM requested Java dump using '/opt/ol/wlp/output/defaultServer/javacore.20221115.004813.126.0004.txt' in response to an event JVMDUMP032I JVM requested System dump using '/opt/ol/wlp/output/defaultServer/core.20221115.004813.126.0002.dmp' in response to an event JVMDUMP010I Java dump written to /opt/ol/wlp/output/defaultServer/javacore.20221115.004813.126.0004.txt JVMPORT030W /proc/sys/kernel/core_pattern setting "|/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h" specifies that the core dump is to be piped to an external program. Attempting to rename either core or core.46.