ThePrez / ServiceCommander-IBMi

Service Commander for IBM i
Apache License 2.0

JVM dump event "corruptcache". Seems to happen after IPL, once dump files are created, `sc` works fine. #214

Closed KerimG closed 1 year ago

KerimG commented 1 year ago

Our test system is shut down overnight and started again in the morning, and the first sc call of the day causes a JVM dump:

gueney@ibmi:~ $ sc
JVMDUMP039I Processing dump event "corruptcache", detail "" at 2023/04/26 11:47:18 - please wait.
JVMDUMP032I JVM requested System dump using '/home/GUENEY/core.20230426.114718.279.0001.dmp' in response to an event
Note: "Enable full CORE dump" in smit is set to FALSE and as a result there will be limited threading information in core file.
JVMDUMP010I System dump written to /home/GUENEY/core.20230426.114718.279.0001.dmp
JVMDUMP032I JVM requested Java dump using '/home/GUENEY/javacore.20230426.114718.279.0002.txt' in response to an event
JVMDUMP010I Java dump written to /home/GUENEY/javacore.20230426.114718.279.0002.txt
JVMDUMP032I JVM requested Snap dump using '/home/GUENEY/Snap.20230426.114718.279.0003.trc' in response to an event
JVMDUMP010I Snap dump written to /home/GUENEY/Snap.20230426.114718.279.0003.trc
JVMDUMP013I Processed dump event "corruptcache", detail "".

gueney@ibmi:~ $ ls -lth
total 289M
drwxr-sr-x 2 gueney 0  12K Apr 26 11:47 javasharedresources
-rw-r--r-- 1 gueney 0  75K Apr 26 11:47 Snap.20230426.114718.279.0003.trc
-rw-r--r-- 1 gueney 0 274M Apr 26 11:47 core.20230426.114718.279.0001.dmp
-rw-r--r-- 1 gueney 0  53K Apr 26 11:47 javacore.20230426.114718.279.0002.txt

To Reproduce: Execute an IPL, then call sc.

Expected behavior: sc runs without causing a dump.

Verbose output: Argh, of course. Will update with verbose output.

Additional info: I just realized I'm running an early access version of openjdk-11, but I don't see a non-EA version of it available:

gueney@ibmi:~ $ yum info openjdk*
Installed Packages
Name        : openjdk-11-ea
Arch        : ppc64
Version     : 11.0.15.10
Release     : 3
Size        : 468 M
Repo        : installed
From repo   : ibmi-base
Summary     : OpenJDK 11 with OpenJ9" (Early Access)"
URL         : https://github.com/ibmruntimes/openj9-openjdk-jdk11/
License     : GPL-2.0-or-later AND EPL-2.0
Description : OpenJDK 11 with OpenJ9" (Early Access)"
            :
            :
            : This is an early access version. It is not recommended for production use.

Available Packages
Name        : openjdk-11-jmods-ea
Arch        : ppc64
Version     : 11.0.15.10
Release     : 3
Size        : 82 M
Repo        : ibmi-base
Summary     : The JMods for OpenJDK 11
URL         : https://github.com/ibmruntimes/openj9-openjdk-jdk11/
License     : GPL-2.0-or-later AND EPL-2.0
Description : The JMods for OpenJDK 11

Name        : openjdk-11-src-ea
Arch        : ppc64
Version     : 11.0.15.10
Release     : 3
Size        : 53 M
Repo        : ibmi-base
Summary     : OpenJDK Source Bundle 11
URL         : https://github.com/ibmruntimes/openj9-openjdk-jdk11/
License     : GPL-2.0-or-later AND EPL-2.0
Description : This package contains the complete OpenJDK 11 class library source
            : code for use by IDE indexers and debuggers.

ThePrez commented 1 year ago

That ea version should be fine. I have never seen the corruptcache failure type before!

Can you send me /home/GUENEY/javacore.20230426.114718.279.0002.txt? (Privately, if need be.)

ThePrez commented 1 year ago

/home/GUENEY/core.20230426.114718.279.0001.dmp would also be very valuable.

KerimG commented 1 year ago

Hey @ThePrez, thanks. I sent you two links over Ryver. There's no urgency, since the command works just fine after the initial hiccup, but it is a little weird and I thought you might want to know. May I ask what kind of tooling you use to analyze the .dmp?

KerimG commented 1 year ago

Okay, something interesting happened. This morning, when I tried to call sc -v, that call and every subsequent call of sc worked just fine.

Yesterday, we had another issue on the system. I assumed the two were unrelated, but now I'm not so sure. We had problems with yum, which was throwing:

gueney@ibmi:~ $ yum check
error: db4 error(22) from dbenv->open: invalid argument 
error: cannot open packages index using db3 - invalid argument (22)
# and more but I don't have the remaining error anywhere

I decided to fix it by just deleting the rpm dbs and rebuilding them:

gueney@ibmi:~ $ rm -f /QOpenSys/var/lib/rpm/__db*
gueney@ibmi:~ $ rpm --rebuilddb

which fixed yum and also MAYBE the JVM problem?
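
For anyone repeating the steps above, a slightly more cautious variant is to move the `__db*` files aside rather than delete them, so they can be restored if the rebuild goes wrong. This is only a sketch: `backup_rpm_env` is a hypothetical helper name, and the backup location is an assumption, not something used in this thread.

```shell
#!/bin/sh
# Hypothetical helper: move Berkeley DB environment files (__db*) out of an
# rpm database directory into a backup directory instead of deleting them.
# Prints the number of files moved.
backup_rpm_env() {
    db_dir="$1"       # e.g. /QOpenSys/var/lib/rpm (path taken from this thread)
    backup_dir="$2"   # assumed location, pick any writable directory
    mkdir -p "$backup_dir" || return 1
    moved=0
    for f in "$db_dir"/__db*; do
        # If nothing matches, the pattern stays literal; skip it.
        [ -e "$f" ] || continue
        mv "$f" "$backup_dir"/ || return 1
        moved=$((moved + 1))
    done
    echo "$moved"
}

# Example usage (left commented out, since it modifies the system rpm database):
# backup_rpm_env /QOpenSys/var/lib/rpm "$HOME/rpm-db-backup" && rpm --rebuilddb
```

The helper leaves the Packages file and the other index files untouched, so `rpm --rebuilddb` still has the package data to rebuild from.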

Or maybe the non-verbose sc call is bugged and the verbose one isn't? I guess we'll see tomorrow.

KerimG commented 1 year ago

Okay. It looks like those yum/rpm issues were the cause of this, because this morning everything works fine again.