Open tam512 opened 1 year ago
The native stack trace seems similar to
@tam512 is it reproducible? If you have the diagnostic files (system core), they will help us investigate the issue.
I suggest running something like sudo bash -c 'echo "core" > /proc/sys/kernel/core_pattern'
on your machine so that a core file is produced.
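Before reproducing the crash, it may help to confirm what the kernel will actually do with a core dump. The following is a minimal sketch, assuming a Linux host; the helper name classify_core_pattern is hypothetical and only distinguishes a piped pattern (leading "|", as used by systemd-coredump) from a plain file pattern such as "core".

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify a core_pattern value.
#   leading "|"  -> the kernel pipes the dump to a program (e.g. systemd-coredump)
#   empty string -> the value is empty or could not be read
#   anything else -> the dump is written to a file using that name pattern
classify_core_pattern() {
  case "$1" in
    "|"*) echo "piped" ;;
    "")   echo "empty" ;;
    *)    echo "file" ;;
  esac
}

# Read the current setting; fall back to an empty string if it cannot be read.
current="$(cat /proc/sys/kernel/core_pattern 2>/dev/null || true)"
echo "core_pattern: ${current:-<unreadable>} ($(classify_core_pattern "$current"))"

# A core size limit of 0 suppresses file-based core dumps for this shell.
echo "core size limit: $(ulimit -c)"
```

If the size limit prints 0, raising it with ulimit -c unlimited in the shell that launches the process is usually needed in addition to setting the pattern.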
Refer to Eclipse OpenJ9 issue https://github.com/eclipse-openj9/openj9/issues/18115
To get system core files for this defect, follow https://eclipse.dev/openj9/docs/xdump/#piped-system-dumps to pipe core dumps to the worker node using systemd-coredump
Note that there are two known OpenShift issues when copying the core files to a local machine. The first is 1358, which prevents copying files with \
. The second is error: unexpected EOF
when copying a directory, so you need to run oc cp --retries=-1
....
For example: oc cp --retries=-1 worker0-xxx-debug:/host/var/lib/systemd/coredump/ coredump
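The copy step above can be wrapped in a small script so the command is printed before it runs. This is only a sketch: the function name build_copy_cmd and the DEBUG_POD, DEST, and RUN variables are hypothetical, and the pod name is the placeholder from the example above, to be replaced with the real debug pod.

```shell
#!/usr/bin/env bash
# Hypothetical defaults; replace with the actual debug pod and destination.
DEBUG_POD="${DEBUG_POD:-worker0-xxx-debug}"
DEST="${DEST:-coredump}"

# Build the oc cp command string for the systemd-coredump directory.
# --retries=-1 retries indefinitely, working around "unexpected EOF".
build_copy_cmd() {
  printf 'oc cp --retries=-1 %s:/host/var/lib/systemd/coredump/ %s\n' "$1" "$2"
}

cmd="$(build_copy_cmd "$DEBUG_POD" "$DEST")"
echo "$cmd"

# Run the copy only when RUN=1 is set and the oc client is installed.
if [ "${RUN:-0}" = "1" ] && command -v oc >/dev/null 2>&1; then
  $cmd && ls -l "$DEST"
fi
```

By default the script only prints the command; setting RUN=1 performs the copy and lists what arrived locally.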
This is a bug in the JVM, specific to Power 10, and not related to InstantOn.
It seems a fix for this problem was merged into the Semeru development stream for their next release a few weeks back (there was a failure in a scenario unrelated to InstantOn). The next question would be how Liberty SVT plans to obtain a fixed Semeru build to verify that this works in the environment being tested.
Deployed the checkpoint app image to a Power10 OCP 4.13 cluster. When running the application, it failed, and the app log has the following errors:
Below is the full log:
% oc exec ebuy-olf-j17-0 -- ls -l /opt/ol/wlp/output/defaultServer/
total 0
drwxrwx---. 1 default root  19 Sep  8 21:52 logs
drwxrwx---. 1 default root  22 Sep  8 21:52 resources
drwxrwx---. 1 default root 100 Sep  8 22:57 workarea
%