ibmruntimes / Semeru-Runtimes

Issue repo for all things IBM Semeru Runtimes

OpenJ9 crash on startup under aarch64 emulation on x86 #11

Status: Closed (sdwr98 closed 2 years ago)

sdwr98 commented 2 years ago

Hello,

This is a follow-up to the issue raised in #5; I'm seeing it as well. When running the aarch64 OpenJ9 JVM in Docker on an x86 machine (for cross-platform builds, for example), the JVM crashes on startup with the error:

Error: Port Library failed to initialize: -1
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

This does not happen under Docker on a native aarch64 machine, however. It is reproducible with Java 8, 11, and 16. Running a docker build on an x86 machine with the build command and Dockerfile below should trigger the issue:

docker build --platform=linux/arm64 -f Dockerfile .

FROM amazonlinux:2.0.20211201.0-with-sources

RUN  yum install -y alsa-lib dejavu-sans-fonts fontconfig freetype libX11 libXext libXi libXrender libXtst
RUN  rpm -i https://github.com/ibmruntimes/semeru8-binaries/releases/download/jdk8u312-b07_openj9-0.29.0/ibm-semeru-open-8-jdk-1.8.0.312.b07_0.29.0-1.aarch64.rpm
RUN  java -version

The regular OpenJDK builds (as provided in amazonlinux) run fine; only the OpenJ9 builds appear to be affected.

pshipton commented 2 years ago

@knn-k fyi. If we can reproduce this with a later JVM, we'll get more information than -1 thanks to changes that have already gone in, and if necessary we can keep modifying the port library to narrow down the cause.

pshipton commented 2 years ago

@sdwr98 if you can repeat this using a nightly build that would help narrow it down. The latest is https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Build_JDK8_aarch64_linux_Nightly/47/OpenJ9-JDK8-aarch64_linux-20211206-021227.tar.gz

sdwr98 commented 2 years ago

With the latest nightly build, I get this now:

bash-4.2# ./java
Error: Port Library failed to initialize: -86
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

pshipton commented 2 years ago

-86 corresponds to OMRPORT_ERROR_STARTUP_SIGNAL_TOOLS10, which indicates a failure to create the asynchSignalReporterThread. See https://github.com/eclipse/omr/blob/master/port/unix/omrsignal.c#L1624-L1634
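
The step from the launcher's message to a named error can be sketched as follows. This is a minimal illustration in Python, not part of OMR or OpenJ9: the function name `parse_port_library_error` and the `KNOWN_CODES` table are hypothetical, and the table holds only the single code identified in this thread.

```python
import re

# Only the code identified in this thread; other codes are deliberately omitted.
KNOWN_CODES = {-86: "OMRPORT_ERROR_STARTUP_SIGNAL_TOOLS10"}

# Matches the launcher message format shown in the logs in this thread.
_PATTERN = re.compile(r"Port Library failed to initialize: (-?\d+)")


def parse_port_library_error(output):
    """Return (code, name) from launcher output, or None if no error line is found."""
    match = _PATTERN.search(output)
    if match is None:
        return None
    code = int(match.group(1))
    return code, KNOWN_CODES.get(code, "unknown")


log = (
    "Error: Port Library failed to initialize: -86\n"
    "Error: Could not create the Java Virtual Machine.\n"
)
print(parse_port_library_error(log))
```

Codes that are not in the table, such as the -1 reported by the older build, come back labelled "unknown" rather than guessed.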

I'll see about additional changes to narrow it down further.

Trying to duplicate the problem myself, although I specified --platform=linux/arm64 I got the following.

 ---> [Warning] The requested image's platform (linux/arm64/v8) does not match the detected host platform (linux/amd64) and no specific platform was requested
 ---> Running in ad6abdc333f7
standard_init_linux.go:219: exec user process caused: exec format error

sdwr98 commented 2 years ago

Trying to duplicate the problem myself, although I specified --platform=linux/arm64 I got the following.

Do you have docker experimental features turned on? That's a weird error...

pshipton commented 2 years ago

Not sure what's wrong with my Docker, but if you are willing to try builds, I can create debug JVMs to track it down. I've added some debug prints to the following build; based on the results, I will add more if necessary. https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Build_JDK8_aarch64_linux_Personal/32/OpenJ9-JDK8-aarch64_linux-20211206-203558.tar.gz

sdwr98 commented 2 years ago

Happy to help!

bash-4.2# ./bin/java
createThreadWithCategory 1073741830
Error: Port Library failed to initialize: -86
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

pshipton commented 2 years ago

1073741830 is 0x40000006, which means pthread_create() failed. I've added more debug output to get the failure code; please try this one: https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Build_JDK8_aarch64_linux_Personal/33/OpenJ9-JDK8-aarch64_linux-20211206-231001.tar.gz
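
The decimal-to-hex conversion in this comment can be checked with a short Python sketch. It assumes, beyond what the thread states, that the high bit 0x40000000 acts as a separate flag and the low bits carry the error code; the names `ASSUMED_FLAG_BIT` and `split_status` are hypothetical.

```python
ASSUMED_FLAG_BIT = 0x40000000  # assumption: treated here as a flag bit


def split_status(value):
    """Split a status value into (flag_is_set, remaining_low_bits)."""
    return bool(value & ASSUMED_FLAG_BIT), value & ~ASSUMED_FLAG_BIT


status = 1073741830  # value printed by createThreadWithCategory in the debug build
print(hex(status))           # 0x40000006
print(split_status(status))  # (True, 6)
```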

sdwr98 commented 2 years ago

bash-4.2# ./bin/java
osthread_create 38
createThreadWithCategory 1073741830
Error: Port Library failed to initialize: -86
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

pshipton commented 2 years ago

Error 38 is ENOSYS ("Invalid system call number"). I've never seen that before, but if Docker doesn't support pthread_create(), then OpenJ9 can't work.
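
For reference, an errno value can be looked up by number with Python's standard library. The sketch below is only a convenience for reading debug output like the above; `describe_errno` is a hypothetical helper. Because errno numbering is platform-specific, the lookup should be run on the same kind of system (here Linux) that produced the number.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic_name, message) for an errno value on the current platform."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# 38 is the value printed by osthread_create in the debug build.
name, message = describe_errno(38)
print(name, "-", message)
```

On Linux this should report ENOSYS, consistent with the reading given in this comment.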

sdwr98 commented 2 years ago

@pshipton Looks like this is coming from outside your control, so feel free to close this issue. Thank you so much for your quick attention and help with debugging! I wouldn't have gotten pointed in the right direction without it.

pshipton commented 2 years ago

I don't have permission to close it, but you can, or @AdamBrousseau can.