eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.

JDK 18 - Crash using the Panama FFI API #14205

Closed Thihup closed 1 year ago

Thihup commented 2 years ago

Java -version output

openjdk 18-internal 2022-03-22
OpenJDK Runtime Environment (build 18-internal+0-adhoc.jenkins.BuildJDK18x86-64linuxNightly)
Eclipse OpenJ9 VM (build master-202f718b04f, JRE 18 Linux amd64-64-Bit Compressed References 20211223_8 (JIT enabled, AOT enabled)
OpenJ9 - 202f718b04f
OMR - 7589ce4381b
JCL - 4f958c12587 based on jdk-18+28)

Summary of problem

I created a sample program with the Panama FFI API on JDK 18 from Hotspot. After that, I was curious to try the JDK 18 nightly builds from OpenJ9. The program runs just fine, but when I click to end the game, the JVM crashes. I could not work out why.

The project: https://github.com/Thihup/super-thiagout

mvn package
java -Djava.library.path="/usr/lib/x86_64-linux-gnu" --enable-native-access=dev.thihup.superthiagout -p target/super-thiagout-1.0-SNAPSHOT.jar -m dev.thihup.superthiagout/dev.thihup.superthiagout.Main

The project needs the SDL 1.2 library. You can install it on Ubuntu using:

sudo apt install libsdl1.2-dev

Diagnostic files

Stderr

JVMDUMP039I Processing dump event "traceassert", detail "" at 2021/12/28 10:50:08 - please wait.
JVMDUMP032I JVM requested System dump using '/workspace/super-thiagout/core.20211228.105008.1000627.0001.dmp' in response to an event
JVMPORT030W /proc/sys/kernel/core_pattern setting "|/usr/share/apport/apport %p %s %c %d %P %E" specifies that the core dump is to be piped to an external program.  Attempting to rename either core or core.1000653.

JVMDUMP012E Error in System dump: The core file created by child process with pid = 1000653 was not found. Expected to find core file with name "/workspace/super-thiagout/core"
JVMDUMP032I JVM requested Java dump using '/workspace/super-thiagout/javacore.20211228.105008.1000627.0002.txt' in response to an event
JVMDUMP010I Java dump written to /workspace/super-thiagout/javacore.20211228.105008.1000627.0002.txt
JVMDUMP032I JVM requested Snap dump using '/workspace/super-thiagout/Snap.20211228.105008.1000627.0003.trc' in response to an event
JVMDUMP010I Snap dump written to /workspace/super-thiagout/Snap.20211228.105008.1000627.0003.trc
JVMDUMP013I Processed dump event "traceassert", detail "".
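
The JVMPORT030W line above accounts for the missing system dump: when /proc/sys/kernel/core_pattern starts with "|", the kernel pipes the core to the named program (apport here) instead of writing a file where the JVM looks for one. A minimal sketch of that check follows. The class and method names are hypothetical and are not part of OpenJ9.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: reports whether core dumps are piped to a program.
final class CorePatternCheck {
    /** True when the pattern's first character is '|', meaning the core is piped. */
    static boolean isPiped(String corePattern) {
        return corePattern.startsWith("|");
    }

    public static void main(String[] args) throws IOException {
        Path patternFile = Path.of("/proc/sys/kernel/core_pattern");
        if (!Files.isReadable(patternFile)) {
            System.out.println("core_pattern is not readable on this system");
            return;
        }
        String pattern = Files.readString(patternFile);
        if (isPiped(pattern)) {
            System.out.println("Core dumps are piped to: " + pattern.substring(1).trim());
        } else {
            System.out.println("Core dumps are written using pattern: " + pattern.trim());
        }
    }
}
```

Only the system dump is affected by this setting; the javacore and Snap files in the log above were still written.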

logs.zip

pshipton commented 2 years ago

3XEHSTTYPE 13:50:08:033430000 GMT omrport.359 - * ASSERTION FAILED at /home/jenkins/workspace/Build_JDK18_x86-64_linux_Nightly/omr/port/common/omrmemtag.c:145: ((memoryCorruptionDetected))

3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at jdk/internal/misc/Unsafe.freeMemory0(Native Method)
4XESTACKTRACE                at jdk/internal/misc/Unsafe.freeMemory(Unsafe.java:1438)
4XESTACKTRACE                at jdk/internal/foreign/NativeMemorySegmentImpl$1.cleanup(NativeMemorySegmentImpl.java:118)
4XESTACKTRACE                at jdk/internal/foreign/ResourceScopeImpl$ResourceList.cleanup(ResourceScopeImpl.java:253)
4XESTACKTRACE                at jdk/internal/foreign/ConfinedScope$ConfinedResourceList.cleanup(ConfinedScope.java:134)
4XESTACKTRACE                at jdk/internal/foreign/ResourceScopeImpl.close(ResourceScopeImpl.java:137)
4XESTACKTRACE                at dev/thihup/superthiagout/Main.main(Main.java:163)
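
The assertion fires while native memory is being freed, which suggests the allocator found the bookkeeping around a block overwritten. The sketch below shows the general guard-tag idea in Java: a tag is placed before and after the user area and both are checked later. It is a generic illustration with hypothetical names and an arbitrary tag value, and it makes no claim about how omrmemtag.c lays out or verifies its tags.

```java
import java.util.Arrays;

// Hypothetical model of a guarded allocation: [TAG][user bytes][TAG].
final class GuardedBlock {
    // Arbitrary tag bytes chosen for this sketch only.
    private static final byte[] TAG = {(byte) 0xB1, (byte) 0x23, (byte) 0x47, (byte) 0x54};

    private final byte[] raw;
    private final int userSize;

    GuardedBlock(int userSize) {
        this.userSize = userSize;
        this.raw = new byte[TAG.length + userSize + TAG.length];
        System.arraycopy(TAG, 0, raw, 0, TAG.length);                     // header tag
        System.arraycopy(TAG, 0, raw, TAG.length + userSize, TAG.length); // footer tag
    }

    /** Writes relative to the user area without checking userSize, as native code could. */
    void rawWrite(int userOffset, byte value) {
        raw[TAG.length + userOffset] = value;
    }

    /** True when either tag no longer matches, i.e. something wrote outside the user area. */
    boolean isCorrupted() {
        int footerStart = TAG.length + userSize;
        boolean headerOk = Arrays.equals(raw, 0, TAG.length, TAG, 0, TAG.length);
        boolean footerOk = Arrays.equals(raw, footerStart, footerStart + TAG.length, TAG, 0, TAG.length);
        return !(headerOk && footerOk);
    }

    public static void main(String[] args) {
        GuardedBlock block = new GuardedBlock(3);
        block.rawWrite(2, (byte) 1);             // last valid byte
        System.out.println(block.isCorrupted());
        block.rawWrite(3, (byte) 0x7F);          // one byte past the end, lands in the footer tag
        System.out.println(block.isCorrupted());
    }
}
```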

pshipton commented 2 years ago

@ChengJin01 @tajila

ChengJin01 commented 2 years ago

@Thihup,

1) Please advise how to reproduce the issue you encountered with your program, as the build fails with a bunch of errors on my side, even with OpenJDK18/Hotspot, as follows:

$ java -version
openjdk version "18-ea" 2022-03-15
OpenJDK Runtime Environment (build 18-ea+25-1670)
OpenJDK 64-Bit Server VM (build 18-ea+25-1670, mixed mode, sharing)

$ mvn package
[ERROR] Error executing Maven.
[ERROR] java.lang.IllegalStateException: Unable to load cache item
[ERROR] Caused by: Unable to load cache item
[ERROR] Caused by: Could not initialize class com.google.inject.internal.cglib.core.$MethodWrapper
[ERROR] Caused by: Exception com.google.inject.internal.cglib.core.$CodeGenerationException: java.lang.reflect.InaccessibleObjectException-->Unable to make protected final java.lang.Class java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain) throws java.lang.ClassFormatError accessible: module java.base does not "opens java.lang" to unnamed module @49c43f4e [in thread "main"]

2) The existing implementation only supports primitive types, so please don't verify any struct-specific FFI APIs yet, as we have not merged that code into the repo. I noticed the related code in your program at https://github.com/Thihup/super-thiagout/blob/main/src/main/java/dev/thihup/superthiagout/sdl/SDL.java, e.g.

public static final GroupLayout SDL_ACTIVE_EVENT_MEMORY_LAYOUT = MemoryLayout.structLayout(
    ValueLayout.JAVA_BYTE.withName("type"),
    ValueLayout.JAVA_BYTE.withName("gain"),
    ValueLayout.JAVA_BYTE.withName("state")
);

Thihup commented 2 years ago

Hi @ChengJin01!

About 1: Could you try using Maven 3.8.3? I think this error happens when using Maven 3.6.x.

About 2: Could you give me some examples of how to use only primitives? 😅

ChengJin01 commented 2 years ago

@Thihup,

1) Thanks for the reminder. I can reproduce the crash with the latest version of Maven (3.8.4).

2) We have many test cases intended for primitive types at https://github.com/eclipse-openj9/openj9/tree/master/test/functional/Java18andUp/src/org/openj9/test/jep419/downcall.

ChengJin01 commented 2 years ago

Our FFI implementation (including the downcall symbol, which is allocated natively and released by our own code) doesn't rely on ResourceScope to release memory in cleanup(). The Java stack trace above, however, shows that the crash occurred via cleanup(), so it most likely has nothing to do with FFI and instead lies in the code behind the Foreign Memory APIs.

The memory released via cleanup() was allocated in the ResourceScope try block at https://github.com/Thihup/super-thiagout/blob/main/src/main/java/dev/thihup/superthiagout/Main.java

public static void main(String[] args) throws Throwable {
    ...
    try (ResourceScope resourceScope = ResourceScope.newConfinedScope()) { // <------
        SegmentAllocator segmentAllocator = SegmentAllocator.nativeAllocator(resourceScope);
        Addressable event = segmentAllocator.allocate(SDL_EVENT_MEMORY_LAYOUT);
        ...
    }

This means the handful of variables allocated via segmentAllocator were released automatically via cleanup().
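
The try block relies on the scope running its registered cleanup actions when it closes. A toy model of that pattern in plain Java, with no Panama API involved, might look like the following. ToyScope and its demo method are hypothetical, and the real ResourceScopeImpl differs in detail (for example, this sketch silently ignores a second close).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for a confined scope: collects cleanup actions, runs them on close().
final class ToyScope implements AutoCloseable {
    private final Deque<Runnable> cleanups = new ArrayDeque<>();
    private boolean closed;

    /** Registers an action to run when the scope closes (e.g. freeing a native segment). */
    void addCloseAction(Runnable action) {
        if (closed) {
            throw new IllegalStateException("scope already closed");
        }
        cleanups.push(action);
    }

    /** Runs the registered actions, most recently registered first. */
    @Override
    public void close() {
        if (closed) {
            return;
        }
        closed = true;
        while (!cleanups.isEmpty()) {
            cleanups.pop().run();
        }
    }

    /** Records the order of events for a scope used in try-with-resources. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        try (ToyScope scope = new ToyScope()) {
            scope.addCloseAction(() -> log.add("free event"));
            scope.addCloseAction(() -> log.add("free rect"));
            log.add("body done");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this model the frees happen only after the body of the try block finishes, which matches the stack trace: the failing freeMemory call is reached from ResourceScopeImpl.close, not from any downcall.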

FYI: @babsingh, @EricYangIBM

ChengJin01 commented 2 years ago

@Thihup,

Please help to simplify your program, if possible, so that we can isolate the failing part more easily (e.g. whether a single downcall alone can trigger the issue), given that it includes so many SDL downcalls.

Thihup commented 2 years ago

@ChengJin01 Sure, I'll try to simplify it. Sorry about the confusion between FFI and the Foreign Memory API.

Thihup commented 2 years ago

@ChengJin01 Tried my best to simplify it, but it still requires some SDL calls.

https://github.com/Thihup/super-thiagout/tree/openj9-bug

One thing I noticed is that the error doesn't happen when I change from

SegmentAllocator segmentAllocator = SegmentAllocator.nativeAllocator(resourceScope);

to

SegmentAllocator segmentAllocator = SegmentAllocator.newNativeArena(resourceScope);

I don't actually know the difference between them. I used the first one and it worked in Hotspot, so I kept it, but maybe I picked the wrong one.
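
As far as the JDK 18 javadoc describes them, nativeAllocator(scope) backs every allocate call with its own native segment, while newNativeArena(scope) carves allocations out of larger blocks. The toy model below counts backing allocations under each strategy. Everything in it is hypothetical, including the 4096-byte block size, and it does not call the real API.

```java
// Hypothetical model: counts how many backing native allocations each strategy needs.
final class AllocatorModel {
    /** Models a per-segment allocator: each request gets its own backing allocation. */
    static int perSegmentBackingAllocations(int[] requestSizes) {
        return requestSizes.length;
    }

    /** Models an arena: requests are sliced from fixed-size blocks; oversized ones stand alone. */
    static int arenaBackingAllocations(int[] requestSizes, int blockSize) {
        int blocks = 0;
        int remaining = 0; // free bytes left in the current block
        for (int size : requestSizes) {
            if (size > blockSize) {
                blocks++;                     // standalone allocation, current block untouched
            } else if (size > remaining) {
                blocks++;                     // current block is full: start a new one
                remaining = blockSize - size;
            } else {
                remaining -= size;            // slice from the current block
            }
        }
        return blocks;
    }

    public static void main(String[] args) {
        int[] sizes = {8, 16, 64};            // made-up request sizes
        System.out.println(perSegmentBackingAllocations(sizes));
        System.out.println(arenaBackingAllocations(sizes, 4096));
    }
}
```

With one backing allocation per segment, each segment is freed individually when the scope closes; with an arena, several segments share one block and one free. Whether that difference explains the behaviour reported here is not established in this thread.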

ChengJin01 commented 2 years ago

@Thihup, it is hard to tell (in theory both cases should work), as it depends on how much code was modified or re-implemented natively by our colleagues working on the Foreign Memory APIs. They might need to investigate further to see what accounts for the difference.

ChengJin01 commented 1 year ago

I just verified the failing application (at https://github.com/Thihup/super-thiagout) with the latest JDK19 nightly build, and it works fine, as follows:

$ java --enable-preview -Djava.library.path="/usr/lib/x86_64-linux-gnu" --enable-native-access=dev.thihup.superthiagout -p target/super-thiagout-1.0-SNAPSHOT.jar -m dev.thihup.superthiagout/dev.thihup.superthiagout.Main
(no crash occurred after clicking to end the game)

So I believe the problem with structs is fixed, given that we have now implemented everything, including primitives and structs, in OpenJ9.

ChengJin01 commented 1 year ago

Hi @Thihup, please double-check your application with the JDK19 nightly build to see how it goes. I will close this issue as resolved once you confirm that.

Thihup commented 1 year ago

Hi @ChengJin01 Where can I get the JDK19 nightly build?

ChengJin01 commented 1 year ago

> Hi @ChengJin01 Where can I get the JDK19 nightly build?

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Build_JDK19_x86-64_linux_Nightly/136/OpenJ9-JDK19-x86-64_linux-20221213-204920.tar.gz

Thihup commented 1 year ago

@ChengJin01 It worked just fine! Thank you!