eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.

JVM_LoadLibrary should handle .a files on AIX #19344

Closed pshipton closed 1 month ago

pshipton commented 4 months ago

Re https://github.com/ibmruntimes/openj9-openjdk-jdk/pull/776#pullrequestreview-2009347916 we should update JVM_LoadLibrary on AIX to handle .a files to match the RI, and then remove the ClassLoaderHelper.mapAlternativeName() impl.

JasonFengJ9 commented 4 months ago

https://github.com/ibmruntimes/openj9-openjdk-jdk/blob/e403086080c657581f605c362417caeb6f879fdc/src/java.base/aix/classes/jdk/internal/loader/ClassLoaderHelper.java#L46-L48

AIX implementation of JVM_LoadLibrary handles the alternate path name mapping. If loading of the given library name with ".so" suffix fails, it will attempt to load the library of the same name with ".a" suffix as the alternate name.

https://github.com/eclipse-openj9/openj9/blob/7f82a7d9d55a0e22c209582b9d92654a01a1df7c/runtime/j9vm/jvm.c#L3948 invokes https://github.com/eclipse-openj9/openj9/blob/7f82a7d9d55a0e22c209582b9d92654a01a1df7c/runtime/j9vm/jvm.c#L3989, which calls omr/port/aix/omrsl.c:omrsl_open_shared_library. There, lib%s.a is attempted if loading "lib%s" PLATFORM_DLL_EXTENSION (.so) failed. The .so name is attempted first - https://github.com/eclipse-openj9/openj9-omr/blob/9083c8237ac215927ac55b5db256780132983136/port/aix/omrsl.c#L236-L241

    pathLength = portLibrary->str_printf(portLibrary, mangledName, (EsMaxPath + 1), "%.*slib%s" PLATFORM_DLL_EXTENSION, (uintptr_t)fileName + 1 - (uintptr_t)name, name, fileName + 1);
} else {
    pathLength = portLibrary->str_printf(portLibrary, mangledName, (EsMaxPath + 1), "lib%s" PLATFORM_DLL_EXTENSION, name);

If the loading above failed, try .a afterwards - https://github.com/eclipse-openj9/openj9-omr/blob/9083c8237ac215927ac55b5db256780132983136/port/aix/omrsl.c#L310-L315

    pathLength = portLibrary->str_printf(portLibrary, mangledName, (EsMaxPath + 1), "%.*slib%s.a", (uintptr_t)fileName + 1 - (uintptr_t)name, name, fileName + 1);
} else {
    pathLength = portLibrary->str_printf(portLibrary, mangledName, (EsMaxPath + 1), "lib%s.a", name);

OpenJ9 JVM_LoadLibrary already handles both .so and .a.
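
The fallback order described above can be sketched in Java as follows. This is only an illustration of the two decorated-name attempts quoted from omrsl.c, not the OMR code itself; the real function has further branches that are not shown in this thread, and the names `AixNameFallback` and `candidateNames` are hypothetical.

```java
import java.util.List;

// Hypothetical illustration; not part of OMR or OpenJ9.
final class AixNameFallback {
    // Mirrors the two str_printf patterns quoted above: keep any directory
    // prefix, prepend "lib" to the base name, and try ".so" before ".a".
    static List<String> candidateNames(String name) {
        int slash = name.lastIndexOf('/');
        String dir = name.substring(0, slash + 1); // empty when there is no '/'
        String base = name.substring(slash + 1);
        return List.of(dir + "lib" + base + ".so", dir + "lib" + base + ".a");
    }
}
```

Under these assumptions, `AixNameFallback.candidateNames("foobar")` lists `libfoobar.so` first and `libfoobar.a` second, which is the order in which the quoted code attempts them.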

pshipton commented 4 months ago

OpenJ9 JVM_LoadLibrary already handles both .so and .a

Does that mean there is nothing to do other than remove the ClassLoaderHelper.mapAlternativeName() impl?

JasonFengJ9 commented 4 months ago

OpenJ9 JVM_LoadLibrary already handles both .so and .a

Does that mean there is nothing to do other than remove the ClassLoaderHelper.mapAlternativeName() impl?

I expect so. https://github.com/ibmruntimes/openj9-openjdk-jdk/commit/04c93d585c7589e89caa70e61970924021da4480 was authored by @suchismith1993, Suchi, could you comment on if OpenJ9 JCL patch ClassLoaderHelper.mapAlternativeName() impl can be removed and use the upstream version instead?

    static File mapAlternativeName(File lib) {
        String name = lib.toString();
        int index = name.lastIndexOf('.');
        if (index < 0) {
            return null;
        }
        return new File(name.substring(0, index) + ".a");
    }

pshipton commented 4 months ago

Well it can't be removed, we'd change the OpenJ9 mapAlternativeName() to return null.
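
For comparison, a minimal sketch of the two behaviours under discussion. The first method repeats the logic of the OpenJ9 patch quoted earlier; the second is the assumed replacement that simply returns null, as suggested in this comment. The class and method names are hypothetical stand-ins, not the JDK's ClassLoaderHelper.

```java
import java.io.File;

// Hypothetical stand-in for the two variants discussed; not the JDK class itself.
final class AlternativeNameSketch {
    // Behaviour of the OpenJ9 patch quoted above: swap the last suffix for ".a".
    static File patched(File lib) {
        String name = lib.toString();
        int index = name.lastIndexOf('.');
        return (index < 0) ? null : new File(name.substring(0, index) + ".a");
    }

    // Assumed replacement: no alternative name is offered from Java,
    // so any ".a" fallback is left to the native code.
    static File returningNull(File lib) {
        return null;
    }
}
```

With the first variant, `libfoo.so` maps to `libfoo.a`; with the second, the Java side never proposes a second file name.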

pshipton commented 4 months ago

Unless the OpenJ9 mapAlternativeName() is the only diff from upstream.

keithc-ca commented 4 months ago

I agree with @JasonFengJ9: it appears that OMR tries loading from a .a archive if a .so can't be loaded. After verifying, we should be able to adopt the AIX variant of ClassLoaderHelper from upstream.

JasonFengJ9 commented 4 months ago

I had a personal build w/o the OpenJ9 JCL patch and attempted to run the newly added OpenJDK test - https://github.com/ibmruntimes/openj9-openjdk-jdk/blob/openj9/test/jdk/java/lang/RuntimeTests/loadLibrary/aix/LoadAIXLibraryFromArchiveObject.java. There were some issues with the grinder git clone; the current grinder is waiting for machines.

keithc-ca commented 4 months ago

That doesn't look good:

14:21:59  attempting to load library foobar
14:21:59  Exception in thread "main" java.lang.UnsatisfiedLinkError: Can't load foobar
14:21:59    at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
14:21:59    at java.base/java.lang.System.loadLibrary(System.java:806)
14:21:59    at LoadAIXLibraryFromArchiveObject$LoadLibraryApp.main(LoadAIXLibraryFromArchiveObject.java:67)

JasonFengJ9 commented 4 months ago

Yeah, I saw that as well. Another grinder with an earlier build w/o the change passed. There are no Temurin RI JDK23 AIX builds.

pshipton commented 4 months ago

I've created one from the latest source, df04358223e.

openjdk version "23-internal" 2024-09-17
OpenJDK Runtime Environment (build 23-internal-adhoc.jenkins.jdk)
OpenJDK 64-Bit Server VM (build 23-internal-adhoc.jenkins.jdk, mixed mode)

It's on paix822 (/home/jenkins/peter/jdk/build/aix-ppc64-server-release/images/jdk), and I believe paix822 is the only machine it will run on since it needs the XL C 17 Runtime. I'll download it and upload it to jenkins, but that will take some time.

pshipton commented 4 months ago

https://hyc-runtimes-jenkins.swg-devops.com/view/J9-infra/job/Upload-To-UserContent/7

JasonFengJ9 commented 4 months ago

Grinder w/ latest internal build w/o change - passed

suchismith1993 commented 4 months ago

OpenJ9 JVM_LoadLibrary already handles both .so and .a

Does that mean there is nothing to do other than remove the ClassLoaderHelper.mapAlternativeName() impl?

I expect so. ibmruntimes/openj9-openjdk-jdk@04c93d5 was authored by @suchismith1993, Suchi, could you comment on if OpenJ9 JCL patch ClassLoaderHelper.mapAlternativeName() impl can be removed and use the upstream version instead?

    static File mapAlternativeName(File lib) {
        String name = lib.toString();
        int index = name.lastIndexOf('.');
        if (index < 0) {
            return null;
        }
        return new File(name.substring(0, index) + ".a");
    }

In addition to removing the implementation, we need to set loadLibraryOnlyIfPresent to false. That way control is passed to the native code to handle the cases for AIX dynamic libraries.

JasonFengJ9 commented 4 months ago

In addition to removing the implementation, we need to set loadLibraryOnlyIfPresent to false. That way control is passed to the native code to handle the cases for AIX dynamic libraries.

Is loadLibraryOnlyIfPresent (set to false) at https://github.com/ibmruntimes/openj9-openjdk-jdk/blob/a2f1462ae3fb2aca3b4bb177f04277c145c7771b/src/java.base/aix/classes/jdk/internal/loader/ClassLoaderHelper.java#L47 ?

    static boolean loadLibraryOnlyIfPresent() {
        return false;
    }

This doesn't pass LoadAIXLibraryFromArchiveObject.java; it seems an OpenJ9 change is still required before we can adopt the upstream version.

tajila commented 2 months ago

@JasonFengJ9 What is the status of this issue?

JasonFengJ9 commented 1 month ago

@tajila Apologies, I was distracted by other items; I will make this a priority.

ChengJin01 commented 1 month ago

The issue isn't really resolved, as we still end up failing to load a .a library when building the FFI jextract tool on AIX.

To be specific, I tried to build the jextract tool on AIX by following the instructions at https://github.com/openjdk/jextract?tab=readme-ov-file#building but it failed in creating the jextract image as follows:

> Task :createJextractImage

> Task :verify FAILED
Exception in thread "main" java.lang.ExceptionInInitializerError
        at org.openjdk.jextract@22/org.openjdk.jextract.clang.LibClang.<clinit>(Unknown Source) <-----
        at org.openjdk.jextract@22/org.openjdk.jextract.impl.Parser.parse(Unknown Source)
        at org.openjdk.jextract@22/org.openjdk.jextract.JextractTool.parseInternal(Unknown Source)
        at org.openjdk.jextract@22/org.openjdk.jextract.JextractTool.run(Unknown Source)
        at org.openjdk.jextract@22/org.openjdk.jextract.JextractTool.main(Unknown Source)
Caused by: java.lang.UnsatisfiedLinkError: Failed to load library ("/home/jenkins/jchau_ffi/jextract/build/jextract/runtime/lib/libclang.a")  
0509-022 Cannot load module /home/jenkins/jchau_ffi/jextract/build/jextract/runtime/lib/libclang.a. <---------
        0509-026 System error: Cannot run a file that does not have a valid format.
        at java.base/jdk.internal.loader.NativeLibraries.load(Native Method) <--------
        at java.base/jdk.internal.loader.NativeLibraries$NativeLibraryImpl.open(Unknown Source)
        at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(Unknown Source)
        at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(Unknown Source)
        at java.base/jdk.internal.loader.NativeLibraries.findFromPaths(Unknown Source)
        at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(Unknown Source)
        at java.base/java.lang.ClassLoader.loadLibrary(Unknown Source)
        at java.base/java.lang.System.loadLibrary(Unknown Source)
        at org.openjdk.jextract@22/org.openjdk.jextract.clang.libclang.Index_h.<clinit>(Unknown Source)
        ... 5 more

So I tried to narrow down to the failing code in jextract at https://github.com/openjdk/jextract/blob/e55aacfda23211406552357095b6eb780f0fa8e5/src/main/java/org/openjdk/jextract/clang/LibClang.java#L91 https://github.com/openjdk/jextract/blob/e55aacfda23211406552357095b6eb780f0fa8e5/src/main/java/org/openjdk/jextract/clang/libclang/CXString.java#L55 https://github.com/openjdk/jextract/blob/e55aacfda23211406552357095b6eb780f0fa8e5/src/main/java/org/openjdk/jextract/clang/libclang/Index_h.java#L86

static {
        String libName = System.getProperty("os.name").startsWith("Windows") ? "libclang" : "clang";
        System.loadLibrary(libName); <----------- commented out
    }

and commented it out to avoid loading libclang.a on AIX, but it then failed to locate clang_createIndex (which exists in libclang.a) because loading libclang.a was skipped:

Exception in thread "main" java.lang.UnsatisfiedLinkError: unresolved symbol: clang_createIndex <------- the native function specified in libclang.a
        at org.openjdk.jextract@22/org.openjdk.jextract.clang.libclang.Index_h.lambda$findOrThrow$0(Unknown Source)
        at java.base/java.util.Optional.orElseThrow(Unknown Source)
        at org.openjdk.jextract@22/org.openjdk.jextract.clang.libclang.Index_h.findOrThrow(Unknown Source)
        at org.openjdk.jextract@22/org.openjdk.jextract.clang.libclang.Index_h$clang_createIndex.<clinit>(Unknown Source)
        at org.openjdk.jextract@22/org.openjdk.jextract.clang.libclang.Index_h.clang_createIndex(Unknown Source)
        at org.openjdk.jextract@22/org.openjdk.jextract.clang.LibClang.createIndex(Unknown Source)
        at org.openjdk.jextract@22/org.openjdk.jextract.impl.Parser.parse(Unknown Source)
        at org.openjdk.jextract@22/org.openjdk.jextract.JextractTool.parseInternal(Unknown Source)
        at org.openjdk.jextract@22/org.openjdk.jextract.JextractTool.run(Unknown Source)
        at org.openjdk.jextract@22/org.openjdk.jextract.JextractTool.main(Unknown Source)

Looking at the symbol table in libclang.a

/home/jenkins/jchau_ffi/jextract/build/jmod_inputs/libs/libclang.a[libclang.so.18.1]:
                        ***Loader Section***
                        ***Loader Symbol Table Information***
[Index]      Value      Scn     IMEX Sclass   Type           IMPid Name

[0]     0x11041a670    .data      EXP     RW SECdef        [noIMid] __cdtors
[1]     0x11041d020    .data              RW SECdef        [noIMid] __priority0x80000000
[2]     0x00000000    undef      IMP     UA EXTref libc.a(shr_64.o) environ
[3]     0x00000000    undef      IMP     UA EXTref libc.a(shr_64.o) errno
[4]     0x00000000    undef      IMP     DS EXTref libc.a(shr_64.o) access
[5]     0x00000000    undef      IMP     DS EXTref libc.a(shr_64.o) chdir
[6]     0x00000000    undef      IMP     DS EXTref libc.a(shr_64.o) close
...
[342]   0x1104e30c0    .data      EXP     DS SECdef        [noIMid] clang_isStatement
[343]   0x1104e30d8    .data      EXP     DS SECdef        [noIMid] clang_isExpression
[344]   0x1104e30f0    .data      EXP     DS SECdef        [noIMid] clang_isTranslationUnit
[345]   0x1104e3108    .data      EXP     DS SECdef        [noIMid] clang_isAttribute
[346]   0x1104e3120    .data      EXP     DS SECdef        [noIMid] clang_createIndex  <---------------------
[347]   0x1104e31c8    .data      EXP     DS SECdef        [noIMid] clang_disposeIndex

That being said, the library loading issue still occurs on AIX, and we can't skip the load because it is required to build jextract.

keithc-ca commented 1 month ago

Can you try running your test case with the new tracepoints enabled so we can understand what's happening?

ChengJin01 commented 1 month ago

Can you try running your test case with the new tracepoints enabled so we can understand what's happening?

This has nothing to do with any OpenJ9 or other test case; it is the jextract build described at https://github.com/openjdk/jextract?tab=readme-ov-file#building (built with gradle; I am not sure whether it is possible to add tracepoints somewhere in the framework settings/scripts).

JasonFengJ9 commented 1 month ago

0509-022 Cannot load module /home/jenkins/jchau_ffi/jextract/build/jextract/runtime/lib/libclang.a. <--------- 0509-026 System error: Cannot run a file that does not have a valid format.

Can we run the following code snippet in a standalone testcase?

System.load("/home/jenkins/jchau_ffi/jextract/build/jextract/runtime/lib/libclang.a");

Does OpenJ9 throw UnsatisfiedLinkError? Does RI throw UnsatisfiedLinkError?

In addition, try the following code snippet in a standalone testcase with a command line option -Djava.library.path=/home/jenkins/jchau_ffi/jextract/build/jextract/runtime/lib

System.loadLibrary("clang");

What's the test output? Does RI pass?
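
A standalone probe along these lines could look like the sketch below. It reports the outcome of each call as text rather than letting the error escape from a static initializer, which makes it easier to compare OpenJ9 and the RI side by side. The class and method names are hypothetical.

```java
// Hypothetical helper for comparing JVMs; the names are illustrative only.
final class LoadProbe {
    // Attempts System.load on an absolute path and reports the outcome as text.
    static String tryLoad(String absolutePath) {
        try {
            System.load(absolutePath);
            return "LOADED " + absolutePath;
        } catch (UnsatisfiedLinkError e) {
            return "FAILED " + absolutePath + ": " + e.getMessage();
        }
    }

    // Same idea for System.loadLibrary, which searches java.library.path.
    static String tryLoadLibrary(String libName) {
        try {
            System.loadLibrary(libName);
            return "LOADED " + libName;
        } catch (UnsatisfiedLinkError e) {
            return "FAILED " + libName + ": " + e.getMessage();
        }
    }
}
```

Printing `LoadProbe.tryLoad(...)` for the full libclang.a path and `LoadProbe.tryLoadLibrary("clang")` with -Djava.library.path set would cover both questions in one run.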

ChengJin01 commented 1 month ago

In addition, try the following code snippet in a standalone testcase with a command line option -Djava.library.path=/home/jenkins/jchau_ffi/jextract/build/jextract/runtime/lib ...

I tried the following test case on JDK17 & JDK22

public class LibTest {
    static {
        System.loadLibrary("clang");
        System.out.println("passed");
    }

    public static void main(String[] args) {
        System.out.println("lib loading test");
    }
}

(1) JDK17: (required by gradle in building the jextract tool)

-bash-5.0$ ../jdk17_openj9_ffi_cmk_aix_ppc64/bin/javac  LibTest.java
-bash-5.0$ ../jdk17_openj9_ffi_cmk_aix_ppc64/bin/java -Djava.library.path=/home/jenkins/jchau_ffi/jextract/build/jextract/runtime/lib LibTest
Exception in thread "main" java.lang.UnsatisfiedLinkError: Failed to load library ("/home/jenkins/jchau_ffi/jextract/build/jextract/runtime/lib/libclang.a") 
0509-022 Cannot load module /home/jenkins/jchau_ffi/jextract/build/jextract/runtime/lib/libclang.a.
        0509-026 System error: Cannot run a file that does not have a valid format.
        at java.base/jdk.internal.loader.NativeLibraries.load(Native Method)
        at java.base/jdk.internal.loader.NativeLibraries$NativeLibraryImpl.open(NativeLibraries.java:395)
        at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:239)
        at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:181)
        at java.base/jdk.internal.loader.NativeLibraries.findFromPaths(NativeLibraries.java:328)
        at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:294)
        at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:1800)
        at java.base/java.lang.System.loadLibrary(System.java:770)
        at LibTest.<clinit>(LibTest.java:3)

(2) JDK22:

-bash-5.0$ ../jdk22_openj9_ffi_cmk_aix_ppc64/bin/javac  LibTest.java
-bash-5.0$ ../jdk22_openj9_ffi_cmk_aix_ppc64/bin/java -Djava.library.path=/home/jenkins/jchau_ffi/jextract/build/jextract/runtime/lib LibTest
-bash-5.0$ (NO output /still got stuck in System.loadLibrary("clang"))

Unfortunately, there is no RI (JDK17+) for reference on AIX.

JasonFengJ9 commented 1 month ago

What's the output for the following code snippet?

System.load("/home/jenkins/jchau_ffi/jextract/build/jextract/runtime/lib/libclang.a");

Unfortunately, there is no RI (JDK17+) for reference on AIX.

RI AIX JDK17 - https://github.com/adoptium/temurin17-binaries/releases/download/jdk-17.0.12%2B7/OpenJDK17U-jdk_ppc64_aix_hotspot_17.0.12_7.tar.gz RI AIX JDK22 - https://github.com/adoptium/temurin22-binaries/releases/download/jdk-22.0.2%2B9/OpenJDK22U-jdk_ppc64_aix_hotspot_22.0.2_9.tar.gz

ChengJin01 commented 1 month ago

What's the output for the following code snippet? System.load("/home/jenkins/jchau_ffi/jextract/build/jextract/runtime/lib/libclang.a");

Tried the code as follows:

public class LibTest2 {
    static {
        System.load("/home/jenkins/jchau_ffi/jextract/build/jextract/runtime/lib/libclang.a");
        System.out.println("passed");
    }

    public static void main(String[] args) {
        System.out.println("lib loading test");
    }
}

and still ended up with the same result.

-bash-5.0$ ../jdk17_openj9_ffi_cmk_aix_ppc64/bin/javac  LibTest2.java
-bash-5.0$ ../jdk17_openj9_ffi_cmk_aix_ppc64/bin/java  LibTest2
Exception in thread "main" java.lang.UnsatisfiedLinkError: Failed to load library
 ("/home/jenkins/jchau_ffi/jextract/build/jextract/runtime/lib/libclang.a")
0509-022 Cannot load module /home/jenkins/jchau_ffi/jextract/build/jextract/runtime/lib/libclang.a.
        0509-026 System error: Cannot run a file that does not have a valid format.
        at java.base/jdk.internal.loader.NativeLibraries.load(Native Method)
        at java.base/jdk.internal.loader.NativeLibraries$NativeLibraryImpl.open(NativeLibraries.java:395)
        at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:239)
        at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:181)
        at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:1770)
        at java.base/java.lang.System.load(System.java:741)
        at LibTest2.<clinit>(LibTest2.java:3)
-bash-5.0$
-bash-5.0$ ../jdk22_openj9_ffi_cmk_aix_ppc64/bin/javac  LibTest2.java
-bash-5.0$ ../jdk22_openj9_ffi_cmk_aix_ppc64/bin/java  LibTest2 (no output)
-bash-5.0$

ChengJin01 commented 1 month ago

RI AIX JDK17 - https://github.com/adoptium/temurin17-binaries/releases/download/jdk-17.0.12%2B7/OpenJDK17U-jdk_ppc64_aix_hotspot_17.0.12_7.tar.gz RI AIX JDK22 - https://github.com/adoptium/temurin22-binaries/releases/download/jdk-22.0.2%2B9/OpenJDK22U-jdk_ppc64_aix_hotspot_22.0.2_9.tar.gz

(1) JDK17/Hotspot: (the same failure as OpenJ9)

-bash-5.0$ jdk17_hotspot_aix_ppc64/bin/java -version
openjdk version "17.0.12" 2024-07-16
OpenJDK Runtime Environment Temurin-17.0.12+7 (build 17.0.12+7)
OpenJDK 64-Bit Server VM Temurin-17.0.12+7 (build 17.0.12+7, mixed mode)

-bash-5.0$ ../jdk17_hotspot_aix_ppc64/bin/java -Djava.library.path=/home/jenkins/jchau_ffi/jextract/build/jextract/runtime/lib LibTest
Exception in thread "main" java.lang.UnsatisfiedLinkError: no clang in java.library.path: /home/jenkins/jchau_ffi/jextract/build/jextract/runtime/lib
        at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2434)
        at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:818)
        at java.base/java.lang.System.loadLibrary(System.java:1993)
        at LibTest.<clinit>(LibTest.java:3)
-bash-5.0$
-bash-5.0$ ../jdk17_hotspot_aix_ppc64/bin/java  LibTest2
Exception in thread "main" java.lang.UnsatisfiedLinkError: 
/home/jenkins/jchau_ffi/jextract/build/jextract/runtime/lib/libclang.a: /home/jenkins/jchau_ffi/jextract/build/jextract/runtime/lib/libclang.a,
LIBPATH=/home/jenkins/jchau_ffi/jdk17_hotspot_aix_ppc64/lib/server:/home/jenkins/jchau_ffi/jdk17_hotspot_aix_ppc64/lib:/home/jenkins/jchau_ffi/jdk17_hotspot_aix_ppc64/../lib, LD_LIBRARY_PATH= : 
0509-022 Cannot load module /home/jenkins/jchau_ffi/jextract/build/jextract/runtime/lib/libclang.a.
        0509-026 System error: Cannot run a file that does not have a valid format.
        at java.base/jdk.internal.loader.NativeLibraries.load(Native Method)
        at java.base/jdk.internal.loader.NativeLibraries$NativeLibraryImpl.open(NativeLibraries.java:388)
        at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:232)
        at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:174)
        at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2394)
        at java.base/java.lang.Runtime.load0(Runtime.java:755)
        at java.base/java.lang.System.load(System.java:1957)
        at LibTest2.<clinit>(LibTest2.java:3)

(2) JDK22/Hotspot:

-bash-5.0$ jdk22_hotspot_aix_ppc64/bin/java -version
Error: dl failure on line 533
Error: failed /home/jenkins/jchau_ffi/jdk22_hotspot_aix_ppc64/lib/server/libjvm.so,
because 0509-022 Cannot load module /home/jenkins/jchau_ffi/jdk22_hotspot_aix_ppc64/lib/server/libjvm.so.
        0509-150   Dependent module /usr/lib/libc++.a(shr2_64.o) could not be loaded.
        0509-152   Member shr2_64.o is not found in archive
        0509-022 Cannot load module /home/jenkins/jchau_ffi/jdk22_hotspot_aix_ppc64/lib/server/libjvm.so.
        0509-150   Dependent module /home/jenkins/jchau_ffi/jdk22_hotspot_aix_ppc64/lib/server/libjvm.so could not be loaded.

suchismith1993 commented 1 month ago

can you try loading with libclang.a(libclang.so.16) ?

ChengJin01 commented 1 month ago

can you try loading with libclang.a(libclang.so.16) ?

I already tried with libclang.a as above at https://github.com/eclipse-openj9/openj9/issues/19344#issuecomment-2254195103 (OpenJ9) & https://github.com/eclipse-openj9/openj9/issues/19344#issuecomment-2254197086 (Hotspot).

JasonFengJ9 commented 1 month ago

Exception in thread "main" java.lang.UnsatisfiedLinkError: /home/jenkins/jchau_ffi/jextract/build/jextract/runtime/lib/libclang.a: /home/jenkins/jchau_ffi/jextract/build/jextract/runtime/lib/libclang.a, LIBPATH=/home/jenkins/jchau_ffi/jdk17_hotspot_aix_ppc64/lib/server:/home/jenkins/jchau_ffi/jdk17_hotspot_aix_ppc64/lib:/home/jenkins/jchau_ffi/jdk17_hotspot_aix_ppc64/../lib, LD_LIBRARY_PATH= : 0509-022 Cannot load module /home/jenkins/jchau_ffi/jextract/build/jextract/runtime/lib/libclang.a. 0509-026 System error: Cannot run a file that does not have a valid format.

How was libclang.a generated (AIX OS/compiler version)? is there a RI version available?

ChengJin01 commented 1 month ago

How was libclang.a generated (AIX OS/compiler version)? is there a RI version available?

libclang.a (which is part of the LLVM binaries on AIX; this is not libc.a) is extracted from https://github.com/llvm/llvm-project/releases/download/llvmorg-18.1.8/clang+llvm-18.1.8-powerpc64-ibm-aix-7.2.tar.xz, which is required to generate the jextract tool as per https://github.com/openjdk/jextract (and has nothing to do with the RI).

JasonFengJ9 commented 1 month ago

How was libclang.a generated (AIX OS/compiler version)? is there a RI version available?

libclang.a (which is part of the LLVM binaries on AIX; this is not libc.a) is extracted from https://github.com/llvm/llvm-project/releases/download/llvmorg-18.1.8/clang+llvm-18.1.8-powerpc64-ibm-aix-7.2.tar.xz, which is required to generate the jextract tool as per https://github.com/openjdk/jextract (and has nothing to do with the RI).

Do we know if RI uses this library? maybe not? There are some earlier version LLVM binaries, is there a version working with OpenJ9/RI AIX Java?

ChengJin01 commented 1 month ago

Do we know if RI uses this library? maybe not?

This is unrelated to how the RI works; it concerns the jextract tool, which requires the LLVM binaries because everything in jextract is based on the C libclang API (LLVM). From https://github.com/openjdk/jextract: "Building: jextract depends on the C libclang API. To build the jextract sources, the easiest option is to download LLVM binaries for your platform, which can be found here (version 13.0.0 is recommended). Both the jextract tool and the bindings it generates depend heavily on the Foreign Function & Memory API, so a suitable jdk 22 distribution is also required."

ChengJin01 commented 1 month ago

There are some earlier version LLVM binaries, is there a version working with OpenJ9/RI AIX Java?

I don't think it is related to the LLVM binaries but the loading mechanism in JDK on AIX.

e.g. for the test case loading /usr/lib/libc.a

public class LibTest_2 {
    static {
        System.load("/usr/lib/libc.a");
        System.out.println("passed");
    }

    public static void main(String[] args) {
        System.out.println("lib loading test");
    }
}

with the same exception as above:

-bash-5.0$ ../jdk17_openj9_ffi_cmk_aix_ppc64/bin/javac  LibTest_2.java
-bash-5.0$ ../jdk17_openj9_ffi_cmk_aix_ppc64/bin/java  LibTest_2
Exception in thread "main" java.lang.UnsatisfiedLinkError:
Failed to load library ("/usr/ccs/lib/libc.a") 
0509-022 Cannot load module /usr/ccs/lib/libc.a.
        0509-026 System error: Cannot run a file that does not have a valid format.
        at java.base/jdk.internal.loader.NativeLibraries.load(Native Method)
        at java.base/jdk.internal.loader.NativeLibraries$NativeLibraryImpl.open(NativeLibraries.java:395)
        at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:239)
        at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:181)
        at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:1770)
        at java.base/java.lang.System.load(System.java:741)
        at LibTest_2.<clinit>(LibTest_2.java:3)

-bash-5.0$ ../jdk17_hotspot_aix_ppc64/bin/javac  LibTest_2.java
-bash-5.0$ ../jdk17_hotspot_aix_ppc64/bin/java  LibTest_2
Exception in thread "main" java.lang.UnsatisfiedLinkError: /usr/ccs/lib/libc.a: /usr/ccs/lib/libc.a,
LIBPATH=/home/jenkins/jchau_ffi/jdk17_hotspot_aix_ppc64/lib/server:/home/jenkins/jchau_ffi/jdk17_hotspot_aix_ppc64/lib:/home/jenkins/jchau_ffi/jdk17_hotspot_aix_ppc64/../lib,
LD_LIBRARY_PATH= :  0509-022 Cannot load module /usr/ccs/lib/libc.a.
        0509-026 System error: Cannot run a file that does not have a valid format.
        at java.base/jdk.internal.loader.NativeLibraries.load(Native Method)
        at java.base/jdk.internal.loader.NativeLibraries$NativeLibraryImpl.open(NativeLibraries.java:388)
        at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:232)
        at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:174)
        at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2394)
        at java.base/java.lang.Runtime.load0(Runtime.java:755)
        at java.base/java.lang.System.load(System.java:1957)
        at LibTest_2.<clinit>(LibTest_2.java:3)

JasonFengJ9 commented 1 month ago

Exception in thread "main" java.lang.UnsatisfiedLinkError: 0509-026 System error: Cannot run a file that does not have a valid format.

It could be an incompatible library, as in https://www.ibm.com/support/pages/error-0509-022-cannot-load-module-librclibmratlso

com.urbancode.ds.subsys.deploy.license.LicenseService - Unable to connect to RCL server: null
2014-09-26 11:44:05,806 ERROR main com.urbancode.ds.rcl.RCLManager - Error connecting to RCL server
com.ibm.rcl.ibmratl.LicenseConfigurationException: java.lang.UnsatisfiedLinkError: rcl_ibmratl (0509-022 Cannot load module
/<UCD Server PATH>/lib/rcl/aix/64/librcl_ibmratl.so. 0509-026 System error: Cannot run a file that does not have a valid format.)
null
Cause
The issue can occur with one of these scenarios:
1. You have a 32 bit environment, but the 64 bit librcl_ibmratl.so is set in the java.library.path
2. You have a 64 bit AIX environment, but you use 32 bit JVM.

JasonFengJ9 commented 1 month ago

Trace point omrport.469-480 might indicate where the failure occurred. Otherwise printf() can be added around dlopen() within omr/port/aix/omrsl.c:omrsl_open_shared_library().

ChengJin01 commented 1 month ago

Traces output:

+38220  02:52:04.394205790 *0x0000000030010700 j9scar.113          Entry      >JVM_LoadLibrary(name=/usr/ccs/lib/libc.a)
+38221  02:52:04.394208210  0x0000000030010700 omrport.475         Entry      >omrsl_open_shared_library name=/usr/ccs/lib/libc.a, flags=0x2
+38222  02:52:04.394209058  0x0000000030010700 omrport.476         Event       omrsl_open_shared_library using mangledName /usr/ccs/lib/libc.a
+38235  02:52:04.396876841  0x0000000030010700 omrport.480         Event       omrsl_open_shared_library OS error message:       0509-022 Cannot load module /usr/ccs/lib/libc.a.
+38236          0509-026 System error: Cannot run a file that does not have a valid format.

which doesn't include any meaningful information about the library loading.

JasonFengJ9 commented 1 month ago

It seems dlopen() failed with error message 0509-026 System error: Cannot run a file that does not have a valid format.

What's the test output for OpenJ9 Java 11, and IBM Java 8 64/32 bit? Note: this issue was for JDK 17+, no behaviour change for JDK 8/11.

ChengJin01 commented 1 month ago

What's the test output for OpenJ9 Java 11, and IBM Java 8 64/32 bit?

$ jdk8_openj9_aix_ppc64/bin/java -version
openjdk version "1.8.0_422-internal"
OpenJDK Runtime Environment (build 1.8.0_422-internal-jenkins_2024_07_20_23_48-b00)
Eclipse OpenJ9 VM (build v0.46.0-release-6c99fa94be9, JRE 1.8.0 AIX ppc64-64-Bit Compressed References 20240721_116 (JIT enabled, AOT enabled)
OpenJ9   - 6c99fa94be9
OMR      - 840a9adba45
JCL      - a75ff73ce58 based on jdk8u422-b05)

-bash-5.0$ ../jdk8_openj9_aix_ppc64/bin/javac  LibTest_2.java
-bash-5.0$ ../jdk8_openj9_aix_ppc64/bin/java  LibTest_2
Exception in thread "main" java.lang.UnsatisfiedLinkError: /usr/lib/libc.a (    0509-022 Cannot load module /usr/lib/libc.a.
        0509-103   The module has an invalid magic number.)
        at java.lang.ClassLoader.loadLibraryWithPath(ClassLoader.java:1473)
        at java.lang.System.load(System.java:592)
        at LibTest_2.<clinit>(LibTest_2.java:3)

$jdk11_openj9_aix_ppc64/bin/java -version
openjdk version "11.0.24-internal" 2024-07-16
OpenJDK Runtime Environment (build 11.0.24-internal+0-adhoc.jenkins.BuildJDK11ppc64aixNightly)
Eclipse OpenJ9 VM (build master-487b005e140, JRE 11 AIX ppc64-64-Bit Compressed References 20240726_844 (JIT enabled, AOT enabled)
OpenJ9   - 487b005e140
OMR      - d18121d17c5
JCL      - a2937c555a6 based on jdk-11.0.24+8)

-bash-5.0$ ../jdk11_openj9_aix_ppc64/bin/javac  LibTest_2.java
-bash-5.0$ ../jdk11_openj9_aix_ppc64/bin/java  LibTest_2
Exception in thread "main" java.lang.UnsatisfiedLinkError: /usr/lib/libc.a (    0509-022 Cannot load module /usr/lib/libc.a.
        0509-103   The module has an invalid magic number.)
        at java.base/java.lang.ClassLoader.loadLibraryWithPath(ClassLoader.java:1733)
        at java.base/java.lang.System.load(System.java:702)
        at LibTest_2.<clinit>(LibTest_2.java:3)

Based on the result on JDK8/11, it seems JDK8+ never successfully loads a .a file.

ChengJin01 commented 1 month ago

Looking at https://download.boulder.ibm.com/ibmdl/pub/software/dw/aix/es-aix_ll.pdf

Archive Files are composite objects, which usually contain object files. On AIX, archives can
contain shared objects, import files, or other kinds of members. By convention, the name of an
archive usually ends with .a, but the magic number of a file (that is, the first few bytes of the file)
is used to determine whether a file is an archive or not.

which explains why JDK8/11 failed to load the .a file (the loader doesn't recognize the magic number of the .a file).
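
To illustrate the magic-number point, here is a minimal Java sketch that classifies a file from its first bytes. The magic values (`<bigaf>\n`, `<aiaff>\n`, 0x01F7, 0x01DF) are assumptions based on the AIX ar and XCOFF file-format documentation, and the class name `AixMagic` is hypothetical; nothing here is taken from OpenJ9 or OMR.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical classifier; the magic values are assumptions taken from the AIX
// ar and XCOFF file-format documentation, not from OpenJ9 or OMR code.
final class AixMagic {
    static String classify(byte[] head) {
        if (head.length >= 8) {
            // Archives start with an 8-byte ASCII magic string.
            String magic = new String(head, 0, 8, StandardCharsets.US_ASCII);
            if (magic.equals("<bigaf>\n")) return "big-format archive";
            if (magic.equals("<aiaff>\n")) return "small-format archive";
        }
        if (head.length >= 2) {
            // XCOFF objects start with a 2-byte big-endian magic number.
            int xcoff = ((head[0] & 0xFF) << 8) | (head[1] & 0xFF);
            if (xcoff == 0x01F7) return "64-bit XCOFF object";
            if (xcoff == 0x01DF) return "32-bit XCOFF object";
        }
        return "unknown";
    }
}
```

Reading the first eight bytes of a library (for example with `InputStream.readNBytes(8)`) and passing them to `classify` would show whether a given .a file is a real archive or a single shared object that merely carries the .a suffix.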

Looking at the following explanation of the .a file:

In AIX, a shared object can be an archive member of a shared library. Conversely, a shared
library on AIX can be a single shared object module, or an archive of one or more objects, of
which, at least one is shared. By industry convention though, any file ending in .a is usually an
archive, while a file ending in .so is a shared object. Typically, system libraries in AIX are
archives consisting of some shared and some non-shared objects. AIX accepts both .a and .so as
shared library suffixes, though precedence is given to .so over .a libraries.

Note that dlopen() on AIX also allows loading of archive members of shared libraries via
the RTLD_MEMBER flag

In theory, dlopen should be able to load a .a file, given that it is a kind of shared library, but simply specifying RTLD_LAZY isn't enough to get this working. Instead, we might need to set RTLD_MEMBER in the dlopen flags to load the archive file, as per https://www.ibm.com/docs/en/aix/7.1?topic=d-dlopen-subroutine:

RTLD_MEMBER The dlopen subroutine can be used to load a module that is a member of an archive.
The L_LOADMEMBER flag is used when the load subroutine is called.
The module name FilePath names the archive and archive member according to the rules outlined in the load subroutine.

JasonFengJ9 commented 1 month ago

Based on the result on JDK8/11, it seems JDK8+ never successfully loads a .a file.

No, this is incorrect. An internal problem report item (TLTR workitem 143601 comment9) has a testcase loading libsystemInfo.a successfully.

ChengJin01 commented 1 month ago

I double-checked libsystemInfo.a from the internal item on JDK17/22, and it loads fine.

libsystemInfo.a:
                        ***Loader Section***
                        ***Loader Symbol Table Information***
[Index]      Value      Scn     IMEX Sclass   Type           IMPid Name
[0]     0x00000000    undef      IMP     DS EXTref libc.a(shr_64.o) rand
[1]     0x110000350    .data     EXP     DS SECdef        [noIMid] Java_com_sun_management_mbeans_loading_SystemInfoUseNativeLib_getRandom

That means libsystemInfo.a is simply a single shared object module, while libc.a or libclang.a is an archive of one or more objects of which at least one is shared, as explained above at https://download.boulder.ibm.com/ibmdl/pub/software/dw/aix/es-aix_ll.pdf. If so, the existing code with dlopen might have a problem loading this type of archive file as a shared library. That is, it only supports the simple kind of .a file (pretty much like a .so, which contains a single shared object) but fails to support a real archive file (containing many objects, a mix of shared and non-shared).

In addition, I tried adding RTLD_MEMBER to the flags of dlopen but still ended up with the same failure (it was probably not added in the right place).
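
For reference, archive members appear in this thread in the form archive(member), for example libc.a(shr_64.o) in the loader symbol table and libclang.a(libclang.so.16) in the earlier suggestion. The sketch below composes and splits such names in Java. It only illustrates the naming convention; whether OpenJ9 or OMR passes a string of this form to dlopen is not established here, and the class and method names are hypothetical.

```java
// Hypothetical helper illustrating the archive(member) naming convention only.
final class ArchiveMemberName {
    // Builds "archive(member)", e.g. "/usr/lib/libc.a(shr_64.o)".
    static String compose(String archivePath, String member) {
        return archivePath + "(" + member + ")";
    }

    // Returns {archive, member}; member is null when the name has no "(member)" part.
    static String[] split(String name) {
        int open = name.lastIndexOf('(');
        if (open > 0 && name.endsWith(")")) {
            return new String[] {
                name.substring(0, open),
                name.substring(open + 1, name.length() - 1)
            };
        }
        return new String[] { name, null };
    }
}
```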

ChengJin01 commented 1 month ago

As discussed with @JasonFengJ9 offline, it turns out that the library loading problem in the jextract build is entirely different from what needs to be addressed in this issue. So I agree to close this issue, and a new issue will be created for what we encountered on AIX.

keithc-ca commented 1 month ago

My understanding is that on AIX, the subject of a dlopen call is expected to be a shared library or an archive of shareable objects, not a single object file.

ChengJin01 commented 1 month ago

My understanding is that on AIX, the subject of a dlopen call is expected to be a shared library or an archive of shareable objects, not a single object file.

The truth is that the existing code in OpenJ9/OMR dealing with dlopen fails to load such an archive of shareable objects (like libc.a) unless there is some issue with that code.