quarkusio / quarkus

Quarkus: Supersonic Subatomic Java.
https://quarkus.io
Apache License 2.0

Some class loaders are not ARM compatible #43415

Open rcjverhoef opened 2 months ago

rcjverhoef commented 2 months ago

Describe the bug

I run on Apple silicon with a Kotlin code base. I get an exception that comes down to: fat file, but missing compatible architecture (have 'i386,x86_64', need 'arm64e' or 'arm64').

The JVM I run is ARM based. The freaky thing: I call a library A that calls another library B, and that call breaks with the above error. However, if I make the identical call to library B from my own code, it does not break.

Expected behavior

On Apple silicon, everything should work the same regardless of which class loader is used.

Actual behavior

When running without the @QuarkusTest annotation:

Roccos-Macbook
org.apache.ranger.admin.client.RangerAdminRESTClient@2ceb80a1

When running with the @QuarkusTest annotation:

/Users/roccoverhoef/Library/Caches/JNA/temp/jna15787217153353432139.tmp: dlopen(/Users/roccoverhoef/Library/Caches/JNA/temp/jna15787217153353432139.tmp, 0x0001): tried: '/Users/roccoverhoef/Library/Caches/JNA/temp/jna15787217153353432139.tmp' (fat file, but missing compatible architecture (have 'i386,x86_64', need 'arm64e' or 'arm64')), '/System/Volumes/Preboot/Cryptexes/OS/Users/roccoverhoef/Library/Caches/JNA/temp/jna15787217153353432139.tmp' (no such file), '/Users/roccoverhoef/Library/Caches/JNA/temp/jna15787217153353432139.tmp' (fat file, but missing compatible architecture (have 'i386,x86_64', need 'arm64e' or 'arm64'))
java.lang.UnsatisfiedLinkError: /Users/roccoverhoef/Library/Caches/JNA/temp/jna15787217153353432139.tmp: dlopen(/Users/roccoverhoef/Library/Caches/JNA/temp/jna15787217153353432139.tmp, 0x0001): tried: '/Users/roccoverhoef/Library/Caches/JNA/temp/jna15787217153353432139.tmp' (fat file, but missing compatible architecture (have 'i386,x86_64', need 'arm64e' or 'arm64')), '/System/Volumes/Preboot/Cryptexes/OS/Users/roccoverhoef/Library/Caches/JNA/temp/jna15787217153353432139.tmp' (no such file), '/Users/roccoverhoef/Library/Caches/JNA/temp/jna15787217153353432139.tmp' (fat file, but missing compatible architecture (have 'i386,x86_64', need 'arm64e' or 'arm64'))
    at java.base/jdk.internal.loader.NativeLibraries.load(Native Method)
    at java.base/jdk.internal.loader.NativeLibraries$NativeLibraryImpl.open(NativeLibraries.java:331)
    at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:197)
    at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:139)
    at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2418)
    at java.base/java.lang.Runtime.load0(Runtime.java:852)
    at java.base/java.lang.System.load(System.java:2025)
    at com.sun.jna.Native.loadNativeDispatchLibraryFromClasspath(Native.java:1018)
    at com.sun.jna.Native.loadNativeDispatchLibrary(Native.java:988)
    at com.sun.jna.Native.<clinit>(Native.java:195)
    at com.kstruct.gethostname4j.Hostname$UnixCLibrary.<clinit>(Hostname.java:12)
    at com.kstruct.gethostname4j.Hostname.getHostname(Hostname.java:30)
    at HostnameTest.testHostname(HostnameTest.kt:11)
    at java.base/java.lang.reflect.Method.invoke(Method.java:580)
    at io.quarkus.test.junit.QuarkusTestExtension.runExtensionMethod(QuarkusTestExtension.java:1018)
    at io.quarkus.test.junit.QuarkusTestExtension.interceptTestMethod(QuarkusTestExtension.java:832)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)

How to Reproduce?

This might be a bit cumbersome.

The problem is with Apache Ranger. For some reason it calls the kstruct gethostname4j library to get a hostname.

I have the test case below. It breaks on the call to RangerAdminRESTClient(), which makes the same library call as my code does one line before. If I remove @QuarkusTest, everything works. This looks like the same behavior I see when running my app.

import com.kstruct.gethostname4j.Hostname
import io.quarkus.test.junit.QuarkusTest
import org.apache.ranger.admin.client.RangerAdminRESTClient
import org.junit.jupiter.api.Test

@QuarkusTest
class TestKsctruc {
    @Test
    fun test() {
        val x = Hostname.getHostname()
        println(x)
        val client = RangerAdminRESTClient()
    }
}

Output of uname -a or ver

23.6.0 Darwin Kernel Version 23.6.0: Mon Jul 29 21:14:30 PDT 2024; root:xnu-10063.141.2~1/RELEASE_ARM64_T6030 arm64

Output of java -version

openjdk version "23" 2024-09-17
OpenJDK Runtime Environment Homebrew (build 23)
OpenJDK 64-Bit Server VM Homebrew (build 23, mixed mode, sharing)

Quarkus version or git rev

3.10.2

Build tool (ie. output of mvnw --version or gradlew --version)

------------------------------------------------------------
Gradle 8.7
------------------------------------------------------------

Build time:   2024-03-22 15:52:46 UTC
Revision:     650af14d7653aa949fce5e886e685efc9cf97c10

Kotlin:       1.9.22
Groovy:       3.0.17
Ant:          Apache Ant(TM) version 1.10.13 compiled on January 4 2023
JVM:          23 (Homebrew 23)
OS:           Mac OS X 14.6.1 aarch64

Additional information

No response

quarkus-bot[bot] commented 2 months ago

/cc @geoand (kotlin)

geoand commented 2 months ago

Any chance you can attach a sample project that exhibits the problem?

Thanks

rcjverhoef commented 2 months ago

Here is a quick-and-dirty repo I cobbled together: https://github.com/rcjverhoef/quarkus-class-loading-issue/tree/main/demo

It contains just one test case. When the @QuarkusTest annotation is removed, the test passes. When you add it back, it fails on the first call to the library on my machine.

This is different from the behavior I have in our bigger repo btw. There, the direct call to the library in the test works, but the second call via the RangerRESTClientUtils instantiation is the one causing the problems. I can see debugging through it that there are 2 classloaders in play there. Can't give you that repo as an example sadly since it contains company secret sauce as well.

I have tried forcing an overridden class I made myself, which works from all my classes, but that doesn't seem to get picked up for any 3rd party library I included (in that case, I can see it is not in the classloader that is used for those libraries).

My current workaround is to repackage the offending jar locally with a file replaced and use that jar instead.

geoand commented 2 months ago

Very weird indeed.

Unfortunately I can't reproduce the issue since I don't have an ARM machine. Can you perhaps enhance the description with the entire stacktrace?

rcjverhoef commented 2 months ago

@geoand updated the Actual Behavior with stacktrace as requested.

To be fair, I am assuming this is an apple silicon issue based on the error message and what I found on Google/ChatGPT.

geoand commented 2 months ago

@dmlloyd any ideas on what could be going on here?

dmlloyd commented 2 months ago

Is there a container involved? Or, could we be detecting and using an x64 JDK for some reason when @QuarkusTest is used?

This isn't likely to be classloader related; more likely we've confused JNA somehow.

geoand commented 2 months ago

> Or, could we be detecting and using an x64 JDK for some reason when @QuarkusTest is used?

I don't see how, but then again I don't have an ARM machine, so I can't say for certain.

dmlloyd commented 2 months ago

I believe that the string have 'i386,x86_64' shows that the executable (java) is an x64 executable.

Try running this command: /usr/libexec/java_home -V and see what JDKs are listed. Perhaps you have an x64 one in there that is getting used sometimes.

rcjverhoef commented 1 month ago

I have a range of them installed. Here is the output from your requested command:

➜  ~ /usr/libexec/java_home -V
Matching Java Virtual Machines (4):
    21.0.4 (arm64) "Amazon.com Inc." - "Amazon Corretto 21" /Users/roccoverhoef/Library/Java/JavaVirtualMachines/corretto-21.0.4/Contents/Home
    11.0.24 (arm64) "Amazon.com Inc." - "Amazon Corretto 11" /Users/roccoverhoef/Library/Java/JavaVirtualMachines/corretto-11.0.24/Contents/Home
    1.8.0_422 (arm64) "Amazon" - "Amazon Corretto 8" /Users/roccoverhoef/Library/Java/JavaVirtualMachines/corretto-1.8.0_422/Contents/Home
    1.8.0_292 (x86_64) "AdoptOpenJDK" - "AdoptOpenJDK 8" /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home
/Users/roccoverhoef/Library/Java/JavaVirtualMachines/corretto-21.0.4/Contents/Home

If I list the ones I manage via jenv:

➜  ~ jenv versions | awk '{print $1}' | while read version; do echo "$version - $(jenv prefix $version)"; done

system - /opt/homebrew
1.8 - /Users/roccoverhoef/.jenv/versions/1.8
1.8.0.292 - /Users/roccoverhoef/.jenv/versions/1.8.0.292
11 - /Users/roccoverhoef/.jenv/versions/11
11.0 - /Users/roccoverhoef/.jenv/versions/11.0
11.0.24 - /Users/roccoverhoef/.jenv/versions/11.0.24
jenv: version `*' not installed
* - 
21.0 - /Users/roccoverhoef/.jenv/versions/21.0
21.0.4 - /Users/roccoverhoef/.jenv/versions/21.0.4
23 - /Users/roccoverhoef/.jenv/versions/23
openjdk64-1.8.0.292 - /Users/roccoverhoef/.jenv/versions/openjdk64-1.8.0.292
openjdk64-11.0.24 - /Users/roccoverhoef/.jenv/versions/openjdk64-11.0.24
openjdk64-21.0.4 - /Users/roccoverhoef/.jenv/versions/openjdk64-21.0.4
openjdk64-23 - /Users/roccoverhoef/.jenv/versions/openjdk64-23

It is still very weird to me that the same test fails or passes depending on whether I add the @QuarkusTest annotation.

gsmet commented 1 month ago

@rcjverhoef could you log Runtime.version() and see how it goes?

dmlloyd commented 1 month ago

Another useful test could be to check the output of io.smallrye.common.cpu.CPU#host().

rcjverhoef commented 1 month ago

@gsmet, @dmlloyd

Printing Runtime.version() and io.smallrye.common.cpu.CPU.host() gives me the same results with or without the @QuarkusTest annotation:

21.0.4+7-LTS
aarch64

Edit: I also added printing of the class loader. Here I do get a difference depending on whether @QuarkusTest is used.

With @QuarkusTest, which fails with fat file, but missing compatible architecture (have 'i386,x86_64', need 'arm64e' or 'arm64'):

21.0.4+7-LTS
aarch64
QuarkusClassLoader:Quarkus Runtime ClassLoader: TEST for HostnameTest (QuarkusTest) restart no:0@57cb70be

Without @QuarkusTest, which works:

21.0.4+7-LTS
aarch64
jdk.internal.loader.ClassLoaders$AppClassLoader@1cf4f579

My test case now looks like:

import com.kstruct.gethostname4j.Hostname
import io.quarkus.test.junit.QuarkusTest
import org.apache.ranger.admin.client.RangerAdminRESTClient
import org.junit.jupiter.api.Test

@QuarkusTest
class HostnameTest {
    @Test
    fun testHostname() {
        println(Runtime.version())
        println(io.smallrye.common.cpu.CPU.host())
        println(this::class.java.classLoader)
        val x = Hostname.getHostname()
        println(x)
        val client = RangerAdminRESTClient()
        println(client.toString())
    }
}

I pushed to my sample repo mentioned above so you can see what I'm doing.
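
Since the only visible difference at this point is the class loader, a next step could be to print the whole loader chain and the jar a suspect class was actually loaded from. The following is a minimal sketch using only JDK APIs; the helper names loaderChain and codeSourceOf are hypothetical, and String is used as a stand-in so the snippet runs without JNA on the classpath.

```kotlin
import java.net.URL

// Walks the parent chain of a class loader, starting with the loader itself.
// Returns an empty list when the starting loader is null (bootstrap).
fun loaderChain(start: ClassLoader?): List<ClassLoader> =
    generateSequence(start) { it.parent }.toList()

// Returns the jar or directory a class was loaded from, or null when the
// JVM reports no code source (typically for bootstrap classes such as String).
fun codeSourceOf(type: Class<*>): URL? =
    type.protectionDomain.codeSource?.location

fun main() {
    loaderChain(Thread.currentThread().contextClassLoader)
        .forEach { println("loader: $it") }
    // Stand-in class; in the failing test one would pass com.sun.jna.Native::class.java instead.
    println("String loaded from: ${codeSourceOf(String::class.java)}")
}
```

Run inside the test method with and without @QuarkusTest, this would show whether the same class resolves to a different jar under each class loader.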

dmlloyd commented 1 month ago

Could you try tuning the log level of the com.sun.jna category to TRACE?

rcjverhoef commented 1 month ago

@dmlloyd: that's a good suggestion, I should have thought of that. The difference is that without the annotation JNA looks for and finds /com/sun/jna/darwin-aarch64/libjnidispatch.jnilib, while with the Quarkus class loader it looks for /com/sun/jna/darwin/libjnidispatch.jnilib and then fails to load what it finds.

I had some issues getting it working and changed the code a bit too.

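import com.sun.jna.Library
import com.sun.jna.Native
import io.quarkus.test.junit.QuarkusTest
import org.junit.jupiter.api.Test

// Minimal JNA binding to libc, used only to trigger native library loading.
interface CLibrary : Library {
    fun printf(format: String, vararg args: Any): Int
}

//@QuarkusTest
class HostnameTest {
    @Test
    fun testHostname() {
        // Make JNA log where it looks for libjnidispatch and for the requested library.
        System.setProperty("jna.debug_load", "true")
        System.setProperty("jna.debug_load.jna", "true")
        val libc = Native.load("c", CLibrary::class.java)
    }
}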

With @QuarkusTest commented out, I can see it look for and find the darwin-aarch64 library:

2024-10-09 23:58:11,779 INFO  [com.sun.jna.Native] (Test worker) Looking in classpath from jdk.internal.loader.ClassLoaders$AppClassLoader@1cf4f579 for /com/sun/jna/darwin-aarch64/libjnidispatch.jnilib
2024-10-09 23:58:11,783 INFO  [com.sun.jna.Native] (Test worker) Found library resource at jar:file:/Users/roccoverhoef/.gradle/caches/modules-2/files-2.1/net.java.dev.jna/jna/5.8.0/3551d8d827e54858214107541d3aff9c615cb615/jna-5.8.0.jar!/com/sun/jna/darwin-aarch64/libjnidispatch.jnilib
2024-10-09 23:58:11,783 INFO  [com.sun.jna.Native] (Test worker) Extracting library to /Users/roccoverhoef/Library/Caches/JNA/temp/jna416912632756075848.tmp
2024-10-09 23:58:11,786 INFO  [com.sun.jna.Native] (Test worker) Trying /Users/roccoverhoef/Library/Caches/JNA/temp/jna416912632756075848.tmp
2024-10-09 23:58:11,952 INFO  [com.sun.jna.Native] (Test worker) Found jnidispatch at /Users/roccoverhoef/Library/Caches/JNA/temp/jna416912632756075848.tmp
2024-10-09 23:58:11,958 INFO  [com.sun.jna.NativeLibrary] (Test worker) Looking for library 'c'
2024-10-09 23:58:11,959 INFO  [com.sun.jna.NativeLibrary] (Test worker) Adding paths from jna.library.path: null
2024-10-09 23:58:11,959 INFO  [com.sun.jna.NativeLibrary] (Test worker) Trying libc.dylib
2024-10-09 23:58:11,959 INFO  [com.sun.jna.NativeLibrary] (Test worker) Found library 'c' at libc.dylib

But with the @QuarkusTest annotation turned on, I get the output below before it blows up:

2024-10-09 23:57:49,470 INFO  [com.sun.jna.Native] (Test worker) Looking in classpath from QuarkusClassLoader:Quarkus Base Runtime ClassLoader: TEST for HostnameTest (QuarkusTest)@40cb698e for /com/sun/jna/darwin/libjnidispatch.jnilib
2024-10-09 23:57:49,471 INFO  [com.sun.jna.Native] (Test worker) Found library resource at jar:file:///Users/roccoverhoef/.gradle/caches/modules-2/files-2.1/org.elasticsearch/jna/5.5.0/ade077cbb2618a18bfc6c335413b2b7163d97601/jna-5.5.0.jar!/com/sun/jna/darwin/libjnidispatch.jnilib
2024-10-09 23:57:49,472 INFO  [com.sun.jna.Native] (Test worker) Extracting library to /Users/roccoverhoef/Library/Caches/JNA/temp/jna8638381557265410909.tmp
2024-10-09 23:57:49,473 INFO  [com.sun.jna.Native] (Test worker) Trying /Users/roccoverhoef/Library/Caches/JNA/temp/jna8638381557265410909.tmp

dmlloyd commented 1 month ago

Aha, two different versions of JNA on the class path: one in org.elasticsearch and one in net.java.dev.jna. The classes are shaded, however the native library has the same name, and the first one wins. Not good.
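
To check for this kind of duplication, one option is to ask the class loader for every copy of a resource rather than just the first. This is a minimal sketch under assumptions: resourceCopies is a hypothetical helper, and the resource names are taken from the log lines above (without the leading slash, as ClassLoader lookups expect).

```kotlin
import java.net.URL

// Lists every copy of a resource visible to the given class loader,
// in the order the loader reports them.
fun resourceCopies(loader: ClassLoader, resource: String): List<URL> =
    loader.getResources(resource).toList()

fun main() {
    val loader = Thread.currentThread().contextClassLoader
        ?: ClassLoader.getSystemClassLoader()
    // Names copied from the JNA debug output in this thread.
    val names = listOf(
        "com/sun/jna/darwin/libjnidispatch.jnilib",
        "com/sun/jna/darwin-aarch64/libjnidispatch.jnilib",
        "com/sun/jna/Native.class",
    )
    for (name in names) {
        val copies = resourceCopies(loader, name)
        println("$name: ${copies.size} copies")
        copies.forEach { println("  $it") }
    }
}
```

More than one URL for the same name would point to overlapping jars. If a stale copy is the one being picked up, excluding or aligning that dependency in the build is one possible remedy, though that is an assumption and not something confirmed in this thread.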