Homebrew / homebrew-core

🍻 Default formulae for the missing package manager for macOS (or Linux)
https://brew.sh
BSD 2-Clause "Simplified" License

openjdk: dlopen/rpath issues when loading an internal openjdk .so after loading an external .so #111068

Closed deejgregor closed 1 year ago

deejgregor commented 1 year ago

brew gist-logs <formula> link OR brew config AND brew doctor output

brew config

$ brew config
HOMEBREW_VERSION: 3.6.1-72-gdf9f878
ORIGIN: https://github.com/Homebrew/brew
HEAD: df9f8786cac3c4755c186ac5bbe63e2198862c20
Last commit: 2 hours ago
Core tap ORIGIN: https://github.com/Homebrew/homebrew-core
Core tap HEAD: 934055f356a95bb0b9221a9f5a8bb45dc8e476a3
Core tap last commit: 44 minutes ago
Core tap branch: master
HOMEBREW_PREFIX: /opt/homebrew
HOMEBREW_CASK_OPTS: []
HOMEBREW_EDITOR: vi
HOMEBREW_MAKE_JOBS: 10
Homebrew Ruby: 2.6.8 => /System/Library/Frameworks/Ruby.framework/Versions/2.6/usr/bin/ruby
CPU: 10-core 64-bit arm_firestorm_icestorm
Clang: 14.0.0 build 1400
Git: 2.37.0 => /Library/Developer/CommandLineTools/usr/bin/git
Curl: 7.79.1 => /usr/bin/curl
macOS: 12.6-arm64
CLT: 14.0.0.0.1.1661618636
Xcode: N/A
Rosetta 2: false
$

brew doctor

$ brew doctor
Your system is ready to brew.
$  

Verification

What were you trying to do (and why)?

Use the pyroscope java agent with opennms for continuous profiling to debug application issues. This loads the async-profiler and its .so to get profiling information out of the JVM. I have the same problem when I try to load async-profiler directly. I'm doing this on an M1 Mac using Homebrew OpenJDK.

I have a simple reproducer, ManagementTest, and all of the details I share below come from it.

If I load either the pyroscope agent or the async-profiler into Homebrew-built OpenJDK 11.0.16.1 or 18.0.2.1, I get this failure:

Exception in thread "main" java.lang.UnsatisfiedLinkError: no management in java.library.path: [/Users/dgregor/Library/Java/Extensions, /Library/Java/Extensions, /Network/Library/Java/Extensions, /System/Library/Java/Extensions, /usr/lib/java, .]
    at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2673)
    at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:830)
    at java.base/java.lang.System.loadLibrary(System.java:1873)
    at java.management/java.lang.management.ManagementFactory.lambda$static$8(ManagementFactory.java:1020)
    at java.base/java.security.AccessController.doPrivileged(Native Method)
    at java.management/java.lang.management.ManagementFactory.<clinit>(ManagementFactory.java:1019)
    at ManagementTest.main(ManagementTest.java:6)

What happened (include all command output)?

(Note: I had to press Ctrl-C to terminate the JVM when loading the pyroscope agent.)

$ /opt/homebrew/opt/openjdk@11/bin/java -agentpath:/Users/dgregor/Downloads/async-profiler-2.8.3-macos/build/libasyncProfiler.so=start,event=cpu,file=profile.html -cp . ManagementTest
Profiling started
Exception in thread "main" java.lang.UnsatisfiedLinkError: no management in java.library.path: [/Users/dgregor/Library/Java/Extensions, /Library/Java/Extensions, /Network/Library/Java/Extensions, /System/Library/Java/Extensions, /usr/lib/java, .]
    at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2673)
    at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:830)
    at java.base/java.lang.System.loadLibrary(System.java:1873)
    at java.management/java.lang.management.ManagementFactory.lambda$static$8(ManagementFactory.java:1020)
    at java.base/java.security.AccessController.doPrivileged(Native Method)
    at java.management/java.lang.management.ManagementFactory.<clinit>(ManagementFactory.java:1019)
    at ManagementTest.main(ManagementTest.java:6)
$ /opt/homebrew/opt/openjdk@18/bin/java -agentpath:/Users/dgregor/Downloads/async-profiler-2.8.3-macos/build/libasyncProfiler.so=start,event=cpu,file=profile.html -cp . ManagementTest
Profiling started
Exception in thread "main" java.lang.UnsatisfiedLinkError: no management in system library path: /opt/homebrew/Cellar/openjdk/18.0.2.1/libexec/openjdk.jdk/Contents/Home/lib
    at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2408)
    at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:848)
    at java.base/java.lang.System.loadLibrary(System.java:2015)
    at java.management/java.lang.management.ManagementFactory.lambda$loadNativeLib$8(ManagementFactory.java:1025)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:318)
    at java.management/java.lang.management.ManagementFactory.loadNativeLib(ManagementFactory.java:1024)
    at java.management/java.lang.management.ManagementFactory.<clinit>(ManagementFactory.java:1019)
    at ManagementTest.main(ManagementTest.java:6)
$ /opt/homebrew/opt/openjdk@11/bin/java -javaagent:/Users/dgregor/Downloads/pyroscope.jar -cp . ManagementTest
2022-09-18 16:42:07.026 [INFO] We recommend specifying application name via env variable PYROSCOPE_APPLICATION_NAME
2022-09-18 16:42:07.034 [INFO] For now we chose the name for you and it's javaspy.VeRo3kppTE6W-aS19qBBug
2022-09-18 16:42:07.034 [WARN] PYROSCOPE_SERVER_ADDRESS is not defined, using http://localhost:4040
2022-09-18 16:42:07.707 [INFO] Profiling started
Exception in thread "main" java.lang.UnsatisfiedLinkError: no management in java.library.path: [/Users/dgregor/Library/Java/Extensions, /Library/Java/Extensions, /Network/Library/Java/Extensions, /System/Library/Java/Extensions, /usr/lib/java, .]
    at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2673)
    at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:830)
    at java.base/java.lang.System.loadLibrary(System.java:1873)
    at java.management/java.lang.management.ManagementFactory.lambda$static$8(ManagementFactory.java:1020)
    at java.base/java.security.AccessController.doPrivileged(Native Method)
    at java.management/java.lang.management.ManagementFactory.<clinit>(ManagementFactory.java:1019)
    at ManagementTest.main(ManagementTest.java:6)
^CProfiling stopped after 0 seconds. No dump options specified
$ /opt/homebrew/opt/openjdk@18/bin/java -javaagent:/Users/dgregor/Downloads/pyroscope.jar -cp . ManagementTest
2022-09-18 16:42:13.893 [INFO] We recommend specifying application name via env variable PYROSCOPE_APPLICATION_NAME
2022-09-18 16:42:13.911 [INFO] For now we chose the name for you and it's javaspy.GYdlDc5eQD-4rP8oy7ZY8g
2022-09-18 16:42:13.911 [WARN] PYROSCOPE_SERVER_ADDRESS is not defined, using http://localhost:4040
2022-09-18 16:42:14.227 [INFO] Profiling started
Exception in thread "main" java.lang.UnsatisfiedLinkError: no management in system library path: /opt/homebrew/Cellar/openjdk/18.0.2.1/libexec/openjdk.jdk/Contents/Home/lib
    at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2408)
    at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:848)
    at java.base/java.lang.System.loadLibrary(System.java:2015)
    at java.management/java.lang.management.ManagementFactory.lambda$loadNativeLib$8(ManagementFactory.java:1025)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:318)
    at java.management/java.lang.management.ManagementFactory.loadNativeLib(ManagementFactory.java:1024)
    at java.management/java.lang.management.ManagementFactory.<clinit>(ManagementFactory.java:1019)
    at ManagementTest.main(ManagementTest.java:6)
^CProfiling stopped after 4 seconds. No dump options specified
$

If you run with DYLD_PRINT_APIS=1 (see this page for details), you can see the underlying error: libmanagement.dylib fails to load libjvm.dylib with Library not loaded: '@rpath/libjvm.dylib'. Note that libjvm.dylib is in lib/server, not directly in lib:

1969:dyld[1473]:       dlopen(libmanagement.dylib) => NULL, 'dlopen(/opt/homebrew/Cellar/openjdk@11/11.0.16.1/libexec/openjdk.jdk/Contents/Home/lib/libmanagement.dylib, 0x0001): Library not loaded: '@rpath/libjvm.dylib'

Details:

$ DYLD_PRINT_APIS=1 /opt/homebrew/opt/openjdk@11/bin/java -agentpath:/Users/dgregor/Downloads/async-profiler-2.8.3-macos/build/libasyncProfiler.so=start,event=cpu,file=profile.html -cp . ManagementTest 2>&1 | egrep -n 'dlopen.*(libasyncProfiler|libmanagement)'
143:dyld[1473]: dlopen("/Users/dgregor/Downloads/async-profiler-2.8.3-macos/build/libasyncProfiler.so", 0x00000001)
146:dyld[1473]:       dlopen(libasyncProfiler.so) => 0x2089f7360
150:dyld[1473]: dlopen("/Users/dgregor/Downloads/async-profiler-2.8.3-macos/build/libasyncProfiler.so", 0x00000081)
151:dyld[1473]:       dlopen(libasyncProfiler.so) => 0x2089f7360
1439:dyld[1473]: dlopen("/Users/dgregor/Downloads/async-profiler-2.8.3-macos/build/libasyncProfiler.so", 0x00000011)
1440:dyld[1473]:       dlopen(libasyncProfiler.so) => 0x2089f7360
1968:dyld[1473]: dlopen("/opt/homebrew/Cellar/openjdk@11/11.0.16.1/libexec/openjdk.jdk/Contents/Home/lib/libmanagement.dylib", 0x00000001)
1969:dyld[1473]:       dlopen(libmanagement.dylib) => NULL, 'dlopen(/opt/homebrew/Cellar/openjdk@11/11.0.16.1/libexec/openjdk.jdk/Contents/Home/lib/libmanagement.dylib, 0x0001): Library not loaded: '@rpath/libjvm.dylib'
1972:dyld[1473]: dlerror() => 'dlopen(/opt/homebrew/Cellar/openjdk@11/11.0.16.1/libexec/openjdk.jdk/Contents/Home/lib/libmanagement.dylib, 0x0001): Library not loaded: '@rpath/libjvm.dylib'
1977:dyld[1473]: dlopen("/opt/homebrew/Cellar/openjdk@11/11.0.16.1/libexec/openjdk.jdk/Contents/Home/lib/libmanagement.jnilib", 0x00000001)
1978:dyld[1473]:       dlopen(libmanagement.jnilib) => NULL, 'dlopen(/opt/homebrew/Cellar/openjdk@11/11.0.16.1/libexec/openjdk.jdk/Contents/Home/lib/libmanagement.jnilib, 0x0001): tried: '/opt/homebrew/Cellar/openjdk@11/11.0.16.1/libexec/openjdk.jdk/Contents/Home/lib/libmanagement.jnilib' (no such file)'
1979:dyld[1473]: dlerror() => 'dlopen(/opt/homebrew/Cellar/openjdk@11/11.0.16.1/libexec/openjdk.jdk/Contents/Home/lib/libmanagement.jnilib, 0x0001): tried: '/opt/homebrew/Cellar/openjdk@11/11.0.16.1/libexec/openjdk.jdk/Contents/Home/lib/libmanagement.jnilib' (no such file)'
$
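To make the failure above concrete, here is a rough Ruby sketch of how `@rpath/libjvm.dylib` gets expanded against a library's `LC_RPATH` entries. It is a simplified model under my own assumptions, not dyld's real algorithm (dyld can also pick up rpaths from other images in the load chain). The helper name `rpath_candidates` is hypothetical, and `@loader_path/.` is the entry the unpatched libraries carry according to the otool output later in this thread.

```ruby
# Simplified model of @rpath expansion; not dyld's real implementation.
# `rpath_candidates` is a hypothetical helper name used only for illustration.
def rpath_candidates(install_name, loader_dir, rpaths)
  return [install_name] unless install_name.start_with?("@rpath/")

  leaf = install_name.delete_prefix("@rpath/")
  rpaths.map do |rpath|
    # @loader_path stands for the directory of the image that carries the LC_RPATH.
    File.expand_path(File.join(rpath.sub("@loader_path", loader_dir), leaf))
  end
end

lib_dir = "/opt/homebrew/Cellar/openjdk@11/11.0.16.1/libexec/openjdk.jdk/Contents/Home/lib"

# Only "@loader_path/." present: the sole candidate is lib/libjvm.dylib,
# but libjvm.dylib actually lives in lib/server.
puts rpath_candidates("@rpath/libjvm.dylib", lib_dir, ["@loader_path/."])

# With "@loader_path/server" added, lib/server/libjvm.dylib becomes a candidate.
puts rpath_candidates("@rpath/libjvm.dylib", lib_dir, ["@loader_path/server", "@loader_path/."])
```

In this model, a library whose only rpath is `@loader_path/.` never produces a candidate under `lib/server`, which matches the `Library not loaded` error shown above.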

What did you expect to happen?

The application starts up properly with the agent setup.

It does if I use a patched copy of async-profiler with the patch from this issue: https://github.com/jvm-profiling-tools/async-profiler/issues/647 (see the reproduction section for more information).

See also: https://github.com/Homebrew/homebrew-core/issues/66953

Step-by-step reproduction instructions (by running brew commands)

I'm going to do this with `async-profiler` because it is a bit simpler than Pyroscope (Pyroscope includes a patched version of `async-profiler`).

1. Download [async-profiler-2.8.3-macos.zip](https://github.com/jvm-profiling-tools/async-profiler/releases/download/v2.8.3/async-profiler-2.8.3-macos.zip) and extract it.
2. Create ManagementTest.java with the contents below.
3. Compile: `javac ManagementTest.java`
4. Run (adjust paths to JDK and `libasyncProfiler.so` as appropriate): `/opt/homebrew/opt/openjdk@11/bin/java -agentpath:/Users/dgregor/Downloads/async-profiler-2.8.3-macos/build/libasyncProfiler.so=start,event=cpu,file=profile.html -cp . ManagementTest`
5. Run the same thing, but with the environment variable DYLD_PRINT_APIS=1 set (see the "What happened" section above).

## ManagementTest.java

import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;

public class ManagementTest {
    public static void main(String[] argv) {
        final RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        System.out.println("it worked");
    }
}

If I apply the workaround from #647 ("Patch rpath for Homebrew JDK") to async-profiler, things work fine. Pyroscope also works fine if I apply the same patch to their patched version of async-profiler and rebuild their .jar with updated versions of the .so and its .sha1 file:

(note: this example was hacked up from my notes/history, so it might not be 100% correct)

git clone https://github.com/jvm-profiling-tools/async-profiler.git
cd async-profiler
git show b5634b9d88304e64c0f77cf961119f240a979c38 > /tmp/rpath.patch
cd ..
git clone https://github.com/pyroscope-io/async-profiler.git async-profiler-pyroscope
cd async-profiler-pyroscope
git apply /tmp/rpath.patch
make FAT_BINARY=true
cp build/libasyncProfiler.so libasyncProfiler-macos.so
sha1sum libasyncProfiler-macos.so > libasyncProfiler-macos.so.sha1
cp /Users/dgregor/Downloads/pyroscope.jar pyroscope-hacked.jar
zip -n '.so' -u pyroscope-hacked.jar libasyncProfiler-macos.so libasyncProfiler-macos.so.sha1

I'm unfortunately not familiar enough with the details of Homebrew builds and the JDK build infrastructure to easily dig into what could be tweaked in https://github.com/iMichka/homebrew-core/blob/master/Formula/openjdk@11.rb to fix this.

carlocab commented 1 year ago

There aren't really too many details to understand on our side, since the full build configuration is shown in the formula:

https://github.com/Homebrew/homebrew-core/blob/1f6aeb12ee98a7ea3de72318b200c3e6f3a2faa6/Formula/openjdk.rb#L72-L107

If the issue is the RPATH configuration, then that seems to be something inherited from the upstream defaults, since we don't pass any related flags. We can adjust it to something more appropriate, but we need to know what the correct way to pass it to the build is.

carlocab commented 1 year ago

Could you try rebuilding openjdk with this patch and see if it helps?

diff --git a/Formula/openjdk.rb b/Formula/openjdk.rb
index b04e356a6de..7a82f21c9b6 100644
--- a/Formula/openjdk.rb
+++ b/Formula/openjdk.rb
@@ -89,7 +89,7 @@ class Openjdk < Formula
     args += if OS.mac?
       %W[
         --enable-dtrace
-        --with-extra-ldflags=-headerpad_max_install_names
+        --with-extra-ldflags=-headerpad_max_install_names\ -Wl,-rpath,#{loader_path}/server
         --with-sysroot=#{MacOS.sdk_path}
       ]
     else

deejgregor commented 1 year ago

> Could you try rebuilding openjdk with this patch and see if it helps?

Yup! I'll give it a whirl on both @11 and current. Give me a little bit of time. This one won't be quick. ;-)

carlocab commented 1 year ago

Oops, before you do that, there's a typo in my patch.

Edit: Try that.

carlocab commented 1 year ago

Actually, use this one, I always forget how these %W arrays work:

diff --git a/Formula/openjdk.rb b/Formula/openjdk.rb
index b04e356a6de..7e36210a8be 100644
--- a/Formula/openjdk.rb
+++ b/Formula/openjdk.rb
@@ -89,7 +89,6 @@ class Openjdk < Formula
     args += if OS.mac?
       %W[
         --enable-dtrace
-        --with-extra-ldflags=-headerpad_max_install_names
         --with-sysroot=#{MacOS.sdk_path}
       ]
     else
@@ -99,6 +98,7 @@ class Openjdk < Formula
         --with-fontconfig=#{HOMEBREW_PREFIX}
       ]
     end
+    args << "--with-extra-ldflags=-headerpad_max_install_names -Wl,-rpath,#{loader_path}/server"

     chmod 0755, "configure"
     system "./configure", *args

deejgregor commented 1 year ago

> Actually, use this one

The updated patch using args << ... works great! I applied this to both openjdk.rb and openjdk@11.rb and it fixed the problem with both. (openjdk@11 is the important one for me because our app currently only works on 11).
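For anyone comparing the two forms of the patch: the difference comes down to how Ruby's `%W[...]` literal splits on whitespace. Below is a small sketch in plain Ruby, outside Homebrew. The local `loader_path` variable is a stand-in I'm assuming for the formula helper of the same name, with a value assumed for macOS.

```ruby
# Stand-in for the formula's loader_path helper (assumed value for macOS).
loader_path = "@loader_path"

# %W splits on unescaped whitespace, so a flag containing a space becomes two arguments.
split = %W[--with-extra-ldflags=-headerpad_max_install_names -Wl,-rpath,#{loader_path}/server]

# Escaping the space with a backslash keeps the flag as a single array element.
escaped = %W[--with-extra-ldflags=-headerpad_max_install_names\ -Wl,-rpath,#{loader_path}/server]

# Appending an ordinary double-quoted string sidesteps the escaping question.
args = []
args << "--with-extra-ldflags=-headerpad_max_install_names -Wl,-rpath,#{loader_path}/server"

puts split.length    # 2
puts escaped.length  # 1
puts escaped == args # true
```

Both the escaped `%W` form and the `args <<` form hand `configure` one argument containing a space; the unescaped form would pass `-Wl,-rpath,...` as a separate argument.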

carlocab commented 1 year ago

Great; thanks for testing it. The patch as is isn't quite suitable to apply to the formula directly (we don't want -headerpad_max_install_names on Linux), but I'm glad we've zeroed in on the right fix.

I'll look into opening a PR to fix it in a few days, but please feel free to beat me to it.
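One possible shape for an OS-aware version, sketched in plain Ruby rather than as a formula change. The method name `extra_ldflags` and its keyword arguments are hypothetical, the `$ORIGIN` value for Linux is an assumption, and this is not necessarily what the eventual PR does.

```ruby
# Hypothetical helper: build the --with-extra-ldflags argument per platform.
def extra_ldflags(loader_path:, mac:)
  flags = ["-Wl,-rpath,#{loader_path}/server"]
  # -headerpad_max_install_names is only wanted on macOS.
  flags.unshift("-headerpad_max_install_names") if mac
  "--with-extra-ldflags=#{flags.join(" ")}"
end

puts extra_ldflags(loader_path: "@loader_path", mac: true)
puts extra_ldflags(loader_path: '$ORIGIN', mac: false) # '$ORIGIN' is an assumed Linux value
```

Building the flag list first and joining it once keeps the rpath flag in a single place, so only the macOS-specific flag is conditional.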

carlocab commented 1 year ago

Oh, one more thing. Can you tell me the output of

otool -l "$(brew --prefix openjdk)/libexec/openjdk.jdk/Contents/Home/lib/libjava.dylib" | grep -A2 LC_RPATH

please? After you've applied the build fix above, I mean.
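If it is easier to compare several libraries at once, the `LC_RPATH` entries can be pulled out of `otool -l` output with a few lines of Ruby. This is only a sketch: the helper name `rpaths_from_otool` is hypothetical, and the sample text is illustrative input shaped like otool's load-command listing, not output captured from a particular build.

```ruby
# Extract LC_RPATH paths from `otool -l` output (hypothetical helper name).
def rpaths_from_otool(output)
  paths = []
  in_rpath = false
  output.each_line do |line|
    fields = line.split
    if fields[0] == "cmd"
      # Each load command starts with a "cmd" line; remember whether it is an LC_RPATH.
      in_rpath = (fields[1] == "LC_RPATH")
    elsif in_rpath && (m = line.match(/^\s*path (.+) \(offset \d+\)\s*$/))
      paths << m[1]
    end
  end
  paths
end

# Illustrative sample input, not captured output.
sample = <<~OTOOL
  cmd LC_RPATH
  cmdsize 32
  path @loader_path/server (offset 12)
  cmd LC_RPATH
  cmdsize 32
  path @loader_path/. (offset 12)
OTOOL

p rpaths_from_otool(sample)
```

On a Mac, the same helper could be fed the real output by running the otool command above and passing its text in.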

deejgregor commented 1 year ago

I'm working on a PR. I'll take care of all of the openjdk versions that are affected that I am able to compile (11, 17, current; not sure about 8 at this point). I should have that worked up by tomorrow.

Does this seem like a decent way to go so that -headerpad_max_install_names isn't set on Linux? There's a little bit of duplication, but it's pretty straightforward (and the original form you showed me worked just fine).

--- Formula/openjdk.rb
+++ Formula/openjdk.rb
@@ -89,7 +89,7 @@ class Openjdk < Formula
     args += if OS.mac?
       %W[
         --enable-dtrace
-        --with-extra-ldflags=-headerpad_max_install_names
+        --with-extra-ldflags=-headerpad_max_install_names\ -Wl,-rpath,#{loader_path}/server
         --with-sysroot=#{MacOS.sdk_path}
       ]
     else
@@ -97,6 +97,7 @@ class Openjdk < Formula
         --with-x=#{HOMEBREW_PREFIX}
         --with-cups=#{HOMEBREW_PREFIX}
         --with-fontconfig=#{HOMEBREW_PREFIX}
+        --with-extra-ldflags=-Wl,-rpath,#{loader_path}/server
       ]
     end

otool results

After patching - openjdk / openjdk@11

openjdk / libjava

$ otool -l "$(brew --prefix openjdk)/libexec/openjdk.jdk/Contents/Home/lib/libjava.dylib" | grep -A2 LC_RPATH
          cmd LC_RPATH
      cmdsize 32
         path @loader_path/server (offset 12)
--
          cmd LC_RPATH
      cmdsize 32
         path @loader_path/. (offset 12)

openjdk@11 / libjava

$ otool -l "$(brew --prefix openjdk\@11)/libexec/openjdk.jdk/Contents/Home/lib/libjava.dylib" | grep -A2 LC_RPATH
          cmd LC_RPATH
      cmdsize 32
         path @loader_path/server (offset 12)
--
          cmd LC_RPATH
      cmdsize 32
         path @loader_path/. (offset 12)

openjdk@11 / libmanagement

Note: this was the one that was causing me problems.

$ otool -l "$(brew --prefix openjdk\@11)/libexec/openjdk.jdk/Contents/Home/lib/libmanagement.dylib" | grep -A2 LC_RPATH
          cmd LC_RPATH
      cmdsize 32
         path @loader_path/server (offset 12)
--
          cmd LC_RPATH
      cmdsize 32
         path @loader_path/. (offset 12)
$

Before patching -- openjdk@17

openjdk@17 / libjava

$ otool -l "$(brew --prefix openjdk\@17)/libexec/openjdk.jdk/Contents/Home/lib/libjava.dylib" | grep -A2 LC_RPATH
          cmd LC_RPATH
      cmdsize 32
         path @loader_path/. (offset 12)

deejgregor commented 1 year ago

@carlocab I opened a PR over at #111255 and I think it's ready to go, but I have one question mentioned in https://github.com/Homebrew/homebrew-core/pull/111255#issuecomment-1252862341:

Question: I'm not sure if this needs a new revision as mentioned here: https://github.com/Homebrew/homebrew-core/blob/HEAD/CONTRIBUTING.md#to-contribute-a-fix-to-the-foo-formula

carlocab commented 1 year ago

Closed in #111255.