luhenry / netlib

A high-performance, hardware-accelerated implementation of Netlib in Java

OSX Big Sur: Is it possible to use the Accelerate framework BLAS? #6

Open borice opened 2 years ago

borice commented 2 years ago

OSX Big Sur no longer ships with copies of the system libraries on the filesystem, so the option dev.ludovic.netlib.blas.nativeLibPath can't be used to point to vecLib's libBLAS.dylib, etc.

Is there currently a way (or a plan) to make netlib work on OSX Big Sur (and later)? Thank you.

borice commented 2 years ago

As a follow-up, after further investigation I believe this is a "bug" in the JDK. Netlib uses System.loadLibrary (or System.load) to load the dynamic library (for BLAS, etc.), which in turn does exactly what the Big Sur release notes say not to do (e.g. checks for the existence of the library file on the filesystem and fails if not found).

I submitted a bug report via bugs.java.com but not sure how responsive they are to these reports. We'll see.

I have verified that attempting to load the library via dlopen even if the library file is not visible on the filesystem works fine.
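A minimal sketch of how the JVM side of this can be probed, assuming only the standard System.load call (which requires an absolute path and throws UnsatisfiedLinkError when the load fails). The class and method names are hypothetical and not part of netlib; the sketch does not reproduce the dlopen check, only the Java-side attempt.

```java
import java.util.Optional;

// Hypothetical helper for illustration; it is not part of netlib or the JDK.
class NativeLoadProbe {

    // Tries to load a native library by absolute path.
    // Returns an empty Optional on success, or the JVM's error message on failure.
    static Optional<String> tryLoad(String absolutePath) {
        try {
            System.load(absolutePath);
            return Optional.empty();
        } catch (UnsatisfiedLinkError e) {
            // Thrown for missing files, non-absolute paths, and incompatible binaries.
            return Optional.of(String.valueOf(e.getMessage()));
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: NativeLoadProbe <absolute-path-to-library>");
            return;
        }
        Optional<String> error = tryLoad(args[0]);
        System.out.println(error.map(msg -> "load failed: " + msg).orElse("load succeeded"));
    }
}
```

Running this against the vecLib libBLAS.dylib path on an affected JDK would be expected to report a failure, while a path to a library file that is actually present on disk should load.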

borice commented 2 years ago

I also attempted to use OpenBLAS installed via Homebrew (I had to set JAVA_LIBRARY_PATH=/usr/local/opt/openblas/lib). That seems to bypass, at least partly, the problem with System.loadLibrary, since libblas.dylib exists on the filesystem, but it still fails with a weird error:

dyld: Library not loaded: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
  Referenced from: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/vecLib
  Reason: Incompatible library version: vecLib requires version 1.0.0 or later, but libBLAS.dylib provides version 0.0.0

I wasn't able to get any further than this. I tried moving libblas.dylib and liblapack.dylib out of /usr/local/opt/openblas/lib/ in case there was a conflict between the OpenBLAS version and the vecLib version, and then set dev.ludovic.netlib.blas.nativeLib=libopenblas.dylib (I also tried dev.ludovic.netlib.blas.nativeLib=openblas), but that didn't work. When trying to load the native BLAS implementation, it still reports that it can't load it:

2021-10-19 16:17:19,770 WARN netlib.InstanceBuilder$NativeBLAS: Failed to load implementation from:dev.ludovic.netlib.blas.JNIBLAS
2021-10-19 16:17:19,776 WARN netlib.InstanceBuilder$NativeBLAS: Failed to load implementation from:dev.ludovic.netlib.blas.ForeignLinkerBLAS
java.lang.RuntimeException: Unable to load native implementation
  at dev.ludovic.netlib.InstanceBuilder$NativeBLAS.getInstanceImpl(InstanceBuilder.java:67)
  ... 44 elided

No idea what else to try.
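One thing that may be worth checking when choosing between values like openblas and libopenblas.dylib is how the JDK maps a base library name to a platform file name. The sketch below uses only the standard System.mapLibraryName call; the class name is hypothetical, and whether netlib passes the nativeLib value to System.loadLibrary or to System.load is an assumption that this thread does not confirm.

```java
// Hypothetical demo class; only System.mapLibraryName is a real JDK API here.
class LibraryNameDemo {

    // Returns the platform-specific file name that System.loadLibrary(baseName)
    // would search for on java.library.path.
    static String platformFileName(String baseName) {
        return System.mapLibraryName(baseName);
    }

    public static void main(String[] args) {
        // The two values tried in this thread for dev.ludovic.netlib.blas.nativeLib.
        for (String name : new String[] {"openblas", "libopenblas.dylib"}) {
            System.out.println(name + " -> " + platformFileName(name));
        }
    }
}
```

On macOS the base name openblas maps to libopenblas.dylib, whereas a value that already carries the lib prefix and .dylib suffix gets both added a second time, so such a value only makes sense if it is treated as a file name rather than a base name.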

luhenry commented 2 years ago

Hi @borice, it's a good idea to support Accelerate and, more generally, libraries based on the CBLAS API. So far I've chosen to support only libraries that provide the Fortran-based BLAS, but extending that to CBLAS would make sense.

The way I would tackle it is to add a JNICBLAS.java class alongside JNIBLAS.java, and add the corresponding code in blas/src/main/native/jni.c by extending generator.py. If that's something you want to tackle, I'd be more than happy to review the change. Otherwise, I'll get to it in the coming days/weeks.

Thank you again!

borice commented 2 years ago

The JDK bug report was accepted and has been assigned a tracking ID.

oscar-broman commented 1 year ago

Any chance this could get some attention? The JDK bug has been fixed.

oscar-broman commented 1 year ago

If you could expand just slightly on your description above on how to do this, @luhenry, then I could give it a try.

luhenry commented 1 year ago

@oscar-broman hi, sorry for the late reply!

The approach would be similar to JNIBLAS.java:

  1. The Java code is available in JNIBLAS.java
  2. It calls into the file jni.c via JNI
  3. jni.c is itself generated by generator.py

To add support for CBLAS, I expect the easiest way is to change the generated jni.c file. To change it, you want to modify generator.py so that it generates the right code for each BLAS call. This file is very self-contained and a pretty crude template generator, but it does the limited job at hand here.

Let me know if you have any other questions, I'll be very happy to answer them!

oscar-broman commented 4 months ago

I've gotten this to work without the need for CBLAS. I would need to clean things up a bit before it would be ready for a release, but I'm not quite sure how to deal with compiling the dylib, which can only be done on macOS.

What approach would you suggest?

luhenry commented 4 months ago

@oscar-broman we can absolutely have a runtime check based on the platform and load a different library based on that. So on macos-x86_64 or macos-aarch64 we load the library you're suggesting based on the Accelerate framework. And on other platforms like linux-x86_64 or linux-aarch64, we load the library based on the cblas API.
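A minimal sketch of such a runtime check, assuming the platform is derived from the standard os.name and os.arch system properties. The class, enum, and method names are hypothetical and not netlib's actual code; only the platform identifiers (macos-x86_64, macos-aarch64, linux-x86_64, linux-aarch64) come from the comment above.

```java
import java.util.Locale;

// Hypothetical sketch; names here are illustrative and not part of netlib.
class PlatformSelector {

    // Which native backend to load; ACCELERATE on macOS, DEFAULT elsewhere.
    enum Backend { ACCELERATE, DEFAULT }

    // Normalizes os.name / os.arch values into identifiers such as "macos-aarch64".
    static String classify(String osName, String osArch) {
        String os = osName.toLowerCase(Locale.ROOT);
        String arch = osArch.toLowerCase(Locale.ROOT);
        String osPart = (os.contains("mac") || os.contains("darwin")) ? "macos"
                : os.contains("linux") ? "linux"
                : "unknown";
        String archPart = (arch.equals("x86_64") || arch.equals("amd64")) ? "x86_64"
                : (arch.equals("aarch64") || arch.equals("arm64")) ? "aarch64"
                : "unknown";
        return osPart + "-" + archPart;
    }

    // Picks the backend for a platform identifier produced by classify.
    static Backend backendFor(String platform) {
        return platform.startsWith("macos-") ? Backend.ACCELERATE : Backend.DEFAULT;
    }

    public static void main(String[] args) {
        String platform = classify(System.getProperty("os.name"), System.getProperty("os.arch"));
        System.out.println(platform + " -> " + backendFor(platform));
    }
}
```

Keeping the classification separate from the backend choice makes the mapping easy to test with fixed strings, without needing to run on each platform.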

oscar-broman commented 4 months ago

The bundled libraries (e.g. libnetlibblasjni.so): how would those be compiled prior to release?

I didn't actually make it work for the Accelerate framework, but rather compiled OpenBLAS and modified pom.xml so that it would compile a dylib instead. The only other modification, besides building binaries for macOS, is to provide libhandle in the dlsym call in load_symbols.

gvonness commented 2 weeks ago

I have gotten BLAS and LAPACK working using the Accelerate framework (using Apple's Fortran bindings), but I opted not to use dynamic linking. Instead, I just compile and link an alternative set of C sources against the Accelerate framework. However, this required an update to the makefiles and the pom itself.

I'm keen to work out the best approach to merging this functionality into this repo, if there is a wider interest.

oscar-broman commented 1 week ago

It is of interest for sure!