VowpalWabbit / vowpal_wabbit

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.
https://vowpalwabbit.org

Java jni including and using native libraries inside jar #1579

Open ampm78 opened 6 years ago

ampm78 commented 6 years ago

The released Java JNI jar currently includes the macOS native library as a resource, but that copy is never actually used: the loader still looks for the native library on the "java.library.path". This forces users to install the native library on their systems if they want to use vw_jni.

I pushed some code to my clone repo that extracts the lib from inside the jar and loads it (https://github.com/ampm78/vowpal_wabbit/commit/a957fa7f80ea4bee201619c73713466216ea192b), but at the moment this only works on macOS, as that's the only native lib currently included in the jar.

Ideally the native libraries for at least the major platforms (macOS, Linux, Windows?) should be built and included in the jar when releasing it, as other Java libraries do (for example TensorFlow).
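
The extract-and-load approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from the linked commit: the class name `JarNativeLoader`, the resource path argument, and the fallback behaviour are all assumptions made for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper; names and layout are illustrative only.
final class JarNativeLoader {
    private JarNativeLoader() {}

    /** Copies the stream to a fresh temp file (deleted on JVM exit) and returns its path. */
    static Path extractToTemp(InputStream in, String suffix) {
        try {
            Path tmp = Files.createTempFile("vw_jni", suffix);
            tmp.toFile().deleteOnExit();
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Loads the library bundled at resourcePath (an assumed location inside the jar).
     * If no such resource exists, falls back to java.library.path via libName.
     */
    static void load(String resourcePath, String libName) {
        try (InputStream in = JarNativeLoader.class.getResourceAsStream(resourcePath)) {
            if (in == null) {
                System.loadLibrary(libName); // not bundled: use java.library.path
                return;
            }
            int dot = resourcePath.lastIndexOf('.');
            String suffix = dot >= 0 ? resourcePath.substring(dot) : "";
            // System.load needs an absolute file path, hence the extraction step.
            System.load(extractToTemp(in, suffix).toAbsolutePath().toString());
        } catch (IOException e) { // raised when closing the resource stream
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller would typically invoke something like `JarNativeLoader.load("/some/path/libvw_jni.dylib", "vw_jni")` from a static initializer, where the resource path is whatever location the build actually packages the library under.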

JohnLangford commented 5 years ago

@jmorra @deaktator your thoughts here?

deaktator commented 5 years ago

@JohnLangford @jmorra @ampm78: I think this depends on the status of the build process. Prior to 8.4.1, we included the binaries for several OS variants in the jar. To the best of my recollection, they were removed for two main reasons:

  1. The release process for Java was independent of the C lib release process.
  2. At the time, boost program options was dynamically linked in the C build.

These issues led to version conflicts with the boost program options library and made the JNI lib extremely hard to use, because it was tied to the boost lib against which it was built. You’d have to symlink multiple boost libs and swap them around to make the JNI lib work.

Some of this detail is in the readme file in the java dir of the VW project.

If we can statically link the boost library in the C build and use a CI/CD service to release everything, then we can run the docker script we used previously to build the library for several OSs. I think Travis supports multiple OSs, which would allow us to build for the major Linux distros and OS X. Not sure about Windows support.

The OS detection code was a little “quirky” (if I remember correctly) and it’s often difficult to distinguish between some Linux distros.
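
For illustration, OS detection of this kind usually comes down to matching on the `os.name` system property, roughly as in the sketch below. This is not the detection code VW previously shipped; `PlatformDetector` and its enum are hypothetical names. Note that `os.name` reports only "Linux" on Linux, so it cannot distinguish distros by itself, and the check order matters because "darwin" contains the substring "win".

```java
import java.util.Locale;

// Hypothetical helper; not the original VW detection code.
final class PlatformDetector {
    enum Os { LINUX, MACOS, WINDOWS, UNKNOWN }

    private PlatformDetector() {}

    /** Classifies a value of the "os.name" system property. */
    static Os classify(String osName) {
        String n = osName == null ? "" : osName.toLowerCase(Locale.ROOT);
        if (n.contains("linux")) return Os.LINUX;
        // Check mac/darwin before "win": "darwin" contains "win".
        if (n.contains("mac") || n.contains("darwin")) return Os.MACOS;
        if (n.contains("win")) return Os.WINDOWS;
        return Os.UNKNOWN;
    }

    /** Classifies the OS of the running JVM. */
    static Os current() {
        return classify(System.getProperty("os.name"));
    }
}
```

With statically linked native libraries, a coarse classification like this may be enough, since the jar would need one binary per OS rather than one per distro.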

@JohnLangford it’s been a while since I looked at the current status of the build process. Do you think we can build and release via Travis? And can we statically link against boost program opts?

JohnLangford commented 5 years ago

I'm unsure, but several more people are becoming involved so it's possible we can engineer a more comprehensive release process.

andrusha commented 4 years ago

It would be convenient to only need to include vw as a Java dependency and have it take care of the rest. This would enable, or at least ease, the use of VW on services like Dataflow.

eisber commented 4 years ago

@deaktator the static linking of the native components works on Linux (https://github.com/VowpalWabbit/vowpal_wabbit/blob/master/java/CMakeLists.txt#L57). @jackgerrits has faith that it also works on Windows...

As for the build/release pipelines:

  1. Set up 3x Java (Linux, Mac, Windows) builds outputting statically linked artifacts (use the yaml matrix feature to make it convenient).
  2. Set up a release pipeline that takes in the 3x artifacts and copies them into target/* so that "mvn package" can pick up all the native dependencies.

The 2 Java APIs (old & spark) handle lib loading differently. Spark loads from the jar (it actually extracts to a temp folder), while the old one assumes the lib is on the lib path. That should be unified. https://github.com/scijava/native-lib-loader/ is a library I originally used, but it didn't support dependencies (e.g. boost, which was packaged into the jar too). Now that static linking works, we can switch back to that library, which has a couple of heuristics to detect the OS.

https://github.com/scijava/native-lib-loader/blob/master/src/main/java/org/scijava/nativelib/NativeLibraryUtil.java#L49 contains the expected directory structure.
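
To make the idea of a per-platform directory structure concrete, here is a small sketch of how a loader might compose the in-jar resource path for each of the three build artifacts. The `natives/<os>_<bits>/` layout, the OS tokens, and the `NativeResourcePath` class are assumptions for illustration; the linked NativeLibraryUtil source is the authority on what the library actually expects.

```java
// Hypothetical layout helper; consult the linked source for the real structure.
final class NativeResourcePath {
    private NativeResourcePath() {}

    /** Maps a bare library name to a platform file name, e.g. vw_jni -> libvw_jni.so. */
    static String fileName(String os, String libName) {
        switch (os) {
            case "windows": return libName + ".dll";
            case "osx":     return "lib" + libName + ".dylib";
            default:        return "lib" + libName + ".so";
        }
    }

    /** Builds the assumed in-jar path, e.g. natives/linux_64/libvw_jni.so. */
    static String resourcePath(String os, int bits, String libName) {
        return "natives/" + os + "_" + bits + "/" + fileName(os, libName);
    }
}
```

Under this assumed layout, the release pipeline would copy each statically linked artifact into the matching directory before packaging, so a single jar carries all three binaries.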