bytedeco / gradle-javacpp

Gradle plugins to automate the build process of JavaCPP and JavaCV

Improve JPMS support: preinstall native libraries in jlink runtime #21

Closed HGuillemet closed 2 years ago

HGuillemet commented 2 years ago

Here is an attempt to improve support for creating a jlink image by pre-extracting the native libraries into the runtime. It has the following advantages compared to linking native jars:

This PR defines a new Gradle task, javacppBuildNativeModule, which performs the following steps:

1. Obtain the module path from the main source set (as computed by the java plugin), and the main class from the application configuration.
2. Determine the set of Java class dependencies by walking from the main class, using the Javassist bytecode editor and its ability to read the constant pool table. This is better than reading all classes from the whole module path, since we end up with a much smaller set. It also avoids the case where some classes in the presets need other presets not included in the artifact dependencies (seen with opencv, where some classes need cpython).
3. Retain from this set only the classes annotated with the JavaCPP Properties annotation.
4. Call JavaCPP Loader.load on each of the retained classes, with the system properties org.bytedeco.javacpp.cachedir and org.bytedeco.javacpp.cachedir.nosubdir set, to populate a temporary lib directory with all, and only, the required native libraries.
5. Compile a module-info.java containing module org.bytedeco.javacpp.libs {}.
6. Build a jmod containing this module descriptor and the content of the lib directory. This lets us delegate to jlink the work of installing the native libraries in the proper runtime directory, depending on the platform.
7. Call jlink with the jmod added to the module path and --add-modules org.bytedeco.javacpp.libs.
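Steps 5 and 6 above could be sketched roughly as follows, using the JDK's own java.util.spi.ToolProvider to run javac and jmod in-process. This is only an illustration and not the plugin's actual code: the class name NativeJmodSketch, its method names, and the src/classes/native.jmod layout under the work directory are all assumptions made for the example. Only the module name org.bytedeco.javacpp.libs comes from the description above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.spi.ToolProvider;

/** Hypothetical helper (not part of gradle-javacpp): packs a directory of native libraries into a jmod. */
final class NativeJmodSketch {

    /** Runs a JDK tool such as javac or jmod in-process and fails on a non-zero exit code. */
    static void run(String tool, String... args) {
        ToolProvider provider = ToolProvider.findFirst(tool)
                .orElseThrow(() -> new IllegalStateException(tool + " not found; a full JDK is required"));
        int exit = provider.run(System.out, System.err, args);
        if (exit != 0) {
            throw new IllegalStateException(tool + " failed with exit code " + exit);
        }
    }

    /**
     * Step 5: compile an empty module descriptor.
     * Step 6: bundle it with the contents of libDir into workDir/native.jmod.
     */
    static Path buildNativeJmod(Path libDir, Path workDir) throws IOException {
        Path src = Files.createDirectories(workDir.resolve("src"));
        Path classes = Files.createDirectories(workDir.resolve("classes"));
        Path moduleInfo = src.resolve("module-info.java");
        Files.writeString(moduleInfo, "module org.bytedeco.javacpp.libs {}\n");

        // Produces classes/module-info.class
        run("javac", "-d", classes.toString(), moduleInfo.toString());

        Path jmod = workDir.resolve("native.jmod");
        // Remove any file left over from a previous run before creating the archive again.
        Files.deleteIfExists(jmod);
        run("jmod", "create",
                "--class-path", classes.toString(),
                "--libs", libDir.toString(),
                jmod.toString());
        return jmod;
    }
}
```

The resulting file is what step 7 would hand to jlink, with its directory on the module path and --add-modules org.bytedeco.javacpp.libs.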

Here is an example build script using the Kotlin DSL:


plugins {
    application
    id("org.bytedeco.gradle-javacpp-platform") version "1.5.8-SNAPSHOT"
    id("org.beryx.jlink") version "2.25.0"
}

extra["javacppPlatform"] = "linux-x86_64"

dependencies {
    implementation("org.bytedeco:opencv-platform:4.5.5-1.5.7")
}

group = "org.bytedeco"
version = "1.0-SNAPSHOT"
description = "gradle-jlink-sample"

application {
    mainModule.set("org.bytedeco.sample")
    mainClass.set("org.bytedeco.sample.Application")
    applicationDefaultJvmArgs = listOf("--add-modules", "ALL-MODULE-PATH")
}

jlink {
    addExtraModulePath("build/native/native.jmod")
    addOptions("--add-modules", "org.bytedeco.javacpp.libs")
}

tasks.named("jlink") {
    dependsOn("javacppBuildNativeModule")
}

Tested and working on a simple application using OpenCV and JavaFX, but for some obscure reasons that I didn't investigate:

saudet commented 2 years ago

Why are you trying to hack this with JMOD? For this to make sense, we need to show it works well without JMOD! I don't think the JDK considers native libraries to be part of the module system, but someone please correct me if I'm wrong @johanvos? @mikehearn? @AlanBateman? For example, if I do the following on my Linux machine, everything seems to work perfectly fine for JNI and jlink without any trace of JMOD, and without JavaCPP extracting anything in ~/.javacpp or anywhere else:

git clone https://github.com/bytedeco/sample-projects
cd sample-projects/opencv-stitching-jlink
mvn clean package
sed -i s/opencv-platform/opencv/g pom.xml
mvn clean package
unzip -j ~/.m2/repository/org/bytedeco/openblas/0.3.19-1.5.7/openblas-0.3.19-1.5.7-linux-x86_64.jar -d ./target/maven-jlink/default/lib
unzip -j ~/.m2/repository/org/bytedeco/opencv/4.5.5-1.5.7/opencv-4.5.5-1.5.7-linux-x86_64.jar -d ./target/maven-jlink/default/lib
ln -s libopenblas.so.0 ./target/maven-jlink/default/lib/libopenblas_nolapack.so.0
./target/maven-jlink/default/bin/stitch panorama_image1.jpg panorama_image2.jpg --output panorama_stitched.jpg

Am I missing something? Why do we need JMOD?

HGuillemet commented 2 years ago

While I agree with you that some standard must emerge for how native libraries are handled by the artifact distribution system, the Java compiler, the Java runtime, and jlink, this PR is meant to propose the best technical solution with the tools we have today, not to prove anything, for instance that jmod is hell.

I don't think the JDK considers native libraries to be part of the module system

jmod was mainly introduced for bundling native libraries needed by modules. In a jmod, the libraries are in a special directory in the archive and are treated as such by jlink, which copies them into the proper directory.

It's possible to do without jmod here and manually install the libraries in the runtime image AFTER jlink has run, like you did. That's also what I do in one of my applications. But:

For these reasons, delegating the installation of the libraries in the image to jlink, using a temporary, pure-native jmod, seems the easiest approach to me.

Concerning the JavaCPP-specific question of whether unzipping the native jars can be enough instead of calling Loader.load, I think only you can say whether this will always work or which approach is best.

saudet commented 2 years ago

It still feels like what you want to do is only tangentially related to JavaCPP. Why not create some generic plugin that also works for other libraries that chose not to use JavaCPP?

HGuillemet commented 2 years ago

This can be implemented in a separate plugin if you prefer, but I doubt it can be used with a framework other than JavaCPP. Steps 3 and 4 above are JavaCPP-specific. What do you think of them? Is there a better way to do it? Any idea about the 2 problems mentioned at the end of the PR description?

saudet commented 2 years ago

This can be implemented in a separate plugin if you prefer, but I doubt it can be used with a framework other than JavaCPP. Steps 3 and 4 above are JavaCPP-specific. What do you think of them? Is there a better way to do it?

I'm not sure I understand what you're trying to do there because, for example, how are you going to make this work with the JMOD files from, say, JavaFX? jlink is just going to put everything in, unfiltered; we have no control over that. I think this is all very specific to your application, which is fine, but if the goal is to make a tool, this needs to be generalized a bit more, in my opinion. Maybe what you want is an improved version of ProGuard that supports JNI? Android apparently already has something for other kinds of resources when we enable "shrinkResources", so why not come up with something like that, but for native libraries?

Any idea about the 2 problems mentioned at the end of the PR description?

JavaCPP keeps trying to use the cache to rename libraries it finds in the library path, but we can easily work around that by disabling the cache entirely with a system property. Let me work on that...

HGuillemet commented 2 years ago

I'm not sure I understand what you're trying to do there

Just extracting the library from the native jar, but counting on the loader for that, instead of unzipping the jar, in order to execute the code that needs to be executed at this moment.

This is the point of the LoadEnabled interface IIRC. But what we would need here exactly is to trigger some onCache code, rather than code to be run when the lib is loaded in memory.

This reproduces more or less the behaviour of the cache mojo.

Also by determining the exact class dependencies and only loading those classes we limit the extraction to what is sufficient, and not the whole native jar.

Is there something else still unclear?

saudet commented 2 years ago

I keep telling you, that's not general enough to be interesting. This is not useful for users not using jlink, but instead using something else like Android or GraalVM Native Image. I know you don't personally care about JavaFX, Android, or GraalVM, but there are so many more developers that do care about those vs jlink that it's not even funny. If you want to make this specific to JavaCPP, then make it work for other things than jlink. If you want to limit yourself to jlink, then make it work for other things than JavaCPP. Anyway, I understand you want to limit yourself to jlink and JavaCPP only, but I don't feel it's worth the time I would need to maintain this myself with all the dependencies your changes bring. However, there's nothing in your pull requests that depends on the current code in gradle-javacpp, so you could create another repository under https://github.com/bytedeco/ named "a-plugin-for-jlink-that-does-not-suck" or whatever you like and make releases in the org.bytedeco group. I'm perfectly OK with that.

In any case, I've added in commit https://github.com/bytedeco/javacpp/commit/0e07735fa625cc8effdd61c6291882aec2bd4858 a new system property "org.bytedeco.javacpp.cacheLibraries" that we can set to "false" to prevent JavaCPP from doing things in the cache with libraries.
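As a minimal sketch of how an application might apply this property, the snippet below sets it programmatically at startup. The class and method names are hypothetical; only the property name and the value "false" come from the comment above. It assumes the property has to be in place before JavaCPP starts loading libraries, so it is set first thing in main.

```java
/** Hypothetical launcher snippet: turns off JavaCPP's library caching before any preset is used. */
final class DisableJavaCppCacheSketch {

    /** Sets the system property named in the comment above to "false". */
    static void disableLibraryCache() {
        System.setProperty("org.bytedeco.javacpp.cacheLibraries", "false");
    }

    public static void main(String[] args) {
        // Assumption: do this before any JavaCPP class gets a chance to load libraries.
        disableLibraryCache();
        // Application code that ends up calling into JavaCPP would start here.
        System.out.println(System.getProperty("org.bytedeco.javacpp.cacheLibraries"));
    }
}
```

Passing -Dorg.bytedeco.javacpp.cacheLibraries=false on the java command line has the same effect on the system property without touching application code.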

saudet commented 2 years ago

Or, how about this, you take over gradle-javacpp and do whatever you want with it. This way I wouldn't need to worry about maintaining it. It's not a big plugin, but when things break, someone needs to fix it, and if that someone is you, that works. There hasn't been anything new added for over a year now, and I'm not planning on adding anything either, so there shouldn't be any problems letting you run things and see how it goes. Do you want to take that responsibility?