oracle / tribuo

Tribuo - A Java machine learning library
https://tribuo.org
Apache License 2.0

When packaged into docker container: FileNotFoundException: File /lib/linux-musl/x86_64/libxgboost4j.so #339

Closed nmicra closed 11 months ago

nmicra commented 1 year ago

I'm getting FileNotFoundException: File /lib/linux-musl/x86_64/libxgboost4j.so when the application is packaged into a Docker container. It works fine when running on a local Windows machine.

Are there any special requirements for how the dependencies should be packaged into a Docker image? When googling this exception, I noticed that the problem tends to occur when compilation and packaging are done on different operating systems. I want to highlight that build/compilation and packaging were done on a Linux (Ubuntu 22.04.1 LTS) machine, and the Docker container is based on "Alpine Linux v3.14". Any recommendations are appreciated.

The stack trace is:

Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: File /lib/linux-musl/x86_64/libxgboost4j.so was not found inside JAR.
    at ml.dmlc.xgboost4j.java.XGBoostJNI.<clinit>(XGBoostJNI.java:37) ~[xgboost4j_2.12-1.6.2.jar!/:na]
    at ml.dmlc.xgboost4j.java.DMatrix.<init>(DMatrix.java:109) ~[xgboost4j_2.12-1.6.2.jar!/:na]
    at org.tribuo.common.xgboost.XGBoostTrainer.convertExamples(XGBoostTrainer.java:543) ~[tribuo-common-xgboost-4.3.1.jar!/:na]
    at org.tribuo.regression.xgboost.XGBoostRegressionTrainer.train(XGBoostRegressionTrainer.java:252) ~[tribuo-regression-xgboost-4.3.1.jar!/:na]
    at org.tribuo.regression.xgboost.XGBoostRegressionTrainer.train(XGBoostRegressionTrainer.java:233) ~[tribuo-regression-xgboost-4.3.1.jar!/:na]
    at org.tribuo.regression.xgboost.XGBoostRegressionTrainer.train(XGBoostRegressionTrainer.java:73) ~[tribuo-regression-xgboost-4.3.1.jar!/:na]
    at org.tribuo.Trainer.train(Trainer.java:51) ~[tribuo-core-4.3.1.jar!/:na]

Version: 4.3.1

implementation ("org.tribuo:tribuo-all:4.3.1@pom") { isTransitive = true }

Craigacp commented 1 year ago

XGBoost4j needs to be compiled from source on Linux distributions that use musl libc. Alternatively, you can install glibc, though the specific XGBoost version used in Tribuo 4.3.1 tries to autodetect musl, and that autodetection will confuse things.

There are issues in XGBoost's GitHub page with more details, e.g. https://github.com/dmlc/xgboost/pull/7921.

We have a note about XGBoost's binary support in the javadocs - https://github.com/oracle/tribuo/blob/main/Common/XGBoost/src/main/java/org/tribuo/common/xgboost/XGBoostTrainer.java#L65

nmicra commented 1 year ago

I tried installing the additional C/C++ libraries mentioned in the note you linked, but no luck.

Docker Image: openjdk:17-jdk-alpine3.14
Additional libraries:

Still facing the exception:

Caused by: java.io.FileNotFoundException: File /lib/linux-musl/x86_64/libxgboost4j.so was not found inside JAR.
    at ml.dmlc.xgboost4j.java.NativeLibLoader.createTempFileFromResource(NativeLibLoader.java:298)
    at ml.dmlc.xgboost4j.java.NativeLibLoader.loadLibraryFromJar(NativeLibLoader.java:241)
    at ml.dmlc.xgboost4j.java.NativeLibLoader.initXGBoost(NativeLibLoader.java:176)
    at ml.dmlc.xgboost4j.java.XGBoostJNI.<clinit>(XGBoostJNI.java:34)
    ... 120 more

Any recommendations? Can you recommend another OpenJDK Docker image where XGBoost runs smoothly?

Craigacp commented 1 year ago

If you use an image based on RHEL/CentOS/OL or Ubuntu it should use glibc and it'll be fine. In the next version we'll pull in the updated XGBoost with the override, but at the moment it's autodetecting musl and you need to compile XGBoost for musl to make it work.

nmicra commented 1 year ago

I just tried switching to Ubuntu, using the mcr.microsoft.com/openjdk/jdk:17-ubuntu image, but got a similar error:

Caused by: [CIRCULAR REFERENCE: java.lang.UnsatisfiedLinkError: /tmp/libxgboost4j8168818628742633668.so: libgomp.so.1: cannot open shared object file: No such file or directory]

Craigacp commented 1 year ago

That's because you need OpenMP to get parallelization and the XGBoost binary the developers produce requires it. You can recompile XGBoost without it (and it'll be much slower), or install the libgomp1 package into your docker image.

nmicra commented 1 year ago

@Craigacp thank you! The following workaround worked for me.

  1. Create your image FROM mcr.microsoft.com/openjdk/jdk:17-ubuntu
  2. RUN apt-get update && apt-get install -y libgomp1 && rm -rf /var/lib/apt/lists/*
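
For anyone who wants the container to fail fast instead of hitting these errors mid-training, here is a minimal sketch of a startup check covering the two problems from this thread (musl-based image, missing libgomp.so.1). The class NativeEnvCheck and its methods are hypothetical and not part of Tribuo or XGBoost4J. The directory list is an assumption about common x86_64 image layouts, and the musl check is a simple file-name heuristic, not the detection logic XGBoost4J itself uses.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

/** Hypothetical pre-flight check; not part of Tribuo or XGBoost4J. */
final class NativeEnvCheck {

    // Assumed shared-library locations; adjust for your base image and architecture.
    static final List<Path> LIB_DIRS = Arrays.asList(
            Paths.get("/lib"), Paths.get("/usr/lib"),
            Paths.get("/lib64"), Paths.get("/usr/lib64"),
            Paths.get("/lib/x86_64-linux-gnu"), Paths.get("/usr/lib/x86_64-linux-gnu"));

    /** True if the file name looks like the musl dynamic loader, e.g. ld-musl-x86_64.so.1. */
    public static boolean isMuslLoaderName(String fileName) {
        return fileName.startsWith("ld-musl-") && fileName.endsWith(".so.1");
    }

    /** Heuristic: treat the image as musl-based if any directory contains a musl loader. */
    public static boolean looksLikeMusl(List<Path> dirs) {
        for (Path dir : dirs) {
            if (!Files.isDirectory(dir)) {
                continue;
            }
            try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir, "ld-musl-*")) {
                for (Path entry : entries) {
                    if (isMuslLoaderName(entry.getFileName().toString())) {
                        return true;
                    }
                }
            } catch (IOException e) {
                // Unreadable directory: skip it and keep looking.
            }
        }
        return false;
    }

    /** True if a file with the given name exists in any of the directories. */
    public static boolean hasLibrary(List<Path> dirs, String libName) {
        for (Path dir : dirs) {
            if (Files.exists(dir.resolve(libName))) {
                return true;
            }
        }
        return false;
    }

    /** Maps the two findings to a short human-readable message. */
    public static String diagnose(boolean musl, boolean hasGomp) {
        if (musl) {
            return "musl libc detected: use a glibc-based image or compile XGBoost for musl.";
        }
        if (!hasGomp) {
            return "libgomp.so.1 not found: install the libgomp1 package (Debian/Ubuntu).";
        }
        return "OK: no musl loader found and libgomp.so.1 is present.";
    }

    public static void main(String[] args) {
        boolean musl = looksLikeMusl(LIB_DIRS);
        boolean gomp = hasLibrary(LIB_DIRS, "libgomp.so.1");
        System.out.println(diagnose(musl, gomp));
    }
}
```

Running the main method inside the image before training starts prints one of the three messages. A passing check does not guarantee that the XGBoost native library will load; it only rules out the two causes discussed above.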