tweag / sparkle

Haskell on Apache Spark.
BSD 3-Clause "New" or "Revised" License

Libraries are sometimes not searched in the packaged jar. #138

Closed: facundominguez closed this issue 6 years ago

facundominguez commented 6 years ago

Sparkle currently asks users to link with -Wl,-rpath,$ORIGIN. As a result, all direct dependencies of the application are loaded from the folder that holds the executable and its libraries once they are unpacked, just before they are loaded into the JVM process.

Unfortunately, transitive dependencies that are not direct dependencies of the executable may not be found unless the libraries that need them also include $ORIGIN in their rpath.

I found this was the case with libltdl.so.7: it is a dependency of libodbc.so.2, and libodbc.so.2 doesn't have an rpath set.

A workaround is to make the non-direct dependencies direct, by including them when linking the executable. But ideally, the user shouldn't need to be aware of this problem.

One hacky option is to use patchelf to make sure $ORIGIN is always included in the rpath of every packaged library. I'm not sure patchelf is guaranteed to work for every ELF library out in the wild.
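A minimal sketch of the patchelf option, written in Python purely for illustration (it is not part of sparkle). The helper names rpath_with_origin and patch_library are hypothetical; the sketch assumes patchelf is on the PATH and relies only on its --print-rpath and --set-rpath flags.

```python
import subprocess

ORIGIN = "$ORIGIN"


def rpath_with_origin(current_rpath: str) -> str:
    """Return current_rpath with $ORIGIN prepended, unless it is already listed."""
    entries = [entry for entry in current_rpath.split(":") if entry]
    if ORIGIN in entries:
        return ":".join(entries)
    return ":".join([ORIGIN] + entries)


def patch_library(path: str) -> None:
    """Rewrite the rpath of one ELF library in place (assumes patchelf is installed)."""
    current = subprocess.run(
        ["patchelf", "--print-rpath", path],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    # Arguments are passed as a list, so no shell expands "$ORIGIN" by accident.
    subprocess.run(
        ["patchelf", "--set-rpath", rpath_with_origin(current), path],
        check=True,
    )
```

Existing rpath entries are kept so that a library that already relies on some other search path keeps working after patching.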

I think I'd like it best if we set LD_LIBRARY_PATH to the temporary folder where the libraries are extracted. Unfortunately, Java doesn't provide a way to set environment variables at runtime. We could try to do it from C, in a dedicated library that we load before trying to load the bulk of the application.

mboes commented 6 years ago

LD_LIBRARY_PATH sounds least painful, yes. If it works.

facundominguez commented 6 years ago

It seems changes to LD_LIBRARY_PATH don't influence System.load after the JVM has started.

An alternative to patchelf is to run ldd -v or objdump -p in sparkle package to get the dependency tree, and to store in the jar a listing of the dependencies in topological order. When we load the application, we use the listing to call System.load on each individual library.
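The ordering step can be sketched as a depth-first topological sort. This is only an illustration in Python, assuming the dependency tree has already been parsed into a dictionary; load_order and libapp.so are hypothetical names, while the libodbc.so.2 and libltdl.so.7 relationship is the one reported above.

```python
from typing import Dict, List


def load_order(deps: Dict[str, List[str]]) -> List[str]:
    """Return the libraries so that each one appears after all of its dependencies."""
    order: List[str] = []
    state: Dict[str, str] = {}  # library name -> "visiting" or "done"

    def visit(lib: str) -> None:
        if state.get(lib) == "done":
            return
        if state.get(lib) == "visiting":
            raise ValueError(f"dependency cycle through {lib}")
        state[lib] = "visiting"
        for dep in deps.get(lib, []):
            visit(dep)
        state[lib] = "done"
        order.append(lib)

    # Sorting the keys makes the resulting listing deterministic.
    for lib in sorted(deps):
        visit(lib)
    return order


deps = {
    "libapp.so": ["libodbc.so.2"],       # hypothetical application library
    "libodbc.so.2": ["libltdl.so.7"],
    "libltdl.so.7": [],
}
print(load_order(deps))  # ['libltdl.so.7', 'libodbc.so.2', 'libapp.so']
```

The loader would then walk this listing from first to last, so a library is only loaded once everything it needs is already in the process.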

facundominguez commented 6 years ago

Loading libraries individually doesn't work without further insight. Some of the libraries need to be loaded together, or errors about missing symbols are raised. The libraries load fine when the dynamic linker (ld.so) loads them while loading the main executable.

One other thing that we could try is to have sparkle package create a dummy shared library which links all of the libraries that are packaged in the jar. Then, in order to load the application, we just load this dummy library.

This would make sparkle package depend on gcc, but perhaps this is still better than trying to patch the libraries.
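A rough sketch of the link command that sparkle package could run for the dummy library, in Python for illustration only. The function name dummy_link_command and the output name libsparkle-deps.so are hypothetical. The --no-as-needed flag is an assumption on my part: some distributions make gcc pass --as-needed by default, which would drop libraries from which the (empty) dummy library uses no symbols.

```python
from typing import List


def dummy_link_command(libs: List[str], output: str = "libsparkle-deps.so") -> List[str]:
    """Build a gcc command that links an empty shared library against every packaged library."""
    return [
        "gcc", "-shared", "-o", output,
        "-Wl,--no-as-needed",   # keep a DT_NEEDED entry for each library, even if unused
        "-Wl,-rpath,$ORIGIN",   # look for the libraries next to the dummy library
        *libs,
    ]


cmd = dummy_link_command(["libodbc.so.2", "libltdl.so.7"])
print(" ".join(cmd))
```

Loading the resulting library with a single System.load call would leave the dynamic linker to resolve all of the packaged libraries together, which is the behaviour that loading them one by one did not reproduce.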

mboes commented 6 years ago

> One other thing that we could try is to have sparkle package create a dummy shared library which links all of the libraries that are packaged in the jar.

I like that idea. Creative thinking!