NeuronRobotics / nrjavaserial

A Java Serial Port system. This is a fork of the RXTX project that uses in-jar loading of the native code.

Support for musl #215

Closed: mk868 closed this issue 3 years ago

mk868 commented 3 years ago

Hello,

I'm currently working on a project that will run on Alpine Linux. By default, Alpine Linux uses musl instead of the GNU C Library (glibc). From what I can see, the nrjavaserial Linux libraries were precompiled against glibc only.

Is it possible to add libs compiled with musl to nrjavaserial? From what I see, the following steps will be needed:

Thanks

MrDOS commented 3 years ago

That seems like a reasonable solution. “Add checking for musl” might be the tricky part; typically, the Java runtime is totally ignorant of which libc it is running on. Short of environment sniffing, I think the easiest way to do this would probably be to try to load the glibc version, and if that fails, try the musl version.
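A minimal sketch of that load-then-fall-back idea. The class name, method name, and file paths are hypothetical and are not NRJavaSerial's actual loader or resource layout; the sketch only assumes that `System.load` throws `UnsatisfiedLinkError` when a library cannot be loaded:

```java
import java.util.List;
import java.util.function.Consumer;

public class FallbackLoader {
    /**
     * Tries each candidate path in order with the given loader and returns
     * the first one that loads. Throws UnsatisfiedLinkError if none do.
     */
    public static String loadFirst(List<String> candidates, Consumer<String> loader) {
        UnsatisfiedLinkError last = null;
        for (String path : candidates) {
            try {
                loader.accept(path);
                return path; // first variant that loads wins
            } catch (UnsatisfiedLinkError e) {
                last = e; // remember the failure and try the next variant
            }
        }
        UnsatisfiedLinkError failure =
                new UnsatisfiedLinkError("no native variant could be loaded: " + candidates);
        if (last != null) {
            failure.initCause(last);
        }
        throw failure;
    }

    public static void main(String[] args) {
        // Hypothetical locations of already-extracted natives: glibc first, then musl.
        List<String> candidates = List.of(
                "/tmp/native/glibc/libNRJavaSerial.so",
                "/tmp/native/musl/libNRJavaSerial.so");
        try {
            System.out.println("Loaded " + loadFirst(candidates, System::load));
        } catch (UnsatisfiedLinkError e) {
            System.out.println("Could not load any variant: " + e.getMessage());
        }
    }
}
```

Trying glibc first keeps the behaviour unchanged for existing glibc users; the musl variant is only touched after the first load has already failed.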

Every native variant we add makes the library larger. I suspect most people using musl are doing so by way of Alpine, and are running on x86_64. I think I'd want to dip my toes into this pond by shipping just that variant, and then wait and see if anyone asks for anything else. (ARMv8, maybe?)

To tip this request on its head, an alternative approach might be to always use musl (or perhaps diet libc) and statically link it. I think that might inflate binary sizes too much, but it would be worth checking.

For right now, are you able to compile the native libraries yourself on a musl-based system to get on with your project? On your musl system, you can rebuild the 64-bit Linux natives by running make -C src/main/c/ linux64 from the root of the NRJavaSerial source tree.

mk868 commented 3 years ago

Adding a check for musl could be troublesome, especially if we want to be sure that we don't break anything that worked before.
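For illustration only, environment sniffing might look something like the sketch below. It assumes the musl dynamic loader lives at a path like the /lib/ld-musl-x86_64.so.1 that ldd reports on Alpine; the class and method names are made up. It also shows why this is fragile: a glibc system can have musl installed alongside, so finding the loader does not prove the JVM itself is running on musl.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class MuslSniffer {
    /**
     * Heuristic: returns true if the directory contains a file that looks
     * like a musl dynamic loader (ld-musl-<arch>.so.1).
     */
    public static boolean looksLikeMusl(Path libDir) {
        if (!Files.isDirectory(libDir)) {
            return false;
        }
        try (DirectoryStream<Path> loaders =
                     Files.newDirectoryStream(libDir, "ld-musl-*.so.1")) {
            return loaders.iterator().hasNext();
        } catch (IOException e) {
            return false; // unreadable directory: treat as "not detected"
        }
    }

    public static void main(String[] args) {
        System.out.println("musl loader present: " + looksLikeMusl(Paths.get("/lib")));
    }
}
```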

“try to load the glibc version, and if that fails, try the musl version.”

This might be the way. It's safe for existing applications; I'll test it.

I agree with you: x86_64 is probably the variant most developers want. If necessary, to support more exotic builds, we could publish an additional Maven package with extra native libs:

<dependency>
    <groupId>com.neuronrobotics</groupId>
    <artifactId>nrjavaserial-musl-for-all-platforms</artifactId>
</dependency>

On my side, I was able to compile the library for Alpine x64 without any problems. I then ran my Java app with the command java -DlibNRJavaSerial.userlib=/root/alpinebuild/libNRJavaSerial.so -jar app.jar to force it to use the Alpine-compatible library. The app started successfully, which looks promising.

mk868 commented 3 years ago

A little update from my side.

I opened this issue because my application using nrjavaserial was not working on Alpine (it threw an exception when trying to load the native library). I was using 5.2.1, the latest version on Maven Central. After some brief research, I guessed that the problem was its glibc dependencies:

alpine:~/compare# ldd nrjavaserial-5.2.1/native/linux/x86_64/libNRJavaSerial.so
        /lib/ld-musl-x86_64.so.1 (0x7f197ba9d000)
        libc.so.6 => /lib/ld-musl-x86_64.so.1 (0x7f197ba9d000)
Error relocating nrjavaserial-5.2.1/native/linux/x86_64/libNRJavaSerial.so: __strcat_chk: symbol not found
Error relocating nrjavaserial-5.2.1/native/linux/x86_64/libNRJavaSerial.so: __snprintf_chk: symbol not found
Error relocating nrjavaserial-5.2.1/native/linux/x86_64/libNRJavaSerial.so: __open_2: symbol not found
Error relocating nrjavaserial-5.2.1/native/linux/x86_64/libNRJavaSerial.so: __fdelt_chk: symbol not found
Error relocating nrjavaserial-5.2.1/native/linux/x86_64/libNRJavaSerial.so: __stpcpy_chk: symbol not found
Error relocating nrjavaserial-5.2.1/native/linux/x86_64/libNRJavaSerial.so: __strcpy_chk: symbol not found
Error relocating nrjavaserial-5.2.1/native/linux/x86_64/libNRJavaSerial.so: __printf_chk: symbol not found
Error relocating nrjavaserial-5.2.1/native/linux/x86_64/libNRJavaSerial.so: __sprintf_chk: symbol not found

Today I tested the native library straight from the master branch and, surprisingly, it works:

alpine:~/compare# ldd nrjavaserial-master/src/main/c/resources/native/linux/x86_64/libNRJavaSerial.so
        /lib/ld-musl-x86_64.so.1 (0x7f2269e69000)
        libc.so.6 => /lib/ld-musl-x86_64.so.1 (0x7f2269e69000)

As far as I understand, the libraries are now compiled without any glibc-specific dependencies, so they work fine with musl as well. Is my understanding correct?

MrDOS commented 3 years ago

Interesting. The big difference between the native libraries in master and v5.2.1 is that the new ones – post-#189 – are compiled with -U_FORTIFY_SOURCE. This was done to lower the required glibc version. Apparently it has the side effect of eliminating dependencies on all glibc symbols not also exported by musl. This wasn't intentional, and I couldn't have foreseen it, but I'm certainly not displeased by it!

If I look for implementation-specific entries in the symbol table:

$ git rev-parse --short HEAD
6c54eeb
$ sha256sum src/main/c/resources/native/linux/x86_64/libNRJavaSerial.so
14e7991aafe1d4a4c648dc003c7b8c0e2bd15bf8338368ddb8ea959fb6688ffc  src/main/c/resources/native/linux/x86_64/libNRJavaSerial.so
$ objdump -T src/main/c/resources/native/linux/x86_64/libNRJavaSerial.so | grep GLIBC | grep '_[A-Z_]'
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 __errno_location
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.7   __isoc99_fscanf
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 __lxstat
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.4   __stack_chk_fail
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 __xstat
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.7   __isoc99_sscanf
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 __fxstat
0000000000000000  w   DF *UND*  0000000000000000  GLIBC_2.2.5 __cxa_finalize

...I can find a result for each one in the musl source tree. But if I look for the symbols which are used by the native library in v5.2.1 and not master:

$ diff \
> <(objdump -T ~/Downloads/libNRJavaSerial-v5.2.1.so | awk '/GLIBC/ && /_[A-Z_]/ {print $(NF)}' | sort) \
> <(objdump -T src/main/c/resources/native/linux/x86_64/libNRJavaSerial.so | awk '/GLIBC/ && /_[A-Z_]/ {print $(NF)}' | sort)
3d2
< __fdelt_chk
8,11d6
< __open_2
< __printf_chk
< __snprintf_chk
< __sprintf_chk
13,15d7
< __stpcpy_chk
< __strcat_chk
< __strcpy_chk

...I can't find any of them.
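The same comparison can be sketched in Java, as a rough equivalent of the awk filter and diff above. The class and method names are hypothetical. The two sample lines are copied from the master objdump output; the v5.2.1 objdump lines themselves are not shown in this thread, so the extra names are taken from the diff output instead:

```java
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.regex.Pattern;

public class SymbolDiff {
    // Same filter as the awk expression: an underscore followed by A-Z or another underscore.
    private static final Pattern IMPLEMENTATION_NAME = Pattern.compile("_[A-Z_]");

    /** Extracts the last field of every objdump -T line that mentions GLIBC and matches the filter. */
    public static SortedSet<String> implementationSymbols(List<String> objdumpLines) {
        SortedSet<String> names = new TreeSet<>();
        for (String line : objdumpLines) {
            if (!line.contains("GLIBC") || !IMPLEMENTATION_NAME.matcher(line).find()) {
                continue;
            }
            String[] fields = line.trim().split("\\s+");
            names.add(fields[fields.length - 1]); // symbol name is the last column
        }
        return names;
    }

    /** Returns the names present in the first set but not in the second. */
    public static SortedSet<String> onlyInFirst(SortedSet<String> first, SortedSet<String> second) {
        SortedSet<String> result = new TreeSet<>(first);
        result.removeAll(second);
        return result;
    }

    public static void main(String[] args) {
        List<String> masterLines = List.of(
                "0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 __errno_location",
                "0000000000000000      DF *UND*  0000000000000000  GLIBC_2.4   __stack_chk_fail");
        SortedSet<String> master = implementationSymbols(masterLines);

        // Stand-in for v5.2.1: the master names plus two names reported by the diff.
        SortedSet<String> old = new TreeSet<>(master);
        old.addAll(List.of("__fdelt_chk", "__open_2"));

        System.out.println("Only in v5.2.1: " + onlyInFirst(old, master));
    }
}
```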

I'm not very familiar with musl, and the documentation I can find is nonspecific about binary compatibility with specific glibc versions; their FAQ (Ctrl+F “Is musl compatible with glibc?”) only goes as far as to say:

Binary compatibility is much more limited, but it will steadily increase with new versions of musl. At present, some glibc-linked shared libraries can be loaded with musl, but all but the simplest glibc-linked applications will fail if musl is dropped-in in place of /lib/ld-linux.so.2.

I'm trying to get more automated testing in place (starting with #182, with the intent of automatically running these tests on real hardware). Maybe a reasonable approach to musl compatibility is to just keep compiling as we are now, and add automation to confirm that the library remains loadable and functional in an Alpine container. Does that sound OK to you?
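As a rough idea of what the "remains loadable" half of that check could look like, here is a small smoke test that an Alpine-based CI job might run against the built library. The class name and command-line convention are assumptions, and it only verifies that the library loads, not that serial I/O works:

```java
public class LoadSmokeTest {
    /** Returns true if the native library at the given absolute path can be loaded. */
    public static boolean canLoad(String absolutePath) {
        try {
            System.load(absolutePath);
            return true;
        } catch (UnsatisfiedLinkError e) {
            // Missing symbols (such as the *_chk errors above) surface here.
            System.err.println("Load failed: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: LoadSmokeTest <absolute path to libNRJavaSerial.so>");
            System.exit(2);
        }
        // Exit code 0 on success and 1 on failure, so a CI step can fail the build.
        System.exit(canLoad(args[0]) ? 0 : 1);
    }
}
```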

mk868 commented 3 years ago

Yes, of course, that sounds good to me. With the "newer" compilation method, we get support for musl systems at no cost to .jar file size.

Well... I'm looking forward to version 5.3.0. Thank you so much for your support.

MrDOS commented 3 years ago

Sounds good!