lvh / caesium

Modern cryptography (libsodium/NaCl) for Clojure
Eclipse Public License 1.0

Need help with a JNI issue #81

Open schaueho opened 3 years ago

schaueho commented 3 years ago

Hi there, this is not a bug report but a request for help with a problem I have using caesium.

I use caesium 0.14 successfully in a small project on AMD64, on a Debian Buster x86 system with libsodium 1.0.17-1 installed. Now I would like to run the entire project under Docker on an ARM platform, a Raspberry Pi 4. The base image I use is arm32v7/adoptopenjdk:16-jdk-hotspot-focal, and I install libsodium23 and libsodium-dev, both at version 1.0.18-1. The problem is that I get segmentation faults as soon as my code tries to generate a hash. I can call my "other code" (i.e. code that does not use caesium) just fine, so this is not a general ARM/JDK issue. I would appreciate some help on how to proceed.

Here's the Clojure code, which is simple enough:

(defn hash-password [parameter]
  (timbre/debug "Creating hash for parameter")
  (u/hexify (hash/hash parameter)))

As I said above, that code works without problems on x86. But as soon as execution reaches this part on ARM, I get a crash:

api_1           | 2021-08-25T13:32:51.529Z 6c4feb75f9f3 DEBUG [api.db.core:11] - Creating hash for password
api_1           | #
api_1           | # A fatal error has been detected by the Java Runtime Environment:
api_1           | #
api_1           | #  SIGSEGV (0xb) at pc=0xff6a9c08, pid=1, tid=53
api_1           | #
api_1           | # JRE version: OpenJDK Runtime Environment AdoptOpenJDK-16.0.1+9 (16.0.1+9) (build 16.0.1+9)
api_1           | # Java VM: OpenJDK Server VM AdoptOpenJDK-16.0.1+9 (16.0.1+9, mixed mode, g1 gc, linux-arm)
api_1           | # Problematic frame:
api_1           | # C  [libc.so.6+0x63c08]

When I look into the generated logfile, I see the following:

Internal exceptions (20 events):
Event: 24.642 Thread 0xfe8130b8 Exception <a 'java/lang/UnsatisfiedLinkError'{0xcbfc5a58}: /lib/arm-linux-gnueabihf/libsodium.so: undefined symbol: crypto_core_ristretto255_scalar_is_canonical> (0xcbfc5a58)
thrown [./src/hotspot/share/prims/jni.cpp, line 600]
Event: 24.693 Thread 0xfe8130b8 Exception <a 'java/lang/UnsatisfiedLinkError'{0xcbfc8a18}: /lib/arm-linux-gnueabihf/libsodium.so: undefined symbol: crypto_core_ristretto255_scalar_is_canonical> (0xcbfc8a18)
thrown [./src/hotspot/share/prims/jni.cpp, line 600]

This looks to me like a version mismatch. What is weird, however, is that the code works fine with the older libsodium23 version (1.0.17-1) on my main x86 machine, despite issue #70, whereas the crashing setup on ARM actually uses 1.0.18-1.

Any idea?

lvh commented 3 years ago

Can you find the exact file being opened (ideally confirming it via syscall tracing) and list its symbols?

Maybe upgrading JNR would help?
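Where full syscall tracing is inconvenient inside a container, one alternative on Linux is to read the memory map of the running process and see which libsodium file is actually mapped. The following Python sketch is only an illustration of that idea, not part of caesium; the function names `mapped_libraries` and `mapped_libraries_of` are hypothetical, and it only reports a library once the process has loaded it.

```python
from pathlib import Path


def mapped_libraries(maps_text: str, needle: str) -> list[str]:
    """Return the unique file paths in /proc/<pid>/maps text that contain needle."""
    paths = []
    for line in maps_text.splitlines():
        # Each line is: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a backing file
        path = fields[5].strip()
        if needle in path and path not in paths:
            paths.append(path)
    return paths


def mapped_libraries_of(pid: int, needle: str = "libsodium") -> list[str]:
    """Read /proc/<pid>/maps (Linux only) and list the matching mapped files."""
    return mapped_libraries(Path(f"/proc/{pid}/maps").read_text(), needle)
```

The crash log in this thread reports pid=1 for the JVM, so inside that container the call would be `mapped_libraries_of(1)`, assuming the caller has permission to read that process's maps file.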

schaueho commented 3 years ago

Here are the "ristretto" related symbols I can find:

foo@4acc21ad40ef:/lib/arm-linux-gnueabihf$ nm -gD libsodium.so | grep ristretto
00028e80 T crypto_core_ristretto255_add
00028fb0 T crypto_core_ristretto255_bytes
00028f38 T crypto_core_ristretto255_from_hash
00028fb8 T crypto_core_ristretto255_hashbytes
00028e68 T crypto_core_ristretto255_is_valid_point
00028fb4 T crypto_core_ristretto255_nonreducedscalarbytes
00028f44 T crypto_core_ristretto255_random
00028fa0 T crypto_core_ristretto255_scalar_add
00028fbc T crypto_core_ristretto255_scalarbytes
00028f9c T crypto_core_ristretto255_scalar_complement
00028f94 T crypto_core_ristretto255_scalar_invert
00028fa8 T crypto_core_ristretto255_scalar_mul
00028f98 T crypto_core_ristretto255_scalar_negate
00028f90 T crypto_core_ristretto255_scalar_random
00028fac T crypto_core_ristretto255_scalar_reduce
00028fa4 T crypto_core_ristretto255_scalar_sub
00028edc T crypto_core_ristretto255_sub
0002a070 T crypto_scalarmult_ristretto255
0002a0d0 T crypto_scalarmult_ristretto255_base
0002a118 T crypto_scalarmult_ristretto255_bytes
0002a11c T crypto_scalarmult_ristretto255_scalarbytes

This file is actually a symlink:

foo@4acc21ad40ef:/lib/arm-linux-gnueabihf$ ls -al libsodium.so
lrwxrwxrwx 1 root root 19 Aug 18  2019 libsodium.so -> libsodium.so.23.3.0
foo@4acc21ad40ef:/lib/arm-linux-gnueabihf$ dpkg -S libsodium.so.23.3.0 
libsodium23:armhf: /usr/lib/arm-linux-gnueabihf/libsodium.so.23.3.0
foo@4acc21ad40ef:/lib/arm-linux-gnueabihf$ dpkg -l libsodium23:armhf | grep ^ii 
ii  libsodium23:armhf 1.0.18-1     armhf        Network communication, cryptography and signaturing library

I'll look into upgrading JNR, but currently this looks to me like the package really doesn't contain the required symbol.
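Scanning a long `nm` listing by eye, or with a substring `grep`, makes it easy to confuse similarly named symbols. A small Python sketch of an exact-name check over captured `nm -gD` output follows; the helper names `exported_symbols` and `missing_symbols` are hypothetical, and the parsing assumes the three-column "address type name" layout shown above.

```python
def exported_symbols(nm_output: str) -> set[str]:
    """Collect the names of defined symbols from `nm -gD` output."""
    names = set()
    for line in nm_output.splitlines():
        parts = line.split()
        # Defined symbols have three columns: address, type, name.
        # Undefined ones ("U name") have only two and are skipped.
        if len(parts) == 3 and parts[1] != "U":
            # Some binutils versions append "@@VERSION" to the name; drop it.
            names.add(parts[2].split("@", 1)[0])
    return names


def missing_symbols(nm_output: str, required: list[str]) -> list[str]:
    """Return the required names that the library does not export, in order."""
    exported = exported_symbols(nm_output)
    return [name for name in required if name not in exported]
```

Fed the listing above and the name from the UnsatisfiedLinkError, `missing_symbols(listing, ["crypto_core_ristretto255_scalar_is_canonical"])` would return that name if no line carries it exactly.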

lvh commented 3 years ago

OK, so: it looks like, annoyingly, the version number didn't get bumped when new symbols were added; there's just 1.0.18-"stable", which has a ton of functionality that not everything called "1.0.18" has. I think the proper way to handle this is feature detection?
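To make the idea of feature detection concrete: instead of trusting the version string, a binding can ask the loaded library whether each optional symbol exists and only expose the functions that do. caesium binds libsodium through JNR, so the Python `ctypes` sketch below is not caesium's code; it only illustrates the lookup, and the helper names `has_symbol` and `available_features` are hypothetical.

```python
import ctypes


def has_symbol(lib: ctypes.CDLL, name: str) -> bool:
    """True if the loaded library exports `name`.

    ctypes resolves attributes lazily through the dynamic loader and raises
    AttributeError for an unknown symbol, which hasattr turns into False.
    """
    return hasattr(lib, name)


def available_features(lib: ctypes.CDLL, optional: list[str]) -> dict[str, bool]:
    """Map each optional symbol name to whether this library build exports it."""
    return {name: has_symbol(lib, name) for name in optional}
```

Assuming the library loads as `ctypes.CDLL("libsodium.so")`, `available_features(lib, ["crypto_core_ristretto255_scalar_is_canonical"])` would show whether that particular build exports the symbol, regardless of what its package version says.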

schaueho commented 3 years ago

I'm getting lost here, because looking at the source files of libsodium in focal I see the function defined:

[pixie->libsodium]grep -r ristretto * | grep canonical
crypto_core/ed25519/ref10/ed25519_ref10.c:ristretto255_is_canonical(const unsigned char *s)
crypto_core/ed25519/ref10/ed25519_ref10.c:    if (ristretto255_is_canonical(s) == 0) {

My C-fu is totally rusty, but unless I'm mistaken, I don't see any reason in Makefile.am why this file would be excluded from compilation:

[pixie->libsodium]grep -B20 ed25519/ref10/ed25519_ref10.c Makefile.am 
lib_LTLIBRARIES = \
    libsodium.la

libsodium_la_SOURCES = \
    crypto_aead/chacha20poly1305/sodium/aead_chacha20poly1305.c \
    crypto_aead/xchacha20poly1305/sodium/aead_xchacha20poly1305.c \
    crypto_auth/crypto_auth.c \
    crypto_auth/hmacsha256/auth_hmacsha256.c \
    crypto_auth/hmacsha512/auth_hmacsha512.c \
    crypto_auth/hmacsha512256/auth_hmacsha512256.c \
    crypto_box/crypto_box.c \
    crypto_box/crypto_box_easy.c \
    crypto_box/crypto_box_seal.c \
    crypto_box/curve25519xsalsa20poly1305/box_curve25519xsalsa20poly1305.c \
    crypto_core/ed25519/ref10/ed25519_ref10.c \

But when I run nm on the distributed .so file, I don't see the symbol.

lvh commented 3 years ago

What happens if you build the package via apt-src?

schaueho commented 2 years ago

Sorry for not following up earlier, I was away on vacation. I just built the package via debian/rules build. When I then look at the symbols, I get the same result:

[pixie->libsodium-1.0.18]nm -gD ./build/src/libsodium/.libs/libsodium.so | grep ristretto
000000000002d170 T crypto_core_ristretto255_add
000000000002d3b0 T crypto_core_ristretto255_bytes
000000000002d2d0 T crypto_core_ristretto255_from_hash
000000000002d3d0 T crypto_core_ristretto255_hashbytes
000000000002d140 T crypto_core_ristretto255_is_valid_point
000000000002d3c0 T crypto_core_ristretto255_nonreducedscalarbytes
000000000002d2e0 T crypto_core_ristretto255_random
000000000002d370 T crypto_core_ristretto255_scalar_add
000000000002d360 T crypto_core_ristretto255_scalar_complement
000000000002d340 T crypto_core_ristretto255_scalar_invert
000000000002d390 T crypto_core_ristretto255_scalar_mul
000000000002d350 T crypto_core_ristretto255_scalar_negate
000000000002d330 T crypto_core_ristretto255_scalar_random
000000000002d3a0 T crypto_core_ristretto255_scalar_reduce
000000000002d380 T crypto_core_ristretto255_scalar_sub
000000000002d3e0 T crypto_core_ristretto255_scalarbytes
000000000002d220 T crypto_core_ristretto255_sub
000000000002ed60 T crypto_scalarmult_ristretto255
000000000002edf0 T crypto_scalarmult_ristretto255_base
000000000002ee50 T crypto_scalarmult_ristretto255_bytes
000000000002ee60 T crypto_scalarmult_ristretto255_scalarbytes

It looks like the source file crypto_core/ed25519/ref10/ed25519_ref10.c is getting compiled; at least I see a rule in the Makefile that compiles it into crypto_core/ed25519/ref10/libsodium_la-ed25519_ref10.lo, which I do find in the build directory. I guess that, for whatever reason, it just never gets linked into the .so file? I'm getting lost in the autoconf-generated Makefile, sorry.