Closed bfren closed 1 year ago
The issue is the `run-test-binaries-os-versions` job, which tests compatibility with CentOS 7 and Debian 10.
So far I have used the following for the `build-linux-glibc-xx` jobs:
Each time the error is:
Run chmod +x .libsodium-builds/linux-x64/Tests
chmod +x .libsodium-builds/linux-x64/Tests
.libsodium-builds/linux-x64/Tests
shell: sh -e {0}
env:
DOTNET_SYSTEM_GLOBALIZATION_INVARIANT: 1
Unhandled exception. System.DllNotFoundException: Unable to load shared library 'libsodium' or one of its dependencies. In order to help diagnose loading problems, consider setting the LD_DEBUG environment variable: liblibsodium: cannot open shared object file: No such file or directory
at Program.sodium_version_string()
at Program.Main() in /__w/libsodium/libsodium/.libsodium-test/Program.cs:line 8
/__w/_temp/10cd1428-5f50-4743-9b21-01f1bbd2452a.sh: line 2: 39 Aborted (core dumped) .libsodium-builds/linux-x64/Tests
Error: Process completed with exit code 134.
However, using Debian 10 as the build platform (supported until 30 June 2024) means the Debian 10 tests pass (all the other tests pass on all of the above), so the main problem is CentOS 7.
If CentOS 7 compatibility is important, is it worth considering a build process for CentOS 7 separate to other platforms?
I'd love to ditch CentOS 7 compatibility, but people are going to complain that it is still supported by the vendor.
.NET unfortunately masks the actual error: it tries to load `libsodium` from a number of places and using different spellings (e.g., by adding a `.so` suffix and/or a `lib` prefix to the library name). If it cannot load and initialize the library, it throws a `DllNotFoundException` saying "liblibsodium: cannot open shared object file: No such file or directory", even if the library exists in one of the places and spellings.
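To illustrate why the message mentions `liblibsodium`, here is a minimal sketch of the spellings tried for a library named `libsodium`. The `probe_names` helper is hypothetical, and the order shown is an assumption based on my understanding of how .NET probes on Linux, not something taken from the runtime source.

```shell
# Print the file names a loader might try for a given library name on Linux.
# Assumption: suffix variants first, each without and with the "lib" prefix.
probe_names() {
    name=$1
    printf '%s\n' "$name.so" "lib$name.so" "$name" "lib$name"
}

probe_names libsodium
```

If that order is right, the last spelling tried is `liblibsodium`, which would explain why that name ends up in the exception text even though `libsodium.so` exists.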
The solution to finding the actual error is given in the exception message: "In order to help diagnose loading problems, consider setting the LD_DEBUG environment variable". My guess is that this will show an "Undefined reference to memcpy@GLIBC_2.14" error.
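A possible way to follow that advice in the CI job, sketched in shell. `ld_debug_errors` is a hypothetical helper that only filters text. The binary path in the usage comment is the one from the log above, and `LD_DEBUG=libs,versions` assumes a glibc-based image.

```shell
# Keep only the lines of LD_DEBUG output (read from stdin) that report a failure.
ld_debug_errors() {
    grep -E 'error:|cannot open shared object file'
}

# Usage (LD_DEBUG writes to stderr, hence the 2>&1):
#   LD_DEBUG=libs,versions .libsodium-builds/linux-x64/Tests 2>&1 | ld_debug_errors
```

Without the filter the output is long, since the dynamic linker reports every library and symbol version it checks.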
When compiling `libsodium.so` on `ubuntu:22.04`, loading that file fails on both `centos:7` and `debian:10` with the following error:
checking for version `GLIBC_2.3' in file /lib64/ld-linux-x86-64.so.2 [0] required by file /__w/libsodium/libsodium/.libsodium-builds/linux-x64/libsodium.so [0]
checking for version `GLIBC_2.33' in file /lib64/libc.so.6 [0] required by file /__w/libsodium/libsodium/.libsodium-builds/linux-x64/libsodium.so [0]
/lib64/libc.so.6: error: version lookup error: version `GLIBC_2.33' not found (required by /__w/libsodium/libsodium/.libsodium-builds/linux-x64/libsodium.so) (fatal)
This is independent of whether `__asm__(".symver memcpy,memcpy@GLIBC_2.2.5");` is present or not.
When compiling on `ubuntu:20.04` or `ubuntu:18.04` and loading on `centos:7`:
checking for version `GLIBC_2.3' in file /lib64/ld-linux-x86-64.so.2 [0] required by file /__w/libsodium/libsodium/.libsodium-builds/linux-x64/libsodium.so [0]
checking for version `GLIBC_2.2.5' in file /lib64/libpthread.so.0 [0] required by file /__w/libsodium/libsodium/.libsodium-builds/linux-x64/libsodium.so [0]
checking for version `GLIBC_2.14' in file /lib64/libc.so.6 [0] required by file /__w/libsodium/libsodium/.libsodium-builds/linux-x64/libsodium.so [0]
checking for version `GLIBC_2.3.4' in file /lib64/libc.so.6 [0] required by file /__w/libsodium/libsodium/.libsodium-builds/linux-x64/libsodium.so [0]
checking for version `GLIBC_2.4' in file /lib64/libc.so.6 [0] required by file /__w/libsodium/libsodium/.libsodium-builds/linux-x64/libsodium.so [0]
checking for version `GLIBC_2.25' in file /lib64/libc.so.6 [0] required by file /__w/libsodium/libsodium/.libsodium-builds/linux-x64/libsodium.so [0]
/lib64/libc.so.6: error: version lookup error: version `GLIBC_2.25' not found (required by /__w/libsodium/libsodium/.libsodium-builds/linux-x64/libsodium.so) (fatal)
When compiling on `ubuntu:16.04`, everything works (without `__asm__(".symver memcpy,memcpy@GLIBC_2.2.5");`).
So it seems that, until CentOS 7 reaches end-of-life (or .NET drops support for it), we're stuck with compiling `libsodium.so` for .NET on `ubuntu:16.04`...
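The version lines in these logs can be reduced to a single number: the highest glibc symbol version the binary asks for. A rough sketch, assuming GNU `grep` and `sort -V` are available; `max_required_glibc` is a made-up helper name.

```shell
# Read text on stdin and print the highest GLIBC_x.y[.z] version mentioned in it.
max_required_glibc() {
    grep -oE 'GLIBC_[0-9]+(\.[0-9]+)*' | sed 's/^GLIBC_//' | sort -uV | tail -n 1
}

# Usage, assuming binutils is installed on the build machine:
#   objdump -T .libsodium-builds/linux-x64/libsodium.so | max_required_glibc
```

Running it against the library itself rather than against `LD_DEBUG` output keeps versions required by other loaded libraries out of the result.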
glibc versions of distributions:
Distribution | glibc version | EOL |
---|---|---|
CentOS 7.7.1908 | 2.17 | 2024-06 |
Ubuntu 16.04 | 2.23 | 2026-04 |
Ubuntu 18.04 | 2.27 | 2028-04 |
Debian 10 | 2.28 | 2022-09 |
Ubuntu 20.04 | 2.31 | 2030-04 |
Debian 11 | 2.31 | 2024-07 |
Fedora 35 | 2.34 | 2022-12 |
Fedora 36 | 2.35 | 2023-05 |
Ubuntu 22.04 | 2.35 | 2032-04 |
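With the table above, the check the loader effectively performs can be approximated as a version comparison. This is only a sketch: `loads_on` is a hypothetical helper, it relies on `sort -V` from GNU coreutils, and it ignores everything except the glibc version.

```shell
# Succeed if a binary requiring glibc $1 can load on a system providing glibc $2.
loads_on() {
    required=$1
    available=$2
    highest=$(printf '%s\n' "$required" "$available" | sort -V | tail -n 1)
    [ "$highest" = "$available" ]
}

# A build that needs GLIBC_2.25, checked against three rows of the table.
for row in "CentOS 7:2.17" "Debian 10:2.28" "Ubuntu 22.04:2.35"; do
    distro=${row%%:*}
    glibc=${row##*:}
    if loads_on 2.25 "$glibc"; then
        echo "$distro (glibc $glibc): should load"
    else
        echo "$distro (glibc $glibc): glibc too old"
    fi
done
```

This matches the logs earlier in the thread: a library requiring `GLIBC_2.25` is rejected on CentOS 7, while one that only needs versions up to 2.17 loads there.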
libc compatibility of .NET itself:
Another option is to compile with Zig, which can cross-compile to many targets, including specific libc versions:
Example:
zig build -Drelease-fast -Dtarget=x86_64-linux-gnu.2.17
Looks like this is the best way to go.
`dotnet-core.yml` is now using `zig` to build Linux binaries.
Compilation can thus be done on current Linux distributions, while targeting older glibc versions. So, we target glibc 2.17 for x86_64 and glibc 2.23 for ARM, according to your table.
Compilation to aarch64 and arm with musl was fixed by the way (previously, the aarch64 build was actually an x86_64 one).
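For illustration, the two glibc targets mentioned above could be expressed like this. This is not the content of `dotnet-core.yml`; the function name and the `aarch64` architecture name for the ARM target are assumptions, and the commands are only printed, not run.

```shell
# Print one `zig build` command per (arch, glibc) pair; nothing is executed.
# Pairs follow the comment above: glibc 2.17 for x86_64, 2.23 for ARM
# (written here as aarch64, which is an assumption).
zig_glibc_commands() {
    for pair in x86_64:2.17 aarch64:2.23; do
        arch=${pair%%:*}
        glibc=${pair##*:}
        echo "zig build -Drelease-fast -Dtarget=${arch}-linux-gnu.${glibc}"
    done
}

zig_glibc_commands
```

The glibc version is simply appended to the target triple after a dot, so changing the minimum supported distribution later means changing one number.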
Adding Windows ARM would probably also be nice.
`zig build -Drelease-fast -Dtarget=aarch64-windows` gives us a DLL and a `.LIB` for Windows ARM. But I guess this is impossible to test on GitHub CI, right?
Is 1.0.18.3 waiting on this?
1.0.18.3 has been published :)
Some unit tests in NSec that passed with 1.0.18.2 are now failing with 1.0.18.3 ☹️
It looks like something changed in or around `crypto_aead_aes256gcm_decrypt`?
And it seems the pre-compiled binary built for `linux-musl-arm` isn't actually included in the NuGet package...
Ah, looks like that has to be added to `libsodium.pkgproj`.
Ahh, that must also have got clobbered like the version bump I did.
> Some unit tests in NSec that passed with 1.0.18.2 are now failing with 1.0.18.3 ☹️
> It looks like something changed in or around `crypto_aead_aes256gcm_decrypt`?

Is this to do with 408125a72b5cbf0ccd9e478dae6b90f8737d3ee7?
Very unlikely. This code doesn't exist in `stable`.
On a failed decryption, the message buffer is now filled with `0xd0` instead of `0x00` (in the tradition of OpenBSD, which makes it more obvious that something went wrong than all-null content would).
The NSec test should check that the output buffer doesn't match the plaintext rather than expect a specific output on failure.
What do the other failing tests do?
All three of the other failing tests are decrypting a valid ciphertext using `crypto_aead_aes256gcm_decrypt`. However:
- In `DecryptWithAdOverlapping`, `m` (length `mlen`) and `ad` (length `adlen`) overlap in memory (`m < ad < ad+adlen < m+mlen`).
- In `DecryptWithSpanInPlace`, `m` and `c` point at the same memory location (`m == c`).
- In `DecryptWithSpanOutOfPlace`, none of the buffers overlap.

Previously, in each case, the function returned zero; now it returns a non-zero value.
The last case puzzles me the most because it is the simplest possible test case: encrypt a plaintext and decrypt the resulting ciphertext with exactly the same key, nonce and additional data. How could that possibly fail? 😕
The simplest possible test case fails if `clen >= 256`.
Oops! That should be fixed by a5ea347381991c7c4c0ca9701428e53677c65f8a
Our test suite was missing tests with long inputs. This case is now tested.
Currently the build and test jobs for the .NET package run on Ubuntu 16.04. Now that Xenial is no longer receiving maintenance updates, this should change.