jedisct1 / libsodium

A modern, portable, easy to use crypto library.
https://libsodium.org

Update OS used to build .NET package #1233

Closed bfren closed 1 year ago

bfren commented 1 year ago

Currently the build and test jobs for the .NET package run on Ubuntu 16.04. Now that Xenial is no longer receiving maintenance updates, this should change.

bfren commented 1 year ago

The issue is the run-test-binaries-os-versions job, which tests compatibility with CentOS 7 and Debian 10.

So far I have used the following for the build-linux-glibc-xx jobs:

Each time the error is:

Run chmod +x .libsodium-builds/linux-x64/Tests
  chmod +x .libsodium-builds/linux-x64/Tests
  .libsodium-builds/linux-x64/Tests
  shell: sh -e {0}
  env:
    DOTNET_SYSTEM_GLOBALIZATION_INVARIANT: 1
Unhandled exception. System.DllNotFoundException: Unable to load shared library 'libsodium' or one of its dependencies. In order to help diagnose loading problems, consider setting the LD_DEBUG environment variable: liblibsodium: cannot open shared object file: No such file or directory
   at Program.sodium_version_string()
   at Program.Main() in /__w/libsodium/libsodium/.libsodium-test/Program.cs:line 8
/__w/_temp/10cd1428-5f50-4743-9b21-01f1bbd2452a.sh: line 2:    39 Aborted                 (core dumped) .libsodium-builds/linux-x64/Tests
Error: Process completed with exit code 134.

However, using Debian 10 as the build platform (supported until 30 June 2024) makes the Debian 10 tests pass (all the other tests pass on all of the above), so the main problem is CentOS 7.

If CentOS 7 compatibility is important, is it worth considering a build process for CentOS 7 that is separate from the other platforms?

jedisct1 commented 1 year ago

I'd love to ditch CentOS 7 compatibility, but people are going to complain that it is still supported by the vendor.

ektrah commented 1 year ago

.NET unfortunately masks the actual error: It tries to load libsodium from a number of places and using different spellings (e.g., by adding a .so suffix and/or a lib prefix to the library name). If it cannot load and initialize the library, it throws a DllNotFoundException saying "liblibsodium: cannot open shared object file: No such file or directory", even if the library exists in one of the places and spellings.

The solution to finding the actual error is given in the exception message: "In order to help diagnose loading problems, consider setting the LD_DEBUG environment variable". My guess is that this will show an "Undefined reference to memcpy@GLIBC_2.14" error.
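The name probing described above can be pictured with a small sketch. This is a simplified model, not the runtime's actual code: the function name candidate_names is hypothetical, and the order (suffixed spellings first, then bare names) is an assumption based on the documented Linux probing behaviour.

```python
def candidate_names(name, prefix="lib", suffix=".so"):
    """Return the file names a loader might try for a logical library name.

    Simplified model: every combination of optional prefix and suffix,
    with the suffixed spellings tried first.
    """
    names = []
    for use_suffix in (True, False):
        for use_prefix in (False, True):
            candidate = (prefix if use_prefix else "") + name
            if use_suffix:
                candidate += suffix
            names.append(candidate)
    return names


if __name__ == "__main__":
    for n in candidate_names("libsodium"):
        print(n)
```

Under this model the last spelling tried is liblibsodium, which would explain why that name shows up in the exception even when libsodium.so exists but fails to load.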

ektrah commented 1 year ago

When compiling libsodium.so on ubuntu:22.04, loading that file fails on both centos:7 and debian:10 with the following error:

checking for version `GLIBC_2.3' in file /lib64/ld-linux-x86-64.so.2 [0] required by file /__w/libsodium/libsodium/.libsodium-builds/linux-x64/libsodium.so [0]
checking for version `GLIBC_2.33' in file /lib64/libc.so.6 [0] required by file /__w/libsodium/libsodium/.libsodium-builds/linux-x64/libsodium.so [0]
/lib64/libc.so.6: error: version lookup error: version `GLIBC_2.33' not found (required by /__w/libsodium/libsodium/.libsodium-builds/linux-x64/libsodium.so) (fatal)

This is independent of whether __asm__(".symver memcpy,memcpy@GLIBC_2.2.5"); is present or not.

When compiling on ubuntu:20.04 or ubuntu:18.04 and loading on centos:7:

checking for version `GLIBC_2.3' in file /lib64/ld-linux-x86-64.so.2 [0] required by file /__w/libsodium/libsodium/.libsodium-builds/linux-x64/libsodium.so [0]
checking for version `GLIBC_2.2.5' in file /lib64/libpthread.so.0 [0] required by file /__w/libsodium/libsodium/.libsodium-builds/linux-x64/libsodium.so [0]
checking for version `GLIBC_2.14' in file /lib64/libc.so.6 [0] required by file /__w/libsodium/libsodium/.libsodium-builds/linux-x64/libsodium.so [0]
checking for version `GLIBC_2.3.4' in file /lib64/libc.so.6 [0] required by file /__w/libsodium/libsodium/.libsodium-builds/linux-x64/libsodium.so [0]
checking for version `GLIBC_2.4' in file /lib64/libc.so.6 [0] required by file /__w/libsodium/libsodium/.libsodium-builds/linux-x64/libsodium.so [0]
checking for version `GLIBC_2.25' in file /lib64/libc.so.6 [0] required by file /__w/libsodium/libsodium/.libsodium-builds/linux-x64/libsodium.so [0]
/lib64/libc.so.6: error: version lookup error: version `GLIBC_2.25' not found (required by /__w/libsodium/libsodium/.libsodium-builds/linux-x64/libsodium.so) (fatal)

When compiling on ubuntu:16.04, everything works (without __asm__(".symver memcpy,memcpy@GLIBC_2.2.5");).

So it seems that, until CentOS 7 reaches end-of-life (or .NET drops support for it), we're stuck with compiling libsodium.so for .NET on ubuntu:16.04...
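To make the comparison above repeatable, here is a minimal sketch that pulls the highest GLIBC_x.y requirement out of LD_DEBUG output and checks it against a target glibc. The helper names required_glibc and loads_on are hypothetical, and the sample lines are abbreviated from the logs above (file paths shortened).

```python
import re

# Matches the "version `GLIBC_2.25'" fragments printed by the dynamic loader.
VERSION_RE = re.compile(r"version `GLIBC_([0-9.]+)'")


def required_glibc(ld_debug_text):
    """Return the highest GLIBC_x.y version mentioned, as a tuple of ints."""
    versions = {
        tuple(int(part) for part in match.group(1).split("."))
        for match in VERSION_RE.finditer(ld_debug_text)
    }
    return max(versions) if versions else None


def loads_on(required, available):
    """True if a library needing `required` can load on glibc `available`."""
    return required is None or required <= available


SAMPLE = """\
checking for version `GLIBC_2.14' in file /lib64/libc.so.6 [0] required by file libsodium.so [0]
checking for version `GLIBC_2.3.4' in file /lib64/libc.so.6 [0] required by file libsodium.so [0]
checking for version `GLIBC_2.25' in file /lib64/libc.so.6 [0] required by file libsodium.so [0]
"""

if __name__ == "__main__":
    need = required_glibc(SAMPLE)
    # CentOS 7 ships glibc 2.17, Debian 10 ships glibc 2.28 (see the table below).
    print(need, loads_on(need, (2, 17)), loads_on(need, (2, 28)))
```

Versions are compared as integer tuples rather than strings, so that 2.3.4 sorts below 2.25.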


glibc versions of distributions:

Distribution      glibc version   EOL
CentOS 7.7.1908   2.17            2024-06
Ubuntu 16.04      2.23            2026-04
Ubuntu 18.04      2.27            2028-04
Debian 10         2.28            2022-09
Ubuntu 20.04      2.31            2030-04
Debian 11         2.31            2024-07
Fedora 35         2.34            2022-12
Fedora 36         2.35            2023-05
Ubuntu 22.04      2.35            2032-04

libc compatibility of .NET itself:

  • x64: glibc 2.17 (from CentOS 7)
  • Arm32, Arm64: glibc 2.23 (from Ubuntu 16.04)
  • Alpine (x64 and Arm64): musl 1.2.2 (from Alpine 3.13)

jedisct1 commented 1 year ago

Another option is to compile with Zig, which can cross-compile to many targets, including specific libc versions.

Example:

zig build -Drelease-fast -Dtarget=x86_64-linux-gnu.2.17

jedisct1 commented 1 year ago

Looks like this is the best way to go.

dotnet-core.yml is now using zig to build Linux binaries.

Compilation can thus be done on current Linux distributions, while targeting older glibc versions. So, we target glibc 2.17 for x86_64 and glibc 2.23 for ARM, according to your table.
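As a rough illustration of the targeting described above, the sketch below assembles one zig build command line per platform. Only the x86_64 triple and the two flags are taken from this thread; the ARM triples, the platform keys other than linux-x64, and the helper name zig_build_command are assumptions, and this is not the content of dotnet-core.yml.

```python
# Platform name -> Zig target triple with a pinned glibc version.
# Only the x86_64 entry is quoted in the thread; the ARM entries are assumed.
TARGETS = {
    "linux-x64": "x86_64-linux-gnu.2.17",
    "linux-arm64": "aarch64-linux-gnu.2.23",
    "linux-arm": "arm-linux-gnueabihf.2.23",
}


def zig_build_command(target):
    """Return the argv list for a release build aimed at `target`."""
    return ["zig", "build", "-Drelease-fast", f"-Dtarget={target}"]


if __name__ == "__main__":
    for platform, target in TARGETS.items():
        print(platform, " ".join(zig_build_command(target)))
```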

Compilation to aarch64 and arm with musl was also fixed along the way (previously, the aarch64 build was actually an x86_64 one).

jedisct1 commented 1 year ago

Adding Windows ARM would probably also be nice.

zig build -Drelease-fast -Dtarget=aarch64-windows gives us a DLL and a .LIB for Windows ARM. But I guess this is impossible to test on GitHub CI, right?

bfren commented 1 year ago

Is 1.0.18.3 waiting on this?

jedisct1 commented 1 year ago

1.0.18.3 has been published :)

ektrah commented 1 year ago

Some unit tests in NSec that passed with 1.0.18.2 are now failing with 1.0.18.3 ☹️

It looks like something changed in or around crypto_aead_aes256gcm_decrypt?

ektrah commented 1 year ago

And it seems the pre-compiled binary built for linux-musl-arm isn't actually included in the NuGet package...

jedisct1 commented 1 year ago

Ah, looks like that has to be added to libsodium.pkgproj.

bfren commented 1 year ago

Ahh, that must also have got clobbered like the version bump I did.

bfren commented 1 year ago

> Some unit tests in NSec that passed with 1.0.18.2 are now failing with 1.0.18.3 ☹️
>
> It looks like something changed in or around crypto_aead_aes256gcm_decrypt?

Is this to do with 408125a72b5cbf0ccd9e478dae6b90f8737d3ee7?

jedisct1 commented 1 year ago

> > Some unit tests in NSec that passed with 1.0.18.2 are now failing with 1.0.18.3 ☹️ It looks like something changed in or around crypto_aead_aes256gcm_decrypt?
>
> Is this to do with 408125a?

Very unlikely. This code doesn't exist in stable.

jedisct1 commented 1 year ago

On a failed decryption, the message buffer is now filled with 0xd0 instead of 0x00 (in the tradition of OpenBSD), which makes it more obvious that something went wrong than all-zero content would.

The NSec test should check that the output buffer doesn't match the plaintext rather than expect a specific output on failure.
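A minimal sketch of that kind of check, using a stand-in instead of the real library: fake_decrypt is a hypothetical placeholder that only mimics the status code and the fill byte, and check_failed_decryption is an illustrative helper, not NSec or libsodium code.

```python
def fake_decrypt(ciphertext_ok, plaintext, fill=0xD0):
    """Stand-in for a decrypt call: returns (status, output buffer).

    On failure the buffer is filled with `fill`, mimicking the behaviour
    described above. No cryptography happens here.
    """
    if ciphertext_ok:
        return 0, bytes(plaintext)
    return -1, bytes([fill]) * len(plaintext)


def check_failed_decryption(status, output, plaintext):
    """Robust failure check: non-zero status and no plaintext in the output."""
    return status != 0 and output != plaintext


if __name__ == "__main__":
    message = b"attack at dawn"
    for fill in (0x00, 0xD0):
        status, output = fake_decrypt(False, message, fill)
        print(hex(fill), check_failed_decryption(status, output, message))
```

Written this way, the check holds for either fill byte, so it does not depend on what the library writes into the buffer on failure.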

jedisct1 commented 1 year ago

What do the other failing tests do?

ektrah commented 1 year ago

All three of the other failing tests are decrypting a valid ciphertext using crypto_aead_aes256gcm_decrypt. However:

Previously, in each case, the function returned zero; now it returns a non-zero value.

The last case puzzles me the most because it is the simplest possible test case: encrypt a plaintext and decrypt the resulting ciphertext with exactly the same key, nonce and additional data. How could that possibly fail? 😕

ektrah commented 1 year ago

The simplest possible test case fails if clen >= 256.

jedisct1 commented 1 year ago

Oops! That should be fixed by a5ea347381991c7c4c0ca9701428e53677c65f8a

Our test suite was missing tests with long inputs. This case is now tested.
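The idea of covering long inputs can be sketched as a round-trip loop over lengths that straddle the 256-byte boundary. This is not the test added to libsodium: toy_encrypt and toy_decrypt are an insecure stand-in built from the Python standard library (not AES-GCM), and failing_lengths is a hypothetical helper that would accept any encrypt/decrypt pair with the same shape.

```python
import hashlib
import hmac
import os

TAG_LEN = 32  # the toy tag is a SHA-256 HMAC


def toy_encrypt(key, nonce, ad, message):
    """Toy AEAD stand-in (NOT AES-GCM, NOT secure): XOR keystream + HMAC tag."""
    stream = hashlib.shake_256(key + nonce).digest(len(message))
    body = bytes(a ^ b for a, b in zip(message, stream))
    tag = hmac.new(key, nonce + ad + body, hashlib.sha256).digest()
    return body + tag


def toy_decrypt(key, nonce, ad, ciphertext):
    """Return the plaintext, or None if the tag does not verify."""
    body, tag = ciphertext[:-TAG_LEN], ciphertext[-TAG_LEN:]
    expected = hmac.new(key, nonce + ad + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    stream = hashlib.shake_256(key + nonce).digest(len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


def failing_lengths(encrypt, decrypt, lengths):
    """Round-trip a random message of each length; return lengths that fail."""
    key, nonce, ad = os.urandom(32), os.urandom(12), b"header"
    failures = []
    for n in lengths:
        message = os.urandom(n)
        if decrypt(key, nonce, ad, encrypt(key, nonce, ad, message)) != message:
            failures.append(n)
    return failures


if __name__ == "__main__":
    lengths = [0, 1, 16, 223, 224, 225, 255, 256, 257, 4096]
    print(failing_lengths(toy_encrypt, toy_decrypt, lengths))
```

Because the ciphertext includes the tag, message lengths just below 256 are worth including as well, so that the ciphertext length itself crosses the boundary.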