jedisct1 / libsodium

A modern, portable, easy to use crypto library.
https://libsodium.org
Other
12.06k stars 1.72k forks source link

fix(aarch64): Move target #pragma after arm_neon.h include #1321

Closed tux3 closed 8 months ago

tux3 commented 8 months ago

Hello,

I'm submitting a fix for https://github.com/jedisct1/libsodium/issues/1314 that was root caused and suggested in https://github.com/android/ndk/issues/1945

Currently the #pragma attribute push is done before including arm_neon.h
In NDK r26, this can cause the attribute to override and apply to the functions inside arm_neon.h

I've build-tested this patch on desktop and mobile platforms

jedisct1 commented 8 months ago

I must confess that I don't follow the logic.

We need the neon and crypto definitions from that header, as they are used later.

Can you clarify why these features should be turned on after the header?

Also, the current build scripts, including in libsodium-stable, build libsodium fine on all Android targets. What issue exactly does that change address?

tux3 commented 8 months ago

I'm happy to admit I don't have a deep understanding of the issue either.

That being said, my understanding of the situation is that arm_neon.h is supposed to selectively enable the features it needs on a function-by-function basis, by applying the right __attribute__ on the right function, depending on what CPU features are used (some functions may require crypto, some only aes, or something else)

By putting the pragma before the include, the target attributes of the pragma will apply to every function defintion from that point on, even functions in the header that might want something different instead. I think the push is overwriting and interfering with attributes that the header should be setting (or not setting) on its own.

Can you clarify why these features should be turned on after the header?

I imagine that the rest of the file does need those attributes, since those libsodium functions seem to use crypto assembly instructions. So we still want to keep the pragma after the include, such that it applies to libsodium's own functions. The pragma is done instead of tagging each function very finely with what it needs like arm_neon.h does, but that's alright for this purpose.

Caveat emptor: I have no idea what I'm talking about, I'm making this up based on my reading of the linked issue!

Also, the current build scripts, including in libsodium-stable, build libsodium fine on all Android targets. What issue exactly does that change address?

I'm building through a Conan recipe that wraps around libsodium's build system.
The issue I see is the same error as in #1314:

error: always_inline function 'vget_high_p64' requires target feature 'crypto', but would be inlined into function 'vmull_high_p64' that is compiled without support for 'crypto'

I think this is because the pragma forces the inline one to require the crypto feature, but the caller explicitly asks for something else, so it does not require crypto. The always_inline makes this conflict.

I only observe the build failure with the recently released Android NDK r26. I'm not sure whether the current CI/build scripts in libsodium use the latest NDK. If they do then there is maybe a workaround we can do in the conan wrapper to make it do whatever libsodium-stable does (although if I've understood the issue correctly, which is not a given, I believe this is really a code issue and not a build system issue).

If the libsodium CI is still using an older NDK, it would build fine, since it seems previous versions of the header happen to not be affected by the issue (for reasons I didn't entirely take the time to understand).

jedisct1 commented 8 months ago

I don't understand how adding (not removing) features would interfere. That sounds counter-intuitive.

If it doesn't introduce any regression, let's do it, including in -stable

What happens with older versions of the NDK?

Also, Android is not the only aarch64 target.

My concern is that doing this might break other compilers such as gcc, or other targets (windows, linux, bsd, etc).

tux3 commented 8 months ago

I don't understand how adding (not removing) features would interfere. That sounds counter-intuitive.

I think I can explain that now.
Here's the definitions of these two functions:

// -------------------------------------------------------------------
// NOTE: This doesn't ask for any target feature
// Because it doesn't ask for anything, libsodium's pragma steps over it
// This means vget_high_p64 will require "crypto"
// -------------------------------------------------------------------
__ai poly64x1_t vget_high_p64(poly64x2_t __p0) {
  poly64x1_t __ret;
  __ret = __builtin_shufflevector(__p0, __p0, 1);
  return __ret;
}

It is always_include'd in this function:

// -------------------------------------------------------------------
// NOTE: The line noise has a __attribute__((target("aes"))) in there
// That has precedence over libsodium's pragma
// So this one DOES NOT require "crypto", it only requires "aes"
// -------------------------------------------------------------------
__ai __attribute__((target("aes"))) poly128_t vmull_high_p64(poly64x2_t __p0, poly64x2_t __p1) {
  poly128_t __ret;
  __ret = vmull_p64((poly64_t)(vget_high_p64(__p0)), (poly64_t)(vget_high_p64(__p1)));
  return __ret;
}

(These definition are pretty inscrutable, so I apologize for that)

Because the pragma will only add the feature to some of the functions (those that don't explicitly ask for something), there is trouble. We always_include something that was forced to require crypto into something that explicitly only required aes. The outer function does not declare that it wants crypto, but it always_includes something that we forced to require crypto, so that's a breach of the outer function's signature contract.

I'm confident there's no reason for the pragma to be before the include, the rationale being that each function should declare the features it needs. So arm_neon.h takes care of adding __attribute__((target())) where needed, and libsodium can do the same with a pragma for its own functions.

It's wrong for a library to apply attributes to someone else's header, so the pragma must be after.

What happens with older versions of the NDK?

I believe it works for r25 at least. I've not tested all earlier versions, though from the reasoning above the fix ought to be strictly less noisy. It messed with clang headers before and that happened to work, but with the fix I think we're strictly doing less funny business.

This would break something if an older NDK forgot to declare target features for some of its functions, and libsodium was happily stomping over clang to fix a clang bug. That's unlikely (and if it is, we'd still want to do that pragma selectively, only for hypothetical affected compilers that forgot to declare the features they need)

Also, Android is not the only aarch64 target.

I believe the fix applies to newer versions Clang, but should not be specific to Android. I predict other aarch64 targets would hit the same problem with sufficiently new versions of Clang (I've not checked)

My concern is that doing this might break other compilers such as gcc, or other targets (windows, linux, bsd, etc).

I've tested Windows, Linux, macOS, iOS, Android, generally on x86/x86_64/armv7/armv8/, except Win/Lin which where only x86{,_64). I probably won't have energy to test all other targets.

In the meantime I have the Conan recipe I use apply this patch as part of the build on top of 1.0.19. I'm fairly confident at this point the fix is correct and should be an improvement on all compilers, but I'm happy to wait or leave this pending further confirmation if necessary.

(note: edited several times, I can't help myself)