"attempt to use an ARM instruction on a Thumb-only processor" building for thumbv7em-none-eabi

jonas-schievink commented 5 years ago

I've been trying to get ring working on thumb targets, and hit this error when running cargo build --target thumbv7em-none-eabi:

(...many similar lines...)
/home/jonas/dev/ring/target/thumbv7em-none-eabi/debug/build/ring-7d11fc9ee76feeac/out/ghashv8-armx-linux32.S:236: Error: attempt to use an ARM instruction on a Thumb-only processor -- `vrev64.8 q0,q0'
/home/jonas/dev/ring/target/thumbv7em-none-eabi/debug/build/ring-7d11fc9ee76feeac/out/ghashv8-armx-linux32.S:238: Error: attempt to use an ARM instruction on a Thumb-only processor -- `vext.8 q0,q0,q0,#8'
/home/jonas/dev/ring/target/thumbv7em-none-eabi/debug/build/ring-7d11fc9ee76feeac/out/ghashv8-armx-linux32.S:239: Error: attempt to use an ARM instruction on a Thumb-only processor -- `vst1.64 {q0},[r0]'
/home/jonas/dev/ring/target/thumbv7em-none-eabi/debug/build/ring-7d11fc9ee76feeac/out/ghashv8-armx-linux32.S:241: Error: attempt to use an ARM instruction on a Thumb-only processor -- `vldmia sp!,{d8,d9,d10,d11,d12,d13,d14,d15}'
/home/jonas/dev/ring/target/thumbv7em-none-eabi/debug/build/ring-7d11fc9ee76feeac/out/ghashv8-armx-linux32.S:242: Error: attempt to use an ARM instruction on a Thumb-only processor -- `bx lr'
thread 'main' panicked at 'execution failed', build.rs:657:9
note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.

This could be due to the build script not detecting the architecture correctly, because as far as I can tell BoringSSL should work on thumb targets.

briansmith commented 5 years ago

cargo build --target thumbv7em-none-eabi:

What if you use cross to build for that target?

If you can't use cross then please share a script that installs all the necessary dependencies on Ubuntu.

jonas-schievink commented 5 years ago

cross isn't needed, the libcore for this target is built and distributed by Rust. You just need to run rustup target add thumbv7em-none-eabi.

briansmith commented 5 years ago

But, did you see if it works when cross is used? I think cross sets $CC and similar variables automatically in a way that avoids most problems. So, if it works in cross then we'd just need to figure out what cross does.

jonas-schievink commented 5 years ago

cross fails with the same error

JJJollyjim commented 5 years ago

I've been attempting to work through this (for the target thumbv7em-none-eabihf). I'm not sure if I've actually found a solution or not: I've run into a few other issues along the way:

The crate will only compile as no_std if the target is a unix or redox
A preprocessor issue which I resolved by using ARM's toolchain binaries rather than my disto's
Lots of files refer to libc::c_int and friends, which aren't present (the libc crate is empty on this family of targets...). I believe the standard approach here is to use https://github.com/japaric/cty?

JJJollyjim commented 5 years ago

There are also issues where cpu.rs's arm::Feature doesn't compile if you aren't in a hardcoded list of operating systems, and (understandably) there is no fill_impl for random number generation...

In any case it seems like support for bare-metal thumb targets might be trickier than just fixing the assembly issue...

briansmith commented 5 years ago

The crate will only compile as no_std if the target is a unix or redox

More generally, the crate won't build as no_std for any platform for which it is missing a CSPRNG adapter, as discussed in https://github.com/briansmith/ring/issues/744. If/when the ring::rand implementation for a particular platform is added, if that implementation doesn't depend on libstd, the std dependency for that platform will be removed. This will be clearer when #789 is fixed.

briansmith commented 5 years ago

Lots of files refer to libc::c_int and friends, which aren't present (the libc crate is empty on this family of targets...). I believe the standard approach here is to use https://github.com/japaric/cty?

Why not just implement the size types in the libc crate for that target? That's what the libc crate is for.

briansmith commented 5 years ago

There are also issues where cpu.rs's arm::Feature doesn't compile if you aren't in a hardcoded list of operating systems

There is no way to do feature detection without either knowing more about the target or asking the OS to do it. In your case, I imagine you don't want any runtime feature detection at all; instead you want to use the #[cfg(feature)] mechanism to detect whether NEON and friends are present at build time, right? The main question is "How do we know whether the user of ring wants to use runtime feature detection or static feature detection, from within Rust code?"

briansmith commented 5 years ago

"How do we know whether the user of ring wants to use runtime feature detection or static feature detection, from within Rust code?"

One solution that seems OK for now is to just say "For ARM (including AAarch64), if the OS is none then just do static feature detection and forgo any runtime feature detection. As long as the configuration is tested in ring's CI, I would take a PR that implements that. cpu.rs was designed to make that relatively easy to do.

briansmith commented 5 years ago

Lots of files refer to libc::c_int and friends, which aren't present (the libc crate is empty on this family of targets...). I believe the standard approach here is to use https://github.com/japaric/cty?

Why not just implement the size types in the libc crate for that target? That's what the libc crate is for.

Moved this part of the discussion to https://github.com/briansmith/ring/issues/744#issuecomment-499659196.

briansmith commented 5 years ago

I am doing some work to make ring with for wasm32-unknown-unknown and part of that was to make ring be #[no_std] all the time, with an alloc default feature gating use of the alloc crate and std gating access to libstd.

Removing the use of libc for such targets is the subject of issue #39. It looks to be very solvable for these Cortex-M CPUs without any third-party dependencies. The trickier sticking point is how to know generate random bytes for these targets; that should be its own issue.

Regarding the actual subject of this issue (as judged by the summary of the issue), the attempt to compile ARM instructions on thumb-only targets, I suggest we change build.rs or some other part of the build system to skip the problematic assembly files for thumb-only targets. But, how can we identify whether a target is thumb-only?

On top of that, I imagine for cortex-M we want to use static feature detection instead of runtime feature detection in cpu.rs. It seems doable. It would be nice if we didn't have to do it by enumerating every single thumb-only target_arch though.

jmgao commented 5 years ago

Regarding the actual subject of this issue (as judged by the summary of the issue), the attempt to compile ARM instructions on thumb-only targets, I suggest we change build.rs or some other part of the build system to skip the problematic assembly files for thumb-only targets. But, how can we identify whether a target is thumb-only?

It looks like the environment variable CARGO_CFG_TARGET_FEATURE will contain mclass in a comma separated list if it's a thumb-only target, but it actually seems like the problem is the inverse: we're compiling files that require ARMv8 (aesv8-armx.pl, ghashv8-armx.pl) or ARMv7+NEON (x25519-asm-arm.S) on all ARM platforms.

There are some improvements that can be made in the assembly files as well: they're not guarding their NEON implementations with __ARM_NEON__, so there are wasteful chunks of unexecutable code in several of the assembly files.

I'll try to carve out some time tonight or tomorrow to upload a patch here to improve build.rs's target detection and one to upstream BoringSSL to sprinkle #ifdef __ARM_NEON__s everywhere.

briansmith commented 5 years ago

we're compiling files that require ARMv8 (aesv8-armx.pl, ghashv8-armx.pl) or ARMv7+NEON (x25519-asm-arm.S) on all ARM platforms.

There are some improvements that can be made in the assembly files as well: they're not guarding their NEON implementations with __ARM_NEON__, so there are wasteful chunks of unexecutable code in several of the assembly files.

Consider somebody who wants to build an executable that runs on any ARMv6 system or later, and optionally use NEON if the CPU supports NEON, and optionally use AAarch32 instructions if the CPU supports AAarch32 instructions, using runtime feature detection. This is what ring allows today with the current scheme.

jmgao commented 5 years ago

Ah, right. Perhaps #if defined(__ARM_NEON__) || __ARM_ARCH_PROFILE != 'M', then? Presumably people targeting the embedded profile aren't going to be shipping generic binaries around.

briansmith commented 5 years ago

I think we should add a feature "static-feature-detection-only" that switches cpu.rs from using dynamic feature detection to static feature detection based on #[cfg(feature)]. Then build.rs can detect that feature has been activated and add a -DSTATIC_FEATURE_DETECTION_ONLY that will trigger the use of such #if based conditional compilation in the assembly code, maybe using __ARM_MAX_ARCH__ and friends.

jonathanpallant commented 4 years ago

So I've fallen across this issue attempting to compile https://github.com/cloudflare/quiche for thumbv8m.main-none-eabi. Some changes I've had to make to get as far as linker errors with undefined symbols include:

Add a define for OPENSSL_NO_ASM in build.rs - avoids "Can't execute ARM instructions in Thumb mode" compile error
Remove -Winline from build.rs - avoids "Inlining this function makes the code bigger" warnings, which become errors
Update to latest git version of cmake-rs and set c.pic(false) in build.rs (requires latest git version of cmake-rs) - avoids symbols going into the .got dynamic relocation section.
Patch rand.rs to handle `#[cfg(target_os = "none")] - I have a placeholder here while I work out how to generate random numbers on this platform

I'm guessing the first change I listed above (OPENSSL_NO_ASM) is the reason I'm getting undefined symbols for GFp_aes_nohw_encrypt and friends. Does ring have a non-ASM (i.e. C or Rust) version of these functions, and if not, should it?

briansmith commented 3 years ago

Add a define for OPENSSL_NO_ASM in build.rs - avoids "Can't execute ARM instructions in Thumb mode" compile error

IDK what the status of the Thumb support is for ring's assembly. I do think that at least some, maybe most, of the assembly is thumb compatible. Maybe some could be tweaked to become Thumb-compatbile. There might be some assembly we need to exclude on a case-by-case basis from Thumb builds, #ifdef style. If/when somebody gathers and contributes a list of non-Thumb-compatible assembly functions, we can make progress.

Remove -Winline from build.rs - avoids "Inlining this function makes the code bigger" warnings, which become errors

These warnings won't be considered errors if building from a packaged version of ring (from crates.io, or vendored from crates.io)

Update to latest git version of cmake-rs and set c.pic(false) in build.rs (requires latest git version of cmake-rs) - avoids symbols going into the .got dynamic relocation section.

IDK what cmake-rs has to do with it. But, it is probably the case that ring's build.rs needs to be changed w.r.t. PIC stuff. Should we just assume that cc-rs automatically does the right thing? If so, it would be great to have a citation to support that. Then we can remove any/all PIC-related stuff in build.rs.

I'm guessing the first change I listed above (OPENSSL_NO_ASM) is the reason I'm getting undefined symbols for GFp_aes_nohw_encrypt and friends. Does ring have a non-ASM (i.e. C or Rust) version of these functions, and if not, should it?

It does now, and has for a few months.

briansmith commented 3 years ago

We should add a feature "static-feature-detection-only" that switches cpu.rs from using dynamic feature detection to static feature detection based on #[cfg(feature)]

Cargo now sets CARGO_CFG_TARGET_FEATURE which I think we may be able to use in build.rs for this. And/or we just look at the target name?

In PR #1100 [edit: not 1101], I hard-coded the CPU feature set for ARM apple devices because at the time #[cfg(feature)] wasn't reporting the correct things. I think #[cfg(feature)] is now reporting the correct stuff for aarch64-apple-darwin targets. If so, we should be able to rewrite ARM Apple stuff in cpu.rs to use #[cfg(feature)] and then extend that to 32-bit ARM targets too.

In addition, I'm happy to move more dynamic CPU feature detection out of the assembly language and into the Rust code so that more static feature detection can be used.

But, also, recent ARM-targeting compilers do set various macros when a CPU feature is guaranteed to be provided. Look at how BoringSSL's include/openssl/cpu.h uses them. We should be able to use them in the assembly language files as well.

cameronelliott commented 2 years ago

Regarding the actual subject of this issue (as judged by the summary of the issue), the attempt to compile ARM instructions on thumb-only targets, I suggest we change build.rs or some other part of the build system to skip the problematic assembly files for thumb-only targets. But, how can we identify whether a target is thumb-only?

@briansmith I'm taking a peek at this, and was wondering about the best approach to achieve building on thumb targets.

I'm ignorant to much of Ring, but are you saying there are equivalent .c or .rs implementations for everything that is implemented in .S ? (I told you I'm ignorant to much of Ring :)

So, the right approach is to select a .c or .rs implementation for the thumb targets? (and avoid the failing .S build attempts) If that's true, I will start looking at what changes to build.rs or other files are needed to achieve that. Thanks

cameronelliott commented 2 years ago

@briansmith So, I'm learning. I wanted to see if I could get Ring to build for a thumb target. I managed to get it building with two changes:

Comment out the ARM specific entries in build.rs: " (&[ARM],....."
Write a bogus/dummy fill_impl()

With those changes, Ring compiles for my thumb target.

So the next two things will be:

figure how to tweak build.rs to avoid the ARM table entries on thumb targets (or expand the table selectors), but probably the first.
figure out a good/robust way to link-to/find a SRNG for my target triplet/quad.

I'm open to suggestions on how to do either or both.

Here is how I'm building: TARGET_CC=arm-none-eabi-gcc TARGET_AR=arm-none-eabi-gcc-ar cargo build --target=thumbv6m-none-eabi --release The hardware is basically the Raspberry Pico Pi or RP2040

cameronelliott commented 2 years ago

The RP2040 has a random bit generator that is not recommend for cryptography applications:

2.17.5. Random Number Generator If the system clocks are running from the XOSC and/or PLLs the ROSC can be used to generate random numbers. Simply enable the ROSC and read the RANDOMBIT register to get a 1-bit random number and read it n times to get an nbit value. This does not meet the requirements of randomness for security systems because it can be compromised, but it may be useful in less critical applications. If the cores are running from the ROSC

https://datasheets.raspberrypi.com/rp2040/rp2040-datasheet.pdf

https://docs.rs/rp2040-hal/latest/rp2040_hal/rosc/struct.RingOscillator.html#method.get_random_bit

briansmith commented 1 year ago

I'm ignorant to much of Ring, but are you saying there are equivalent .c or .rs implementations for everything that is implemented in .S ? (I told you I'm ignorant to much of Ring :)

There are C/Rust implementations of almost everything, but not absolutely everything.

So, the right approach is to select a .c or .rs implementation for the thumb targets?

I doubt that's the best approach. I believe upstream OpenSSL and probably BoringSSL have full support for Thumb targets in the assembly. It is hard to read the assembly code because it uses Perl (!) as a macro language, so all the assembly is embedded within Perl.

I suggest we start by stating the exact steps to reproduce, making no assumptions about any compilers being installed on the build machine or any environment variables being set. In particular, in order to cross-compile ring, you need a C compiler that can cross-compile for the target. So, let's see the steps to reproduce that involve installing, in some way, that cross-compiler, or using environment variables to override the compiler to use clang instead. Then let's see what errors we encounter when there is a working target C compiler/assembler.

jonathanpallant commented 1 year ago

If you haven't seen it, Bunnie has a pure-Rust version of ring (over in their Xous fork), allowing it to compile for RISC-V.

briansmith / ring

"attempt to use an ARM instruction on a Thumb-only processor" building for thumbv7em-none-eabi #813