rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
97.86k stars 12.67k forks source link

c_char signedness doesn't match with Clang's default on various no-std and tier 3 targets #129945

Open taiki-e opened 1 month ago

taiki-e commented 1 month ago

As an next step of #122985 ("c_char on AIX should be u8"), I checked all builtin targets' c_char (as of nightly-2024-09-03) using the following script.

check_c_char.sh ```bash #!/usr/bin/env bash set -euo pipefail IFS=$'\n\t' export RUSTC="${RUSTC:-rustc}" CLANG="${CLANG:-clang}" "${RUSTC}" -vV >&2 echo >&2 "${CLANG}" --version >&2 echo >&2 cargo new --lib tmp cd tmp failed='' for target_spec in $("${RUSTC}" -Z unstable-options --print all-target-specs-json | jq -c '. | to_entries[]'); do eval "$(jq -r '@sh "RUST_TARGET=\(.key) LLVM_TARGET=\(.value."llvm-target")"' <<<"${target_spec}")" clang_defs=$("${CLANG}" -E -dM -x c /dev/null -target "${LLVM_TARGET}") clang_c_char=i8 if grep -Fq '__CHAR_UNSIGNED__ 1' <<<"${clang_defs}"; then clang_c_char=u8 fi cat >src/lib.rs <&1 || :; } | grep -Fq 'error[E0308]'; then echo "${RUST_TARGET}: should be ${clang_c_char}" failed=1 fi done cd .. rm -rf tmp if [[ -n "${failed}" ]]; then exit 1 fi ```

The result (stdout of the script) is:

aarch64-kmc-solid_asp3: should be u8
aarch64-unknown-hermit: should be u8
aarch64-unknown-illumos: should be u8
aarch64-unknown-none: should be u8
aarch64-unknown-none-softfloat: should be u8
aarch64-unknown-redox: should be u8
aarch64-unknown-teeos: should be u8
aarch64-unknown-trusty: should be u8
armebv7r-none-eabi: should be u8
armebv7r-none-eabihf: should be u8
armv4t-none-eabi: should be u8
armv5te-none-eabi: should be u8
armv7-sony-vita-newlibeabihf: should be u8
armv7-unknown-trusty: should be u8
armv7a-kmc-solid_asp3-eabi: should be u8
armv7a-kmc-solid_asp3-eabihf: should be u8
armv7a-none-eabi: should be u8
armv7a-none-eabihf: should be u8
armv7r-none-eabi: should be u8
armv7r-none-eabihf: should be u8
armv8r-none-eabihf: should be u8
csky-unknown-linux-gnuabiv2: should be i8
csky-unknown-linux-gnuabiv2hf: should be i8
hexagon-unknown-none-elf: should be u8
riscv32i-unknown-none-elf: should be u8
riscv32im-risc0-zkvm-elf: should be u8
riscv32im-unknown-none-elf: should be u8
riscv32ima-unknown-none-elf: should be u8
riscv32imac-esp-espidf: should be u8
riscv32imac-unknown-none-elf: should be u8
riscv32imac-unknown-nuttx-elf: should be u8
riscv32imac-unknown-xous-elf: should be u8
riscv32imafc-esp-espidf: should be u8
riscv32imafc-unknown-none-elf: should be u8
riscv32imafc-unknown-nuttx-elf: should be u8
riscv32imc-esp-espidf: should be u8
riscv32imc-unknown-none-elf: should be u8
riscv32imc-unknown-nuttx-elf: should be u8
riscv64-linux-android: should be u8
riscv64gc-unknown-hermit: should be u8
riscv64gc-unknown-none-elf: should be u8
riscv64gc-unknown-nuttx-elf: should be u8
riscv64imac-unknown-none-elf: should be u8
riscv64imac-unknown-nuttx-elf: should be u8
thumbv4t-none-eabi: should be u8
thumbv5te-none-eabi: should be u8
thumbv6m-none-eabi: should be u8
thumbv6m-nuttx-eabi: should be u8
thumbv7em-none-eabi: should be u8
thumbv7em-none-eabihf: should be u8
thumbv7em-nuttx-eabi: should be u8
thumbv7em-nuttx-eabihf: should be u8
thumbv7m-none-eabi: should be u8
thumbv7m-nuttx-eabi: should be u8
thumbv8m.base-none-eabi: should be u8
thumbv8m.base-nuttx-eabi: should be u8
thumbv8m.main-none-eabi: should be u8
thumbv8m.main-none-eabihf: should be u8
thumbv8m.main-nuttx-eabi: should be u8
thumbv8m.main-nuttx-eabihf: should be u8
x86_64-unknown-l4re-uclibc: should be i8
stderr of the script (version info, etc.) ``` rustc 1.83.0-nightly (bd53aa3bf 2024-09-02) binary: rustc commit-hash: bd53aa3bf7a24a70d763182303bd75e5fc51a9af commit-date: 2024-09-02 host: aarch64-apple-darwin release: 1.83.0-nightly LLVM version: 19.1.0 Homebrew clang version 18.1.8 Target: arm64-apple-darwin23.5.0 Thread model: posix InstalledDir: /opt/homebrew/opt/llvm/bin Creating library `tmp` package note: see more `Cargo.toml` keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html clang: warning: unknown platform, assuming -mfloat-abi=soft clang: warning: unknown platform, assuming -mfloat-abi=soft clang: warning: unknown platform, assuming -mfloat-abi=soft clang: warning: no target microcontroller specified on command line, cannot link standard libraries, please pass -mmcu= [-Wavr-rtlib-linking-quirks] error: unknown target triple 'xtensa-none-unknown-elf' ```
Note about the above script (potential false positive / false negative) - Note that targets that broken and cannot build core (such as m68k: https://github.com/rust-lang/rust/issues/89498) are ignored in the above script. (potential false negative) - The script passes llvm-target field of the target spec as a target for Clang, so if it does not match the Rust target, it may cause a false positive or false negative.

As stated in the title, there are many target_os = "none" target and tier3 targets that do not match.

References:

@rustbot label +A-abi +O-Arm +O-AArch64 +O-riscv +O-csky +O-bare-metal +O-android +O-illumos +O-hermit

tgross35 commented 1 month ago

This would be a great check to have in CI, probably for all the types since I'm sure at least one long vs. long long is wrong wrong somewhere. Could just be an rmake test on platforms where we know we have Clang available, with a list of expected inaccuracies that we can correct over time.

taiki-e commented 2 weeks ago

On MSP430 (tier 3), both Clang and rustc don't match with the ABI's default char type...

Section 2.1 "Basic Types" in MSP430 Embedded Application Binary Interface says:

The char type is unsigned by default.

However clang doesn't set __CHAR_UNSIGNED__:

$ clang -E -dM -x c /dev/null -target msp430-none-elf | grep __CHAR             
#define __CHAR16_TYPE__ unsigned short
#define __CHAR32_TYPE__ unsigned int
#define __CHAR_BIT__ 8
tgross35 commented 2 weeks ago

Something tangentially related is that C compilers expose -funsigned-char, which is used by e.g. the kernel. It would be nice if this could be set via a core target feature to avoid casting when working with CStr and friends.

taiki-e commented 2 weeks ago

Something tangentially related is that C compilers expose -funsigned-char, which is used by e.g. the kernel. It would be nice if this could be set via a core target feature to avoid casting when working with CStr and friends.

In Rust, it may be more complicated than “like C compilers”, given that c_char in core requires Rust 1.64, so there are custom c_char definitions in various places (e.g., libc::c_char), and also you need to rebuild core to actually enable using target feature (even if libc is updated to reference that target feature, the build may fail with type mismatch if core is not rebuilt).

tgross35 commented 2 weeks ago

I mistyped - I meant to say "Cargo feature" and not "target feature", as in something that would need to be used with build-std.