Open taiki-e opened 1 month ago
This would be a great check to have in CI, probably for all the types since I'm sure at least one long
vs. long long
is wrong wrong somewhere. Could just be an rmake test on platforms where we know we have Clang available, with a list of expected inaccuracies that we can correct over time.
On MSP430 (tier 3), both Clang and rustc don't match with the ABI's default char type...
Section 2.1 "Basic Types" in MSP430 Embedded Application Binary Interface says:
The char type is unsigned by default.
However clang doesn't set __CHAR_UNSIGNED__
:
$ clang -E -dM -x c /dev/null -target msp430-none-elf | grep __CHAR
#define __CHAR16_TYPE__ unsigned short
#define __CHAR32_TYPE__ unsigned int
#define __CHAR_BIT__ 8
Something tangentially related is that C compilers expose -funsigned-char
, which is used by e.g. the kernel. It would be nice if this could be set via a core
target feature to avoid casting when working with CStr
and friends.
Something tangentially related is that C compilers expose
-funsigned-char
, which is used by e.g. the kernel. It would be nice if this could be set via acore
target feature to avoid casting when working withCStr
and friends.
In Rust, it may be more complicated than “like C compilers”, given that c_char
in core
requires Rust 1.64, so there are custom c_char
definitions in various places (e.g., libc::c_char
), and also you need to rebuild core
to actually enable using target feature (even if libc is updated to reference that target feature, the build may fail with type mismatch if core
is not rebuilt).
I mistyped - I meant to say "Cargo feature" and not "target feature", as in something that would need to be used with build-std
.
As an next step of #122985 ("c_char on AIX should be u8"), I checked all builtin targets' c_char (as of nightly-2024-09-03) using the following script.
check_c_char.sh
```bash #!/usr/bin/env bash set -euo pipefail IFS=$'\n\t' export RUSTC="${RUSTC:-rustc}" CLANG="${CLANG:-clang}" "${RUSTC}" -vV >&2 echo >&2 "${CLANG}" --version >&2 echo >&2 cargo new --lib tmp cd tmp failed='' for target_spec in $("${RUSTC}" -Z unstable-options --print all-target-specs-json | jq -c '. | to_entries[]'); do eval "$(jq -r '@sh "RUST_TARGET=\(.key) LLVM_TARGET=\(.value."llvm-target")"' <<<"${target_spec}")" clang_defs=$("${CLANG}" -E -dM -x c /dev/null -target "${LLVM_TARGET}") clang_c_char=i8 if grep -Fq '__CHAR_UNSIGNED__ 1' <<<"${clang_defs}"; then clang_c_char=u8 fi cat >src/lib.rs <The result (stdout of the script) is:
stderr of the script (version info, etc.)
``` rustc 1.83.0-nightly (bd53aa3bf 2024-09-02) binary: rustc commit-hash: bd53aa3bf7a24a70d763182303bd75e5fc51a9af commit-date: 2024-09-02 host: aarch64-apple-darwin release: 1.83.0-nightly LLVM version: 19.1.0 Homebrew clang version 18.1.8 Target: arm64-apple-darwin23.5.0 Thread model: posix InstalledDir: /opt/homebrew/opt/llvm/bin Creating library `tmp` package note: see more `Cargo.toml` keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html clang: warning: unknown platform, assuming -mfloat-abi=soft clang: warning: unknown platform, assuming -mfloat-abi=soft clang: warning: unknown platform, assuming -mfloat-abi=soft clang: warning: no target microcontroller specified on command line, cannot link standard libraries, please pass -mmcu=Note about the above script (potential false positive / false negative)
- Note that targets that broken and cannot build core (such as m68k: https://github.com/rust-lang/rust/issues/89498) are ignored in the above script. (potential false negative) - The script passes llvm-target field of the target spec as a target for Clang, so if it does not match the Rust target, it may cause a false positive or false negative.As stated in the title, there are many
target_os = "none" target
and tier3 targets that do not match.target_os = "none" target
, most of them probably do not match with Clang's default.c_char
was originally defined instd
and this mismatch was missed when it was moved tocore
.c_char
directly on these targets, but this includes a lot of tier 2 targets, so changing this to match Clang could have a not small impact on embedded ecosystem.References:
char
data type is unsigned."https://gcc.gnu.org/wiki/avr-gcc#Type_Layout says char is signed
@rustbot label +A-abi +O-Arm +O-AArch64 +O-riscv +O-csky +O-bare-metal +O-android +O-illumos +O-hermit