Open tgross35 opened 6 months ago
To help increase awareness of these changes, I was also considering writing a short blog post or blurb in the release notes before 1.77. Not sure which is a better option here, if anything.
Verifying: does this mean u128/i128 are now compatible with LLVM/clang and GCC on Linux targets, and with MSVC on Windows -windows-msvc targets, and with MinGW on Windows -windows-gnu targets?
A mention in the release notes would be a good idea. And since this has been an issue for so long, I personally think it'd also be wonderful to have a dedicated blog post highlighting the new compatibility and the people who worked so hard to bring about that compatibility.
Nominating this for lang team discussion.
Verifying: does this mean u128/i128 are now compatible with LLVM/clang and GCC on Linux targets, and with MSVC on Windows -windows-msvc targets, and with MinGW on Windows -windows-gnu targets?
To the best of my knowledge, the compatibility matrix for x86 Linux looks like this:
compiler 1 | compiler 2 | status |
---|---|---|
any gcc | clang < 18 (current stable) | storage compatible with ABI bug |
any gcc | clang >= 18 (near stable) | fully compatible |
rustc < 1.77 | any gcc | incompatible |
rustc < 1.77 | any clang | incompatible |
rustc >= 1.77 with LLVM < 18 | any gcc | storage compatible with ABI bug |
rustc >= 1.77 with LLVM < 18 | any clang | storage compatible with ABI bug |
rustc >= 1.77 with LLVM >= 18 | gcc | fully compatible |
rustc >= 1.77 with LLVM >= 18 | clang < 18 | storage compatible with ABI bug |
rustc >= 1.77 with LLVM >= 18 | clang >= 18 | fully compatible |
rustc >= 1.77 with LLVM >= 18
will include rustc 1.78 with bundled LLVM.
The ABI bug comes from LLVM incorrectly splitting an i128
between a register and the stack, meaning it is a function parameter at an offset where that can happen - the fix for reference.
I don't know of any other architecture incompatibilities but also have no way of verifying that, though LLVM appears to do the right thing for other ABIs I could find. Can't seem to link to that comment directly, but if you search "It probably makes sense to have reasoning" on https://reviews.llvm.org/D86310, my sources list should pop up.
As far as I know, MSVC does not support 128-bit integers, so LLVM just assumes its windows targets use the same size/alignment as Linux. Have to imagine if they ever add support they would make it match, but I'll ping some Windows people on Zulip to confirm.
Given that rustc
can and does build against older LLVM, especially in the case of distros (I could easily see someone on Debian ending up with rustc-1.77+LLVM=16), do we think we could remove the lint, but add in codegen_llvm
a warn!
or similar about what's going on? We don't want the lint to be backend conditional, since the lint is at the language level, but if the user is about to walk into a known bug it seems rude to remove their notice.
Finally, I haven't analyzed this, but what's the status of i128
/u128
with codegen_gcc
or codegen_cranelift
? Do we need to add similar warnings there if we remove this lint? I'm guessing codegen_gcc
is fine, since gcc
worked at the beginning, but I don't know whether codegen_cranelift
has a compatible layout.
Given that
rustc
can and does build against older LLVM, especially in the case of distros (I could easily see someone on Debian ending up with rustc-1.77+LLVM=16), do we think we could remove the lint, but add incodegen_llvm
awarn!
or similar about what's going on? We don't want the lint to be backend conditional, since the lint is at the language level, but if the user is about to walk into a known bug it seems rude to remove their notice.
Even easier, we just fire this lint only if an i128
happens to be passed at a 40 byte offset where it would hit the spilling issue 😄
Jokes aside, a backend-dependent lint sounds feasible. My only question is the extent we want to go to warn about this bug, since it only shows up in very specific circumstances, and C-C compatibility is wonky anyways (if you have LLVM < 18 then you probably also have Clang < 18 and would suffer from this same issue there).
Finally, I haven't analyzed this, but what's the status of
i128
/u128
withcodegen_gcc
orcodegen_cranelift
? Do we need to add similar warnings there if we remove this lint? I'm guessingcodegen_gcc
is fine, sincegcc
worked at the beginning, but I don't know whethercodegen_cranelift
has a compatible layout.
I just tested locally with -Zcodegen-backend=cranelift
and it correctly reports 16 bytes. I don't have cg_gcc ready to go but @antoyo would know.
A quick check says it reported alignment of 8 at some point https://rust.godbolt.org/z/GMK4sqcK7, but I guess cg-gcc on godbolt hasn't been updated since August.
On the topic of platform-specifc invalid_ctypes, we need to figure out what to do for f16
and f128
too. We intend to be ABI-compatible on all platforms, but there are a handful of platforms that don't have a __float128
/_Float128
in C to compare against.
@joshtriplett did any discussion happen here yet?
We discussed this in today's @rust-lang/lang design meeting.
We came to a consensus among those present (@pnkfelix, @tmandry, @nikomatsakis, and myself) that it is OK to enable this conditional on having a fixed LLVM. Without setting a fully general policy here, we think it's OK to have the availability of this feature depend on your LLVM version.
We'd like to see some mechanism by which people get an error if they try to use this with a non-fixed LLVM. We didn't try to settle the question of the exact form the error should take. We do want to make sure it's hard to end up in this situation.
Some additional non-consensus speculation that some folks were supportive of: To a first approximation a hard error might be OK, but we may need to weaken that to a very very visible overridable error (not suppressed by cap-lints) that allows people to compile code on a Rust using old LLVM as long as they never touch the actual function that uses 128-bit types.
Pinging @scottmcm to ensure you have the opportunity to raise any objections you might have to this.
What's the current status of this? Has there been progress on the idea of permitting this conditional on Rust being built with a sufficiently new LLVM?
I haven't done anything with it yet, but will pick it up at some point if nobody beats me to it
As far as I know, MSVC does not support 128-bit integers, so LLVM just assumes its windows targets use the same size/alignment as Linux. Have to imagine if they ever add support they would make it match, but I'll ping some Windows people on Zulip to confirm.
It's probably worth pointing out that while the MSVC ABI does not officially support 128-bit integers, Clang targeting x86_64-pc-windows-msvc
does but unfortunately it returns 128-bit integers in xmm0
where as rustc uses caller-allocated memory.
#[no_mangle]
pub extern "C" fn extend(num: i16) -> i128 {
num as i128
}
Notice return type is void
and sret
first parameter:
define void @extend(ptr dead_on_unwind noalias nocapture noundef writable writeonly sret([16 x i8]) align 16 dereferenceable(16) %_0, i16 noundef signext %num) unnamed_addr {
start:
%0 = sext i16 %num to i128
store i128 %0, ptr %_0, align 16
ret void
}
extend:
mov rax, rcx
movsx rcx, dx
mov qword ptr [rax], rcx
sar rcx, 63
mov qword ptr [rax + 8], rcx
ret
__int128 extend(short num) {
return num;
}
Notice return type is i64
vector:
define dso_local noundef <2 x i64> @extend(i16 noundef %num) local_unnamed_addr {
entry:
%conv = sext i16 %num to i128
%0 = bitcast i128 %conv to <2 x i64>
ret <2 x i64> %0
}
Return in xmm0
:
extend: # @extend
movsx rax, cx
movq xmm0, rax
sar rax, 63
movq xmm1, rax
punpcklqdq xmm0, xmm1 # xmm0 = xmm0[0],xmm1[0]
ret
So we still are not ABI compatible with Clang on x86_64-pc-windows-msvc
.
@wesleywiser do you happen to know if there is a reason clang elected to use xmm? Assuming this wasn't a decision based on performance / register pressure, it seems like an odd choice to break from sysv.
https://github.com/cplusplus/papers/issues/1793 seems to be progressing so I wonder if we will get actual guidance from Microsoft in the near future.
For a while, Rust's 128-bit integer types have been incompatible with those from C. The original issue is here https://github.com/rust-lang/rust/issues/54341, with some more concise background information at the MCP here https://github.com/rust-lang/compiler-team/issues/683
The current Beta of 1.77 will have https://github.com/rust-lang/rust/pull/116672, which manually sets the alignment of
i128
to make it ABI-compliant with any version of LLVM (clang
does something similar now). 1.78 will have LLVM18 as the vendored version which fixes the source of this error.Proposal: now that we are ABI-compliant, do not raise
improper_ctypes
on our 128-bit integers. I did some testing with abi-cafe and a more isolated https://github.com/tgross35/quick-abi-check during the time https://reviews.llvm.org/D86310 was being worked on, and verified everything lines up. (It would be great to have some fork of abi-cafe in tree, but that is a separate discussion.)@joshtriplett mentioned that changing this lint needs a lang FCP https://rust-lang.zulipchat.com/#narrow/stream/187780-t-compiler.2Fwg-llvm/topic/LLVM.20alignment.20of.20i128/near/398422037. cc @maurer
Reference change from when I was testing https://github.com/rust-lang/rust/pull/113880/commits/c742908c4b9abde264b8c5e9663e31c649a47f2f
Edit: blog at https://blog.rust-lang.org/2024/03/30/i128-layout-update.html has more background information.