Open RalfJung opened 1 year ago
Looks like clang will split
struct P {
int8_t a;
float b;
} __packed__;
into two arguments on rv64gc with i8
and float
respectively as types. Also I think clang will tell LLVM that the struct is packed if necessary on various targets. I also found this function: https://github.com/llvm/llvm-project/blob/d1487670ee1caaf1762c341f9f96cfb07540b5c1/clang/include/clang/CodeGen/CGFunctionInfo.h#L247-L250
into two arguments
So that's <{ i8, f32 }>
? Or really two separate arguments to the function -- like Rust's ScalarPair? Or does that not even make a difference for the ABI?
In this case it seemed like two separate args were emitted. AFAIK LLVM will internally recursively split structs into separate arguments anyway though if byval
is not used. At least that is what it does for Webassembly.
The #[repr(packed)]
attribute explicitly defines and constrains the memory layout for types, while the extern "C"
for functions requires the function's calling convention to be the same as in the C language. On some architectures with only 32-bit and 64-bit register widths, separating the passing of i8
and f32
can improve access performance. So, as long as parameter passing matches the C calling convention of the target architecture, it is correct. Isn't it?
The issue is with how we turn the value as represented in memory into an argument that can be passed in registers. Currently we assume that the value as represented in memory and the llvm type we use for argument passing have the same layout, which is not the case for aligned/packed structs as the llvm type has a different amount of padding from the real type.
The #[repr(packed)] attribute explicitly defines and constrains the memory layout for types, while the extern "C" for functions requires the function's calling convention to be the same as in the C language.
The issue is: when I declare a packed type in C, and a packed type in Rust, then they should be ABI-compatible. But currently they are not. If you take the type P
from my example, declare an equivalent type in C, have a C function that takes a struct P
as argument, and call it from Rust, then things will explode in the most amazing ways.
Thanks for your explanation. :heart: I tried the test cases you mentioned, and indeed there is an issue when accessing packed structure fields through memory addresses. ~If I didn't make any mistakes, is x86_64 also wrong?~
t.rs:
#[repr(packed, C)]
struct P {
i: i8,
f: f32,
}
#[no_mangle]
extern "C" fn test(p: P) {
let ip = std::ptr::addr_of!(p.i);
let fp = std::ptr::addr_of!(p.f);
let i = unsafe { std::ptr::read_unaligned(ip) };
let f = unsafe { std::ptr::read_unaligned(fp) };
println!("{:p} {:p} {} {}", ip, fp, i, f);
}
t.c:
struct P
{
char i;
float f;
} __attribute__((packed));
extern void test(struct P p);
int
main (int argc, char *argv[])
{
struct P p;
p.i = 16;
p.f = 128.0;
test(p);
return 0;
}
run:
rustc --crate-type=cdylib -o libt.so t.rs; gcc -o t t.c -L . -l t; LD_LIBRARY_PATH=`pwd` ./t
x86_64 with #[repr(packed)]
and __attribute__((packed))
:
0x7ffc31f02520 0x7ffc31f02521 16 128
x86_64 without #[repr(packed)]
and __attribute__((packed))
:
0x7ffc5a60eaa0 0x7ffc5a60eaa4 16 128
loongarch64 with #[repr(packed)]
and __attribute__((packed))
:
0x7ffffbf10720 0x7ffffbf10721 16 0.000000000000000000000000000000000000023509886
loongarch64 without #[repr(packed)]
and __attribute__((packed))
:
0x7ffffba12f90 0x7ffffba12f94 16 128
Interesting. x86_64 uses indirect pass mode for packed structs, so... that can't really be an ABI difference. Not sure what is happening. Do the field offsets match on both sides?
Interesting. x86_64 uses indirect pass mode for packed structs, so... that can't really be an ABI difference. Not sure what is happening. Do the field offsets match on both sides?
rustc think x86_64 uses indirect pass mode for packed structs, but actual not.
Pass mode cast:
args: [
ArgAbi {
layout: TyAndLayout {
ty: P,
layout: Layout {
size: Size(5 bytes),
align: AbiAndPrefAlign {
abi: Align(1 bytes),
pref: Align(1 bytes),
},
abi: Aggregate {
sized: true,
},
fields: Arbitrary {
offsets: [
Size(0 bytes),
Size(1 bytes),
],
memory_index: [
0,
1,
],
},
largest_niche: None,
variants: Single {
index: 0,
},
max_repr_align: None,
unadjusted_abi_align: Align(1 bytes),
},
},
mode: Indirect {
attrs: ArgAttributes {
regular: NoAlias | NoCapture | NonNull | NoUndef,
arg_ext: None,
pointee_size: Size(5 bytes),
pointee_align: Some(
Align(1 bytes),
),
},
extra_attrs: None,
on_stack: true,
},
},
],
Passed by integer register with packed layout:
Breakpoint 1, 0x00007ffff7f694b0 in test () from /home/hev/ws/libt.so
(gdb) i r rdi
rdi 0x43000000f7ffda10 4827858804701911568 { 0x43000000, <0xf7ffda>, 0x10 }
I can't read assembly so I can't quite figure out what that gdb snippet at the end tells me.
Could you open a new issue for the x86_64 problem? It's different in nature from this one. This issue leads to data loss even when passing packed values between two extern "C"
Rust functions (the ABI implemented by Rust doesn't even work for pure Rust programs); the x86_64 problem is an incompatibility with the C ABI (that only surfaces when linking Rust and C code).
For this type:
#[repr(packed, C)]
struct P {
i: i8,
f: f32,
}
#[no_mangle]
extern "C" fn test(p: P) {
let ip = std::ptr::addr_of!(p.i);
let fp = std::ptr::addr_of!(p.f);
println!("{:p} {:p}", ip, fp);
}
The callee creates a temporary variable on the stack with the same type and alignment as the parameter, called var1
. On LoongArch, the argument passed in is not same as rust type, it is not packed
. A temporary variable of the same rust type is also created, called var2
. The value of the parameter will be written directly to temporary var1
. Finally, use llvm.memcpy
to copy var1
to var2
.
The issue is that var1
and var2
have different field layout and total size. So it seems that for packed
case we need to copy field by field?
The first question is, what even is the C ABI for packed types on this target, how do they get passed? Is it indirect, or are the fields spread across registers, and if yes how?
Only after answering that question then can we try to figure out how to encode that ABI in rustc + LLVM.
The first question is, what even is the C ABI for packed types on this target, how do they get passed?
@RalfJung That's a good point.
For the current 'packed' instance, which consists of two fields (one integer and the other floating-point), there is a unique case to consider. In the context of the calling convention of LoongArch, it's important to note that integers are passed via GAR (General Argument Register) while floating-point values are passed via FAR (Floating-point Argument Register).
I'm not sure how other ISAs handle this. From a professional Rust perspective, what approach do you think would be more suitable to start with?
So, this passing in registers applies even for packed structs? Sadly that document you linked doesn't mention packed structs at all.
I think this means that we want to generate an LLVM function with type <{ i8, float }>
(the angle brackets indicate a packed struct in LLVM). But this would then rely on the LLVM backend logic passing such an argument in registers for this target -- that's code I know absolutely nothing about. Cc @bjorn3 @nikic
If this hypothesis is correct, then we need to extend rustc CastTy
to get a packed: bool
field, which needs to be used when the CastTy is turned into an LLVM type (that part is easy), and which needs to be set as appropriate by ABI adjustments (that part is hard).
However, even with that we get into trouble for more complicated types, such as
#[repr(packed)]
struct P(i8, f32);
#[repr(C)]
struct Arg(i8, i32, P);
The document seems to say that this is passed the same as #[repr(C)] (i8, i32, i8, f32)
in term of registers, but of course the padding is different. I have no idea how to express that in LLVM. Do we need nested types in CastTy
, like { i8, i32, <{ i8, float }> }
?
This is a concern even without packed; { i8, { i8, i16 } }
is not the same as { i8, i8, i16 }
. I'm not sure what the plan was with rustc's CastTy
here, how that is supposed to be able to map such nested cases.
Is the problematic case https://godbolt.org/z/jWEoncaP6? Specifically the fact that the load is used on a non-packed struct? (Packed / non-packed shouldn't matter for the call itself.)
I think we should generally avoid passing struct arguments on the LLVM IR level (only use structs for pair returns) -- I believe these will always get scalarized anyway, and just directly passing these in scalarized form is more amenable to optimization and makes the ABI more obvious.
I think we should generally avoid passing struct arguments on the LLVM IR level (only use structs for pair returns) -- I believe these will always get scalarized anyway, and just directly passing these in scalarized form is more amenable to optimization and makes the ABI more obvious.
So, you're saying if the ABI says "pass this struct in a float register and an int register", then we should generate two LLVM IR arguments, of type float
and i32
respectively?
That makes a lot of sense. However, that would be a significant departure from our current argument handling, and requires a fundamental refactor of CastTy
(and really the entire argument handling story, which is all quite entangled currently -- a lot of places assume they know how many LLVM arguments each Rust argument corresponds to).
So, you're saying if the ABI says "pass this struct in a float register and an int register", then we should generate two LLVM IR arguments, of type float and i32 respectively?
Using {float, i32}
as argument type should be enough for that, right?
It might be, but that's not helpful for this issue since we cannot transmute from the packed type to that struct type.
I was very surprised by CastTy
since it contains a bunch of registers but doesn't say how to splat the argument value across those registers. (Also see https://github.com/rust-lang/rust/pull/115654.) I think we need an alternative representation that's more like a list of "take N bytes at offset O and put this into a register of type T". Once we have that (maybe it should be PassMode::Spread
/PassMode::Split
instead of PassMode::Cast
then), we have multiple options for how to encode that to LLVM:
Those might be equivalent on the ABI side? I don't know. Either way we need to emit the appropriate code to deconstruct and reconstruct the Rust value on both ends.
For returns, I think we'll have to use struct types in LLVM function signatures anyway. So we might as well do it everywhere?
So, this passing in registers applies even for packed structs? Sadly that document you linked doesn't mention packed structs at
Sorry for late. The rules for structures also apply to "packed".
@nikic is there any documentation of how the LLVM ABI works when LLVM functions take structs as arguments (and return type)?
Looks like attributes such as zeroext
can only be applied to individual parameters, so if we need to set those for fields of structs passed in registers, we'll need to use separate arguments. But then what do we do with ScalarPair return values? There's a comment in the code currently that we do want to use LLVM pairs for that, I assume that leads to more efficient return value passing (via two registers, or so?).
@rustbot label I-unsound
WG-prioritization assigning priority (Zulip discussion).
@rustbot label -I-prioritize +P-medium
All of these ABIs have in common that they are looking at
layout.fields
to compute the ABI for the function, and they are all getting it wrong, in various different ways. (I previously reported bugs against all of these becauserepr(transparent)
is not properly respected; those bugs have the same cause -- relying onlayout.fields
-- but they are orthogonal.)For instance, this type
on riscv64 gets a PassMode::Cast of
which gets translated to the LLVM type
{ i8, f32 }
, and that's a non-packed LLVM struct. On a call,P
gets transmuted to that LLVM type, which obviously produces garbage since the fields are at different offsets.On sparc64, the type translates into
{ i32, f32 }
which is wrong as well.As another example, this type
on loongarch64 gets a PassMode::Cast of
which is the LLVM type
{ i32, f32 }
, and that's obviously not the right type forS2
.(Caveat: I can't actually run code on these targets, so there's a small chance that I am completely misinterpreting what these
PassMode
mean. I hope that's not the case...)I don't know what the C ABI on those platforms has to say about packed and aligned structs. Currently, we don't even provide ABIs a way to generate a packed LLVM struct for the arguments --
PassMode::Cast
always uses an un-packed LLVM struct.