rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
96.91k stars 12.52k forks source link

The sparc64, riscv64, loongarch64 `extern "C" fn` ABIs are all wrong when aligned/packed structs are involved #115609

Open RalfJung opened 1 year ago

RalfJung commented 1 year ago

All of these ABIs have in common that they are looking at layout.fields to compute the ABI for the function, and they are all getting it wrong, in various different ways. (I previously reported bugs against all of these because repr(transparent) is not properly respected; those bugs have the same cause -- relying on layout.fields -- but they are orthogonal.)

For instance, this type

#[repr(packed)]
struct P(i8, f32);

on riscv64 gets a PassMode::Cast of

                           prefix: [
                               Some(
                                   Reg {
                                       kind: Integer,
                                       size: Size(1 bytes),
                                   },
                               ),
                               None,
                               None,
                               None,
                               None,
                               None,
                               None,
                               None,
                           ],
                           rest: Uniform {
                               unit: Reg {
                                   kind: Float,
                                   size: Size(4 bytes),
                               },
                               total: Size(4 bytes),
                           },

which gets translated to the LLVM type { i8, f32 }, and that's a non-packed LLVM struct. On a call, P gets transmuted to that LLVM type, which obviously produces garbage since the fields are at different offsets.

On sparc64, the type translates into { i32, f32 } which is wrong as well.

As another example, this type

#[repr(align(8))]
struct Aligned8Int(i32);

#[repr(C)]
struct S2(Aligned8Int, f32);

on loongarch64 gets a PassMode::Cast of

                           prefix: [
                               Some(
                                   Reg {
                                       kind: Integer,
                                       size: Size(4 bytes),
                                   },
                               ),
                               None,
                               None,
                               None,
                               None,
                               None,
                               None,
                               None,
                           ],
                           rest: Uniform {
                               unit: Reg {
                                   kind: Float,
                                   size: Size(4 bytes),
                               },
                               total: Size(4 bytes),
                           },

which is the LLVM type { i32, f32 }, and that's obviously not the right type for S2.

(Caveat: I can't actually run code on these targets, so there's a small chance that I am completely misinterpreting what these PassMode mean. I hope that's not the case...)

I don't know what the C ABI on those platforms has to say about packed and aligned structs. Currently, we don't even provide ABIs a way to generate a packed LLVM struct for the arguments -- PassMode::Cast always uses an un-packed LLVM struct.

bjorn3 commented 1 year ago

Looks like clang will split

struct P {
    int8_t a;
    float b;
} __packed__;

into two arguments on rv64gc with i8 and float respectively as types. Also I think clang will tell LLVM that the struct is packed if necessary on various targets. I also found this function: https://github.com/llvm/llvm-project/blob/d1487670ee1caaf1762c341f9f96cfb07540b5c1/clang/include/clang/CodeGen/CGFunctionInfo.h#L247-L250

RalfJung commented 1 year ago

into two arguments

So that's <{ i8, f32 }>? Or really two separate arguments to the function -- like Rust's ScalarPair? Or does that not even make a difference for the ABI?

bjorn3 commented 1 year ago

In this case it seemed like two separate args were emitted. AFAIK LLVM will internally recursively split structs into separate arguments anyway though if byval is not used. At least that is what it does for Webassembly.

heiher commented 1 year ago

The #[repr(packed)] attribute explicitly defines and constrains the memory layout for types, while the extern "C" for functions requires the function's calling convention to be the same as in the C language. On some architectures with only 32-bit and 64-bit register widths, separating the passing of i8 and f32 can improve access performance. So, as long as parameter passing matches the C calling convention of the target architecture, it is correct. Isn't it?

bjorn3 commented 1 year ago

The issue is with how we turn the value as represented in memory into an argument that can be passed in registers. Currently we assume that the value as represented in memory and the llvm type we use for argument passing have the same layout, which is not the case for aligned/packed structs as the llvm type has a different amount of padding from the real type.

RalfJung commented 1 year ago

The #[repr(packed)] attribute explicitly defines and constrains the memory layout for types, while the extern "C" for functions requires the function's calling convention to be the same as in the C language.

The issue is: when I declare a packed type in C, and a packed type in Rust, then they should be ABI-compatible. But currently they are not. If you take the type P from my example, declare an equivalent type in C, have a C function that takes a struct P as argument, and call it from Rust, then things will explode in the most amazing ways.

heiher commented 1 year ago

Thanks for your explanation. :heart: I tried the test cases you mentioned, and indeed there is an issue when accessing packed structure fields through memory addresses. ~If I didn't make any mistakes, is x86_64 also wrong?~

t.rs:

#[repr(packed, C)]
struct P {
    i: i8,
    f: f32,
}

#[no_mangle]
extern "C" fn test(p: P) {
    let ip = std::ptr::addr_of!(p.i);
    let fp = std::ptr::addr_of!(p.f);
    let i = unsafe { std::ptr::read_unaligned(ip) };
    let f = unsafe { std::ptr::read_unaligned(fp) };
    println!("{:p} {:p} {} {}", ip, fp, i, f);
}

t.c:

struct P
{
    char i;
    float f;
} __attribute__((packed));

extern void test(struct P p);

int
main (int argc, char *argv[])
{
    struct P p;

    p.i = 16;
    p.f = 128.0;

    test(p);

    return 0;
}

run:

rustc --crate-type=cdylib -o libt.so t.rs; gcc -o t t.c -L . -l t; LD_LIBRARY_PATH=`pwd` ./t

x86_64 with #[repr(packed)] and __attribute__((packed)):

0x7ffc31f02520 0x7ffc31f02521 16 128

x86_64 without #[repr(packed)] and __attribute__((packed)):

0x7ffc5a60eaa0 0x7ffc5a60eaa4 16 128

loongarch64 with #[repr(packed)] and __attribute__((packed)):

0x7ffffbf10720 0x7ffffbf10721 16 0.000000000000000000000000000000000000023509886

loongarch64 without #[repr(packed)] and __attribute__((packed)):

0x7ffffba12f90 0x7ffffba12f94 16 128
RalfJung commented 1 year ago

Interesting. x86_64 uses indirect pass mode for packed structs, so... that can't really be an ABI difference. Not sure what is happening. Do the field offsets match on both sides?

heiher commented 1 year ago

Interesting. x86_64 uses indirect pass mode for packed structs, so... that can't really be an ABI difference. Not sure what is happening. Do the field offsets match on both sides?

rustc think x86_64 uses indirect pass mode for packed structs, but actual not.

Pass mode cast:

           args: [
               ArgAbi {
                   layout: TyAndLayout {
                       ty: P,
                       layout: Layout {
                           size: Size(5 bytes),
                           align: AbiAndPrefAlign {
                               abi: Align(1 bytes),
                               pref: Align(1 bytes),
                           },
                           abi: Aggregate {
                               sized: true,
                           },
                           fields: Arbitrary {
                               offsets: [
                                   Size(0 bytes),
                                   Size(1 bytes),
                               ],
                               memory_index: [
                                   0,
                                   1,
                               ],
                           },
                           largest_niche: None,
                           variants: Single {
                               index: 0,
                           },
                           max_repr_align: None,
                           unadjusted_abi_align: Align(1 bytes),
                       },
                   },
                   mode: Indirect {
                       attrs: ArgAttributes {
                           regular: NoAlias | NoCapture | NonNull | NoUndef,
                           arg_ext: None,
                           pointee_size: Size(5 bytes),
                           pointee_align: Some(
                               Align(1 bytes),
                           ),
                       },
                       extra_attrs: None,
                       on_stack: true,
                   },
               },
           ],

Passed by integer register with packed layout:

Breakpoint 1, 0x00007ffff7f694b0 in test () from /home/hev/ws/libt.so
(gdb) i r rdi
rdi            0x43000000f7ffda10  4827858804701911568 { 0x43000000, <0xf7ffda>, 0x10 }
RalfJung commented 1 year ago

I can't read assembly so I can't quite figure out what that gdb snippet at the end tells me. Could you open a new issue for the x86_64 problem? It's different in nature from this one. This issue leads to data loss even when passing packed values between two extern "C" Rust functions (the ABI implemented by Rust doesn't even work for pure Rust programs); the x86_64 problem is an incompatibility with the C ABI (that only surfaces when linking Rust and C code).

heiher commented 1 year ago

For this type:

#[repr(packed, C)]
struct P {
    i: i8,
    f: f32,
}

#[no_mangle]
extern "C" fn test(p: P) {
    let ip = std::ptr::addr_of!(p.i);
    let fp = std::ptr::addr_of!(p.f);
    println!("{:p} {:p}", ip, fp);
}

The callee creates a temporary variable on the stack with the same type and alignment as the parameter, called var1. On LoongArch, the argument passed in is not same as rust type, it is not packed. A temporary variable of the same rust type is also created, called var2. The value of the parameter will be written directly to temporary var1. Finally, use llvm.memcpy to copy var1 to var2.

The issue is that var1 and var2 have different field layout and total size. So it seems that for packed case we need to copy field by field?

https://github.com/rust-lang/rust/blob/cc7a9d69729c9710c01cb70b864e664659626674/compiler/rustc_codegen_llvm/src/abi.rs#L201-L259

RalfJung commented 1 year ago

The first question is, what even is the C ABI for packed types on this target, how do they get passed? Is it indirect, or are the fields spread across registers, and if yes how?

Only after answering that question then can we try to figure out how to encode that ABI in rustc + LLVM.

heiher commented 1 year ago

The first question is, what even is the C ABI for packed types on this target, how do they get passed?

@RalfJung That's a good point.

For the current 'packed' instance, which consists of two fields (one integer and the other floating-point), there is a unique case to consider. In the context of the calling convention of LoongArch, it's important to note that integers are passed via GAR (General Argument Register) while floating-point values are passed via FAR (Floating-point Argument Register).

I'm not sure how other ISAs handle this. From a professional Rust perspective, what approach do you think would be more suitable to start with?

RalfJung commented 1 year ago

So, this passing in registers applies even for packed structs? Sadly that document you linked doesn't mention packed structs at all.

I think this means that we want to generate an LLVM function with type <{ i8, float }> (the angle brackets indicate a packed struct in LLVM). But this would then rely on the LLVM backend logic passing such an argument in registers for this target -- that's code I know absolutely nothing about. Cc @bjorn3 @nikic

If this hypothesis is correct, then we need to extend rustc CastTy to get a packed: bool field, which needs to be used when the CastTy is turned into an LLVM type (that part is easy), and which needs to be set as appropriate by ABI adjustments (that part is hard).

However, even with that we get into trouble for more complicated types, such as

#[repr(packed)]
struct P(i8, f32);

#[repr(C)]
struct Arg(i8, i32, P);

The document seems to say that this is passed the same as #[repr(C)] (i8, i32, i8, f32) in term of registers, but of course the padding is different. I have no idea how to express that in LLVM. Do we need nested types in CastTy, like { i8, i32, <{ i8, float }> }?

This is a concern even without packed; { i8, { i8, i16 } } is not the same as { i8, i8, i16 }. I'm not sure what the plan was with rustc's CastTy here, how that is supposed to be able to map such nested cases.

nikic commented 1 year ago

Is the problematic case https://godbolt.org/z/jWEoncaP6? Specifically the fact that the load is used on a non-packed struct? (Packed / non-packed shouldn't matter for the call itself.)

I think we should generally avoid passing struct arguments on the LLVM IR level (only use structs for pair returns) -- I believe these will always get scalarized anyway, and just directly passing these in scalarized form is more amenable to optimization and makes the ABI more obvious.

RalfJung commented 1 year ago

I think we should generally avoid passing struct arguments on the LLVM IR level (only use structs for pair returns) -- I believe these will always get scalarized anyway, and just directly passing these in scalarized form is more amenable to optimization and makes the ABI more obvious.

So, you're saying if the ABI says "pass this struct in a float register and an int register", then we should generate two LLVM IR arguments, of type float and i32 respectively?

That makes a lot of sense. However, that would be a significant departure from our current argument handling, and requires a fundamental refactor of CastTy (and really the entire argument handling story, which is all quite entangled currently -- a lot of places assume they know how many LLVM arguments each Rust argument corresponds to).

bjorn3 commented 1 year ago

So, you're saying if the ABI says "pass this struct in a float register and an int register", then we should generate two LLVM IR arguments, of type float and i32 respectively?

Using {float, i32} as argument type should be enough for that, right?

RalfJung commented 1 year ago

It might be, but that's not helpful for this issue since we cannot transmute from the packed type to that struct type.

I was very surprised by CastTy since it contains a bunch of registers but doesn't say how to splat the argument value across those registers. (Also see https://github.com/rust-lang/rust/pull/115654.) I think we need an alternative representation that's more like a list of "take N bytes at offset O and put this into a register of type T". Once we have that (maybe it should be PassMode::Spread/PassMode::Split instead of PassMode::Cast then), we have multiple options for how to encode that to LLVM:

Those might be equivalent on the ABI side? I don't know. Either way we need to emit the appropriate code to deconstruct and reconstruct the Rust value on both ends.

For returns, I think we'll have to use struct types in LLVM function signatures anyway. So we might as well do it everywhere?

heiher commented 1 year ago

So, this passing in registers applies even for packed structs? Sadly that document you linked doesn't mention packed structs at

Sorry for late. The rules for structures also apply to "packed".

RalfJung commented 1 year ago

@nikic is there any documentation of how the LLVM ABI works when LLVM functions take structs as arguments (and return type)?

Looks like attributes such as zeroext can only be applied to individual parameters, so if we need to set those for fields of structs passed in registers, we'll need to use separate arguments. But then what do we do with ScalarPair return values? There's a comment in the code currently that we do want to use LLVM pairs for that, I assume that leads to more efficient return value passing (via two registers, or so?).

Jules-Bertholet commented 11 months ago

@rustbot label I-unsound

apiraino commented 11 months ago

WG-prioritization assigning priority (Zulip discussion).

@rustbot label -I-prioritize +P-medium