rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org

Tracking issue for RFC 2137: Support defining C-compatible variadic functions in Rust #44930

Open aturon opened 6 years ago

aturon commented 6 years ago

This is a tracking issue for the RFC "Support defining C-compatible variadic functions in Rust" (rust-lang/rfcs#2137).

Steps:

Unresolved questions:

charmoniumQ commented 5 months ago

We're talking about slightly different things.

| | Function we are defining | Function we are calling |
|---|---|---|
| My request | variadic (e.g., `printf_wrapper`) | variadic (e.g., `printf`) |
| Your request | variadic (e.g., `printf_wrapper`) | not variadic (e.g., `vprintf`) |

In GCC, forwarding variadic functions to a variadic function can be done using __builtin_apply. I don't think there is a Rust equivalent of this.

pervognsen commented 5 months ago

> We're talking about slightly different things.
>
> | | Function we are defining | Function we are calling |
> |---|---|---|
> | My request | variadic (e.g., `printf_wrapper`) | variadic (e.g., `printf`) |
> | Your request | variadic (e.g., `printf_wrapper`) | not variadic (e.g., `vprintf`) |
>
> In GCC, forwarding variadic functions to a variadic function can be done using __builtin_apply. I don't think there is a Rust equivalent of this.

Ah. It doesn't even look like clang supports __builtin_apply, which does not make me hopeful there would be a 1:1 equivalent in LLVM. Since the '...' in 'args: ...' currently stands for a platform/ABI-specific opaque VaListImpl, you can get 'naive forwarding' to pass Rust's type checks and it misleadingly passes basic tests as long as you stay within the ABI argument limit that stays in registers, but it's revealed as incorrect when you exercise a longer argument list: https://play.rust-lang.org/?version=nightly&mode=release&edition=2021&gist=3f7e586b0d35e50626cf290097731d6a. I'm assuming you got at least this far in your own investigation; I'm just leaving this here in case anyone else comes across this thread.

bjorn3 commented 3 months ago

Something to note in the unresolved questions:

libcore currently hard codes the layout of va_list that llvm uses for each individual target. This probably won't work with other backends. How should we handle the target and backend dependent VaListImpl layout?

lolbinarycat commented 3 months ago

> Bikeshed: the ellipsis in `mut args: ...` has no precedent anywhere else in the language. The syntax could be something more reminiscent of pattern matching, like `mut args @ ...`

untrue, it is already used when declaring extern variadic functions.

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=724901e4535cd7c5981de0eff52a978f
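
For readers who cannot open the playground link, here is a minimal sketch of what declaring and calling an existing C variadic function looks like on stable Rust. It is not the code from the gist. It assumes the platform's C library provides `snprintf`, and `format_int` is a hypothetical helper name introduced only for illustration.

```rust
use std::ffi::{c_char, c_int, CStr};

unsafe extern "C" {
    // The trailing `...` marks the C variadic tail. It is part of the
    // declaration syntax here, not a type.
    unsafe fn snprintf(buf: *mut c_char, size: usize, fmt: *const c_char, ...) -> c_int;
}

/// Hypothetical helper: formats one integer through C's `snprintf`.
fn format_int(value: c_int) -> String {
    let mut buf = [0u8; 32];
    let fmt = b"%d\0";
    // SAFETY: the buffer length is passed along, the format string is
    // NUL-terminated, and the single variadic argument matches `%d`.
    let written = unsafe {
        snprintf(
            buf.as_mut_ptr().cast::<c_char>(),
            buf.len(),
            fmt.as_ptr().cast::<c_char>(),
            value,
        )
    };
    assert!(written >= 0);
    CStr::from_bytes_until_nul(&buf)
        .unwrap()
        .to_str()
        .unwrap()
        .to_owned()
}

fn main() {
    println!("{}", format_int(42));
}
```

This only covers calling a variadic function that C already defines; defining one in Rust is what the RFC tracked here adds.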

bjorn3 commented 3 months ago

For declaring variadic functions the `...` is not in the type position like with `mut foo: ...`; instead it is a special case only valid inside function declarations.

lolbinarycat commented 3 months ago

presumably the syntax for defining variadic functions would also be a special case.

regardless, my point is that it does have precedent, and indeed is already associated with variadics.

beetrees commented 3 months ago

> Something to note in the unresolved questions:
>
> libcore currently hard codes the layout of va_list that llvm uses for each individual target. This probably won't work with other backends. How should we handle the target and backend dependent VaListImpl layout?

@bjorn3 The layout of va_list is part of the C ABI for a target, so it doesn't vary by backend. For instance, on x86-64 Unix, it is defined to always be:

typedef struct {
    unsigned int gp_offset;
    unsigned int fp_offset;
    void *overflow_arg_area;
    void *reg_save_area;
} va_list[1];

in the ABI specification.
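
As a rough illustration of what hard-coding this layout means on the Rust side, the struct above can be mirrored with `#[repr(C)]`. This is only a sketch: `SysVVaListElem` is a hypothetical name, not the type libcore actually uses, and the field names are copied from the C definition quoted above.

```rust
use std::ffi::c_void;
use std::mem::{align_of, size_of};

/// Illustrative mirror of one element of the x86-64 System V `va_list`.
/// `SysVVaListElem` is a hypothetical name, not a libcore type.
#[repr(C)]
#[allow(dead_code)]
struct SysVVaListElem {
    gp_offset: u32,
    fp_offset: u32,
    overflow_arg_area: *mut c_void,
    reg_save_area: *mut c_void,
}

fn main() {
    // Two 32-bit offsets followed by two pointers, with no padding between.
    println!(
        "size = {}, align = {}",
        size_of::<SysVVaListElem>(),
        align_of::<SysVVaListElem>()
    );
}
```

On an x86-64 target this reports a size of 24 bytes and an alignment of 8, matching the C struct field for field.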

bjorn3 commented 3 months ago

For C-SKY the abi specification doesn't seem to dictate any specific layout for va_list. It suggests a possible implementation for va_start and va_arg in section 2.2.4, but does not say that this is the only valid implementation. And the Hexagon abi specification doesn't even mention va_list, va_start or va_arg. It only explains the calling convention in section 5.2.

You are right that many other abi specifications do explicitly mention the layout. I wasn't aware of this.

References:

C-SKY: https://github.com/c-sky/csky-doc/blob/master/C-SKY_V2_CPU_Applications_Binary_Interface_Standards_Manual.pdf

Hexagon: https://lists.llvm.org/pipermail/llvm-dev/attachments/20190916/21516a52/attachment-0001.pdf

beetrees commented 3 months ago

I see what you mean. In the cases of C-SKY and Hexagon, this seems to be a case of the ABI specification under-specifying: as va_list can be passed between C functions (such as when calling vprintf), compilers that can link to each other's code (e.g. clang and gcc) must agree on the representation of va_list.

workingjubilee commented 2 months ago

> On May 28, 2019 2:34:10 PM PDT, Alexander Regueiro @.***> wrote:
>
> > > As far as I understand, the thing is that f32 is being promoted to a double (f64) when passed as a vararg. Therefore there's no actual possibility to retrieve an f32 value from a va_list because of the promotion.
> >
> > This is what I understood too. In that case, there's no possibility of retrieving anything other than a C int, unsigned int, or double... so why the impls for the other types?
>
> Precisely because those impls are supposed to handle the promotion behavior correctly.

I do not believe this should be done. The Rust compiler does not even correctly check passing varargs currently when generic types become involved, see https://github.com/rust-lang/rust/issues/61275 about that. When combined with accepting these types, this would make it very difficult to reason correctly about the type conversions involved, because we would be adding in our own special rules on top of the already Byzantine rules for C's variable arguments. This would make it much harder to correctly add support for passing e.g. structs over variadic arguments, down the line[^0]. I don't believe our current implementations for variadic arguments are adequately or thoroughly tested. We have had a bad habit of not testing many edge-cases that could potentially get around naive checks.

In other words, if you extract a different type from the va_list than the types that are passed into the variable arguments (barring some very specific equivalencies that are preserved between types of the same size), it's UB in C. It should be UB in Rust, too.

I have found way too many bugs in how we handle low-level compilation details, including ABI problems, to believe this is a good idea.

[^0]: This is partly "because we might accidentally introduce behaviors that conflict with other ABI handling details", and also partly because our energies for handling ABI correctly are extremely limited.

workingjubilee commented 2 months ago

It also seems incoherent to provide a demotion behavior for all of those types and then refuse to demote f32, I would think, as there is an expected lossless translation from f32 to f64 for non-NAN bitpatterns and a lossy translation from f64 to f32 for non-NAN bitpatterns that provides the rounded inverse of that. Importantly, if someone passed an f32, they'd get the same f32 back (except NANs might get weird, but we can attempt to define the promotion/demotion behavior to preserve NAN bitpatterns in the case where there's no optimizations).

Or f16, for that matter.

I still think we shouldn't do this because it invites confusion and misunderstanding about what the Rust compiler is doing (because people don't really understand what the C compilers do, either).
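
The round-trip claim above (widening an f32 to f64 and narrowing it back returns the same value for non-NaN inputs) can be checked with plain casts. This is a small sketch; `round_trip` is a hypothetical helper name, and it says nothing about what a `va_arg` implementation would actually do.

```rust
/// Widen to f64 (what C's default argument promotion does to `float`),
/// then narrow back to f32.
fn round_trip(x: f32) -> f32 {
    (x as f64) as f32
}

fn main() {
    // Every f32 value is exactly representable as an f64, so the bit
    // pattern survives for zeros, subnormals, normals and infinities.
    for x in [0.0f32, -0.0, 1.5, 1.0e-45, f32::MIN_POSITIVE, f32::MAX, f32::INFINITY] {
        assert_eq!(round_trip(x).to_bits(), x.to_bits());
    }
    // NaN stays NaN, but its payload bits are deliberately not asserted here.
    assert!(round_trip(f32::NAN).is_nan());
    println!("round trip preserved all non-NaN test values");
}
```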

workingjubilee commented 2 months ago

@programmerjake has informed me we cannot even rely anymore on things like "all C types will get promoted to at least a certain size" (even relative to the C ABI!), casting further doubt on that idea.

programmerjake commented 2 months ago

yes, afaict _BitInt(8) doesn't get promoted to int but instead gets passed as an 8-bit integer (verified using clang -emit-llvm), this also applies to FP types -- _Float32 doesn't get promoted to double (didn't check what clang does). just the older C types get promoted: e.g. signed char to int and float to double.
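
To make the classic promotions concrete from the Rust caller's side, here is a sketch that passes an `i8` and an `f32` to C's `snprintf` by casting them to the promoted types first, since stable rustc rejects passing these two types to a variadic function directly. It assumes the platform's C library provides `snprintf` and that the default "C" locale is in effect; `format_pair` is a hypothetical helper name.

```rust
use std::ffi::{c_char, c_double, c_int, CStr};

unsafe extern "C" {
    unsafe fn snprintf(buf: *mut c_char, size: usize, fmt: *const c_char, ...) -> c_int;
}

/// Hypothetical helper: formats a small integer and a float with `%d %.2f`.
fn format_pair(small: i8, value: f32) -> String {
    let mut buf = [0u8; 64];
    let fmt = b"%d %.2f\0";
    // SAFETY: the buffer length is passed along, the format string is
    // NUL-terminated, and the arguments are an int and a double, which is
    // what `%d` and `%.2f` read from the variadic list.
    let written = unsafe {
        snprintf(
            buf.as_mut_ptr().cast::<c_char>(),
            buf.len(),
            fmt.as_ptr().cast::<c_char>(),
            c_int::from(small),    // signed char -> int, done by hand
            c_double::from(value), // float -> double, done by hand
        )
    };
    assert!(written >= 0);
    CStr::from_bytes_until_nul(&buf)
        .unwrap()
        .to_str()
        .unwrap()
        .to_owned()
}

fn main() {
    println!("{}", format_pair(-3, 1.5));
}
```

The casts are exactly the step a C compiler's frontend performs silently for `signed char` and `float`; a callee expecting `_BitInt(8)` or `_Float32` would need the unpromoted value instead, which this sketch does not cover.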

RalfJung commented 2 months ago

> In other words, if you extract a different type from the va_list than the types that are passed into the variable arguments (barring some very specific equivalencies that are preserved between types of the same size), it's UB in C. It should be UB in Rust, too.

Definitely fully agreed. Whether an f32 gets promoted to something else or not should not be the programmer's concern. Especially if that is already how C works (if the argument has type f32, it needs to be extracted from the va_list at type f32), then we should do the same in Rust.

It follows that the entire discussion about what gets promoted is unnecessary. It's the backend's responsibility to handle this correctly.

programmerjake commented 2 months ago

> Whether an f32 gets promoted to something else or not should not be the programmer's concern.

I think Rust shouldn't do promotions for f32 since there are multiple corresponding C types: float and _Float32. float gets promoted but _Float32 does not, so I think the programmer just has to manually promote to f64 if that's what the function expects. Rust doesn't know whether float or _Float32 is what the API expects, so it can't do promotions for you.

RalfJung commented 2 months ago

As an example of the kind of "fun" one can have with this, see https://github.com/rust-lang/rust/issues/71915. The man page says the argument has type mode_t, which is unsigned short, but the actual function signature is a variadic so on the ABI level the argument is passed as int... and if you pass this as a mode_t from Rust you risk UB.

programmerjake commented 2 months ago

maybe a good solution would be to warn when passing or accepting types that are usually subject to promotion since _BitInt(N) and _Float32 are much rarer than char, short, and float. when you actually need to pass/accept _BitInt(8) or _Float32 you can use an attribute to silence the warning, maybe #[uncommon_c_types]:

unsafe extern "C" {
    unsafe fn f(x: c_int, ...);
}

pub fn g(v: f32) {
    unsafe {
        // #[uncommon_c_types] is the proposed attribute; it does not exist today
        f(0, #[uncommon_c_types] v)
    }
}

lolbinarycat commented 2 months ago

why are you suggesting a special attribute for silencing a warning, we already have allow. such a warning does seem like a good idea though, mixing up u8 and char in signatures does seem like it could be problematic.

programmerjake commented 2 months ago

> why are you suggesting a special attribute for silencing a warning, we already have allow.

because iirc a lot of people don't want rustc to emit warnings on code when it's the only valid way to implement something and you can't rewrite it to be warning-free.

lolbinarycat commented 2 months ago

> because iirc a lot of people don't want rustc to emit warnings on code when it's the only valid way to implement something and you can't rewrite it to be warning-free.

this is still doing that, just providing an alias for allow(whatever) doesn't change that. i would argue such a warning should probably be part of clippy instead. also the attribute feels badly named, since it doesn't mention the important part, which is that this has to do with variadics and promotion.

workingjubilee commented 2 months ago

> Definitely fully agreed. Whether an f32 gets promoted to something else or not should not be the programmer's concern. Especially if that is already how C works (if the argument has type f32, it needs to be extracted from the va_list at type f32), then we should do the same in Rust.

To be clear, @RalfJung, this is actually part of the issue: in C, you are expected to have memorized the arcane promotion rules, and thus the author of the variadic function has to specify va_arg to name exactly the type that it has been promoted to. To do otherwise is UB.

This is because the C promotion rules actually are not about the ABI. As demonstrated by _BitInt(8), it is legal, in some cases, to pass smaller-than-c_int args. Instead, it has to do with C source code semantics: all of these transitions are handled by clang's frontend, before it ever reaches LLVM!

So, I think that in Rust, we should not try to squirm out from under this. We should make apparent the bizarreness of the results.

The example you cite, in any case, actually would not compile if you had tried to make the change: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=26aa043833877711d0aba4e26abdebae

programmerjake commented 2 months ago

> this is still doing that, just providing an alias for allow(whatever) doesn't change that.

I guess...

> i would argue such a warning should probably be part of clippy instead.

I strongly think it should be in rustc because >99.9% of the time the warning is correct and C APIs are almost always documented as you just pass the smaller type, but rust needs the argument to be cast to the promoted type.

rustc already has lints that will be incorrect more often than this, e.g.: temporary_cstring_as_ptr

> also the attribute feels badly named, since it doesn't mention the important part, which is that this has to do with variadics and promotion.

yeah, picking a different name is fine with me.

RalfJung commented 2 months ago

> in C, you are expected to have memorized the arcane promotion rules, and thus the author of the variadic function has to specify va_arg to name exactly the type that it has been promoted to. To do otherwise is UB.

Fun!

> The example you cite, in any case, actually would not compile if you had tried to make the change: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=26aa043833877711d0aba4e26abdebae

Ah, neat. I had not seen that as I was just working on the other end -- implementing open in Miri. The error message itself is not very clear about why I need to do this though... Interestingly, the detailed explanation for that error then says

> Certain Rust types must be cast before passing them to a variadic function, because of arcane ABI rules dictated by the C standard.

which seems to contradict your statement that this is not about ABI a little.

> also the attribute feels badly named, since it doesn't mention the important part, which is that this has to do with variadics and promotion.

Note that Rust also has a concept called "promotion", and it is entirely unrelated. So we need to be careful with how we use that term in documentation.

workingjubilee commented 2 months ago

I don't think the expanded error message is correct, no.

It was written by someone who had far more SAN remaining than me.