Closed nikomatsakis closed 5 years ago
C++ member function pointers (which allow for dynamic dispatch) are typically, but not necessarily, larger than data pointers. If Rust wished to support calling them directly by way of an extern "cpp-member" fn
(say), then it would need larger pointers. I think this possibility (or the possibility of some other language doing similarly) is enough to say that the size of fn
is therefore ABI-specific.
I don't know the background for the rules in C specifically that function pointers may be larger than data pointers to know if we can say that it's safe for the C ABI. However, presumably transmute
would throw a compiler error if such a platform were ever introduced and erroneous code were used; it would only be transmute_copy
that could present a problem.
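To make the transmute point concrete, here is a minimal sketch, assuming a mainstream target where a fn pointer is a plain address the size of usize. The helper names (answer, fn_ptr_bits, fn_ptr_addr) are made up for illustration. The transmute compiles only because both types have the same size; on a hypothetical target with wider fn pointers it would be rejected at compile time.

```rust
use std::mem;

// Hypothetical free function used only to obtain a function pointer.
fn answer() -> u32 {
    42
}

/// Reinterprets a function pointer as an integer via `transmute`.
/// `transmute` checks at compile time that source and destination
/// have the same size, so this fails to build if they ever differ.
fn fn_ptr_bits(f: fn() -> u32) -> usize {
    // ASSUMPTION: fn pointers are plain addresses on this target.
    unsafe { mem::transmute::<fn() -> u32, usize>(f) }
}

/// The same conversion written as an `as` cast.
fn fn_ptr_addr(f: fn() -> u32) -> usize {
    f as usize
}

fn main() {
    let f: fn() -> u32 = answer;
    println!("transmute: {:#x}, cast: {:#x}", fn_ptr_bits(f), fn_ptr_addr(f));
}
```

transmute_copy, by contrast, does not get this compile-time size check, which is why it is the one singled out above.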
I agree that Option<extern "abi" fn()>
should always optimize for the null pointer as None
.
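As a small illustration of the null-pointer optimization being discussed, the sketch below checks that wrapping a fn pointer in Option does not change its size. The alias Callback and the helper option_is_pointer_sized are names invented for this example.

```rust
use std::mem::size_of;

// Hypothetical alias for a C-ABI callback with no arguments.
type Callback = extern "C" fn();

/// Returns true when `Option<Callback>` needs no extra space for its
/// discriminant, i.e. `None` is stored in the otherwise-invalid null value.
fn option_is_pointer_sized() -> bool {
    size_of::<Option<Callback>>() == size_of::<Callback>()
}

fn main() {
    println!("Option adds no space: {}", option_is_pointer_sized());
}
```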
Pointers to member functions are separate from regular function pointers in the C++ type system and I see no reason why we should pretend they're the same in Rust, especially if it requires weakening otherwise-plausible guarantees we could give.
A better motivation for making no guarantees about size would be if some architectures had differently-sized address spaces for code and data, or if a platform ABI added extra metadata to function pointers that isn't there for other pointers, e.g., a tag to support enforcement of control flow integrity.
They're definitely different, and they definitely do not use any of the existing ABIs. It's not the case that we support ABI polymorphism, though (except via Fn
traits, but at that point we've already mostly stopped caring since an Fn
object could have arbitrarily large size), which, when I think about it, seems to make it a bit silly to insist that all fn
pointers are necessarily convertible to usize
: their use is likely going to be ABI-specific.
Today, each function pointer refers to a specific free function that is declared with the same ABI string as the function pointer carries -- extern "foo" fn bar() {}
can be referred to with an extern "foo" fn()
. The ABI string, on functions as on function pointers, indicates how parameters and return values are passed, which registers get saved by whom, and other details of how calls and the function prologue and epilogue are codegen'd, but not how the function pointer is represented or where the function is placed in memory.
This means that, while we can't call any function with any ABI we like, it's not a total wild west either. We have previously considered generating shims that adapt from one ABI to another (and have in fact done so to codegen Rust functions with C ABI). Even exotic ABIs like ptx_kernel
or msp430_interrupt
are just selecting different codegen for functions and calls to them, not fundamentally changing what a function pointer means. This status quo does not necessarily have to prevail, and as I said I could see uses for extra data attached even to pointers to free functions (so I am not really arguing that fn pointers should be guaranteed to be laid out like usize
), but today ABI strings cause only quite limited and well-understood variation.
A C++ pointer to member function, on the other hand, is conceptually quite different from free functions and pointers to them. It's arguably even orthogonal to calling convention, since various compilers allow declaring member functions with different calling conventions (so, e.g., you might have a member function that uses __fastcall
).
Do extern "C" fn()
and fn()
have the same type?
They are separate types. fn()
is short for extern "Rust" fn()
and fn pointers with different ABI strings are different types.
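One way to observe this from code is to compare TypeIds. This is only a sketch; same_type is a helper invented for the example, not anything from the standard library.

```rust
use std::any::TypeId;

/// Hypothetical helper: true when `A` and `B` are the same type.
fn same_type<A: 'static, B: 'static>() -> bool {
    TypeId::of::<A>() == TypeId::of::<B>()
}

fn main() {
    // `fn()` is shorthand for `extern "Rust" fn()`.
    println!("fn() vs extern \"Rust\" fn(): {}", same_type::<fn(), extern "Rust" fn()>());
    // A different ABI string gives a different type.
    println!("fn() vs extern \"C\" fn():    {}", same_type::<fn(), extern "C" fn()>());
}
```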
I definitely think C++ member function pointers are out of scope for this discussion. Rust's function pointers are analogous to a C function pointer (e.g., void (*)()
) -- they don't carry any "extra data" (and they kind of can't, since they don't have a lifetime bound, for better or worse).
I believe that we should declare — at minimum — that an extern "C" fn()
is represented in the same way as the corresponding C function pointer type (void (*)()
), except that it cannot be NULL and must be valid to call (because safe code can call it).
This implies also that Option<extern "C" fn()>
is fully representation compatible with void (*)()
.
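A rough sketch of how this usually shows up in FFI code, assuming the C side declares the parameter as a possibly-NULL function pointer such as int32_t (*cb)(int32_t). The names IntCallback, apply_or_default and double are hypothetical and exist only for this example.

```rust
// Hypothetical callback type; the C declaration is assumed to be
// `int32_t (*cb)(int32_t)`.
type IntCallback = extern "C" fn(i32) -> i32;

/// Hypothetical C-callable entry point. A NULL pointer passed from C
/// would arrive here as `None`, any other pointer as `Some(f)`.
extern "C" fn apply_or_default(cb: Option<IntCallback>, x: i32) -> i32 {
    match cb {
        Some(f) => f(x), // non-null: call through the pointer
        None => x,       // null: fall back to returning the input
    }
}

// Example callback defined in Rust with the C ABI.
extern "C" fn double(x: i32) -> i32 {
    x * 2
}

fn main() {
    println!("{}", apply_or_default(Some(double as IntCallback), 21));
    println!("{}", apply_or_default(None, 7));
}
```

Using Option here keeps the "may be NULL" case visible in the type, while a bare extern "C" fn(i32) -> i32 parameter would promise the caller never passes NULL.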
I think plenty of unsafe code in the wild relies on this (as @wycats and @sgrif can probably attest; they happen to be two people who I've talked to about this in the past).
I think plenty of unsafe code in the wild relies on this
It's what bindgen generates for anything that takes a function pointer as an argument, so I think that's reasonable. :) (I can say for sure that Diesel relies on Option<extern "C" fn(...) -> ...>
's representation)
(and they kind of can't, since they don't have a lifetime bound, for better or worse)
Why would they need a lifetime bound? IIRC they carry at most an offset into a vtable which does not depend on any object lifetimes.
(because safe code can call it).
How can I construct an extern "C" fn()
that I can call in safe code? AFAIK extern "C" fn()
only accepts functions with extern "C"
ABI. These functions are always unsafe
, so one can't make a safe extern "C" fn()
point to them (only extern "C" unsafe fn()
).
How can I construct an extern "C" fn() that I can call in safe code?
You just define it: https://play.rust-lang.org/?gist=23fca1fa4d23cb71489a1733d7e6de8b&version=stable&mode=debug&edition=2015
Bindings to external symbols are always unsafe functions since you're asserting you got the signature right.
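For readers who cannot open the playground link, here is a minimal sketch of the idea (not a copy of the linked gist; add_one and call_through_pointer are names invented for the example). A function defined in Rust with the C ABI is an ordinary safe function, so a pointer to it has the safe type extern "C" fn(..).

```rust
// Defined in Rust with the C calling convention; not `unsafe`.
extern "C" fn add_one(x: i32) -> i32 {
    x + 1
}

/// Calls a C-ABI function pointer from safe code.
fn call_through_pointer(f: extern "C" fn(i32) -> i32, x: i32) -> i32 {
    f(x) // no `unsafe` block needed
}

fn main() {
    println!("{}", call_through_pointer(add_one, 1));
}
```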
I want to call out a comment by @rkruppe from the discussion about integer types:
A more general point regarding extremely niche implementation choices such as non-octet-bytes or NULL-at-nonzero-address: people are going to write code that relies on assumptions that are true on every platform they have ever heard of, and for good reason, as it simplifies their code at effectively no loss of portability. We can't prevent that, nor should we IMO, at most we could tell these people they are relying on implementation-defined behavior, which just makes it a de facto standard rather than a de jure one.
I find this very well put, and I think it definitely applies here, in terms of e.g. whether we commit to an extern "C" fn
being compatible with a usize
and so forth.
It seems like we ought to settle -- perhaps -- more generally on a policy in such cases. I feel like it's worth identifying a "default compatibility" profile that guarantees portability across all "major architectures", but perhaps identifying concerns that may apply to more esoteric architectures.
https://github.com/rust-lang/unsafe-code-guidelines/blob/master/reference/src/layout/function-pointers.md addresses this (otherwise please re-open).
Discussing the representation of extern "abi" fn(..) types:
Are they always compatible with usize?
Is Option<extern "C" fn()> guaranteed to be equivalent to a "C fn pointer" representation?