rust-lang / unsafe-code-guidelines

Forum for discussion about what unsafe code can and can't do
https://rust-lang.github.io/unsafe-code-guidelines
Apache License 2.0

Specification of FFI #421

Open chorman0773 opened 1 year ago

chorman0773 commented 1 year ago

What are the operational semantics of the following rust program:

extern "C"{
   fn foo();
}

fn main(){
    foo()
}

Two specific questions are asked here:

  1. What is the behaviour of calling a "foreign" function that is defined and exported (#[no_mangle] or #[export_name]) in the same program and "accessible" via a (potentially-indirectly) linked rlib or dylib?
     a. A subquestion is what is the behaviour of calling a "foreign" function defined and exported in Rust that is linked via a staticlib/cdylib?
  2. What is the behaviour of calling a "foreign" function that isn't defined in Rust, but in some other (potentially assembly) language?

The answer to question (1) is obvious: It's equivalent to a call to that function. (1)(a) and (2) are a bit more interesting.

A previously discussed answer is that the AM performs an implementation-defined AM operation, parameterized by the link_name of the external function, and if it matches the export_name of an exported function defined by the program, it behaves as an operation equivalent to a call to that function. Does this handle every case? What about signature mismatch in the (1), (1)(a), and (2) cases? What happens if the symbol is exported from a dylib, but also defined in a staticlib (either Rust or non-Rust)? Or defined in two dylibs, or in a dylib+cdylib?
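
As a hedged illustration of case (1) under that answer, here is a minimal sketch; the symbol name my_exported_fn and the function names exported/imported are made up for the example:

// Sketch of case (1): the "foreign" symbol is really a Rust function exported
// elsewhere in the same program, matched purely by export_name / link_name.

// In some other crate of the crate graph:
#[export_name = "my_exported_fn"]
pub extern "C" fn exported(x: u32) -> u32 {
    x + 1
}

// In the crate making the "FFI" call:
extern "C" {
    #[link_name = "my_exported_fn"]
    fn imported(x: u32) -> u32;
}

fn call_it() -> u32 {
    // Under the answer above, this behaves like a direct call to `exported(1)`.
    unsafe { imported(1) }
}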

@rustbot label +S-pending-design +A-ffi

RalfJung commented 1 year ago

I think the "opsem" in the title is a misnomer. There isn't an operational semantics for FFI (unless you want to fix the language on the other side). The interaction with the outside world is necessarily going to be axiomatic in style.

A previously discussed answer is that the AM performs an implementation-defined AM operation

That's not how I would put it.

Basically, for every FFI call (and inline asm block; this is closely related to https://github.com/rust-lang/unsafe-code-guidelines/issues/422) the programmer picks a corresponding "step" (in quotes because there's a lot that can happen here, with non-determinism and I/O and whatnot, it's a very big step) on AM states, and then they have the proof obligation to show that the foreign code actually performs (a refinement of) that step.
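
As a hedged sketch of what that looks like for a concrete call (using the POSIX getpid function, and assuming pid_t is a 32-bit integer on the target):

extern "C" {
    fn getpid() -> i32;
}

fn main() {
    // The "step" chosen here: an AM transition with no Rust-visible memory
    // effects that non-deterministically yields some i32 (the process id).
    // The proof obligation is that the machine code behind `getpid` actually
    // implements that step, e.g. it doesn't touch memory the Rust AM owns.
    let pid = unsafe { getpid() };
    println!("pid: {pid}");
}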

chorman0773 commented 1 year ago

That's not how I would put it.

This was at least one of the ways it was discussed previously. I would prefer to use the direct implementation-defined correspondence because:

a) It makes things fairly direct when you call Rust code from Rust code, if you could have made the call without using FFI. We don't even have to leave the opsem world, and we don't have to decide how a user goes about proving that a function written in Rust performs the operation that function normally performs on the AM.

b) It emphasises the distinction between inline asm (which is more likely to go with a proof model) and function calls, since implementations are very much allowed to (and very much do) analyze FFI calls.

chorman0773 commented 1 year ago

And note that the analysis may not even require LTO: under lccc's abi definition it's possible to inline a #[no_mangle] function that's also declared #[inline] even via an external call, if the definition is brought into the current CGU/TU in some way, usually by the IR file that defines the inlineable version being included by a call to another inlineable function.
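
A hedged sketch of the scenario (the function name frobnicate is made up; the comments describe the lccc-specific behaviour claimed above, not something rustc guarantees):

// Crate A:
#[no_mangle]
#[inline]
pub extern "C" fn frobnicate(x: u32) -> u32 {
    x.rotate_left(7) ^ 0xDEAD_BEEF
}

// Crate B, which only sees the symbol through an extern declaration:
// extern "C" { fn frobnicate(x: u32) -> u32; }
//
// If crate A's inlineable IR has been pulled into the CGU/TU compiling
// crate B (e.g. via another inlineable function), the call below may be
// inlined even though no LTO is performed:
// let y = unsafe { frobnicate(3) };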

RalfJung commented 1 year ago

This was at least one of the ways it was discussed previously.

Maybe it has been, but I have been using the description in https://github.com/rust-lang/unsafe-code-guidelines/issues/421#issuecomment-1597313701 for years now, so that is also surely one of the ways it has been discussed for a long time.

Saying "it's implementation-defined" is basically useless as far as I am concerned, then we still have to figure out what we will say for rustc. Our job is to produce something akin to https://github.com/rust-lang/rfcs/pull/3355, so we're not done when we say "implementation-defined", we have a primary implementation we care to specify. We still need to work out what that spec looks like, and then we can decide whether for some reason we want to allow other implementations to define a different spec (but I see no strong motivation for doing that).

The equivalent of doing xlang-LTO is to run optimizations on the assembler level, on the final object file. Both of these are totally legal so I am not sure a fundamentally different approach is needed for asm and FFI. The justification for xlang LTO is that LLVM knows both the C and Rust AM (or rather, both of them have been refined into LLVM IR), and so now it is in control of both sides of this interaction and can reason about what happens across that interaction.

chorman0773 commented 1 year ago

We still need to work out what that spec looks like, and then we can decide whether for some reason we want to allow other implementations to define a different spec (but I see no strong motivation for doing that).

IMO, this is necessary because of cross-lang LTO (in lccc this is just "LTO"; there's not much special going on), and because of differences in linking, both static and dynamic (lccc, for example, uses slightly different search directories on x86_64-unknown-linux-gnu than rustc does, because it assumes that unknown means unknown, rather than either unknown or pc). There are also definitely differences in how different targets and compilers will react in some of the more interesting cases, and especially for Rust-Rust FFI we can't just say that extern "Rust" matches regardless of compiler (though rustc could very well specify that for rustc, as could others).

Saying "it's implementation-defined" is basically useless as far as I am concerned

Keep in mind that this is constrained "implementation-defined", assuming we constrain it in the Rust-Rust case, at least within a crate graph.

The equivalent of doing xlang-LTO is to run optimizations on the assembler level, on the final object file. Both of these are totally legal so I am not sure a fundamentally different approach is needed for asm and FFI.

(The use of xlang-LTO here is slightly confusing to me, so I'm going to say cross-lang LTO.) While you can equivalently run those optimizations at an assembler level, IMO the current wording in the reference would prohibit running them together on generated assembly/machine code and assembly/machine code from inline assembly, since the compiler is explicitly enjoined from assuming the contents of the assembly block match what is actually executed at runtime. The difference with FFI is that no such prohibition exists on FFI calls, so cross-lang LTO is allowed. This is why I believe there is necessarily a fundamental difference in how they are specified.
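
To make the asymmetry concrete, a minimal sketch (helper is a hypothetical symbol assumed to be defined in some other language, and the asm string assumes a target where nop is a valid mnemonic, e.g. x86_64):

use std::arch::asm;

extern "C" {
    fn helper();
}

fn both_kinds() {
    unsafe {
        // Inline assembly: the compiler may not assume the instructions
        // executed here match what is written here, so it cannot co-optimize
        // this block with the surrounding generated code.
        asm!("nop");

        // FFI call: no analogous prohibition, so under cross-lang LTO the
        // compiler is free to look at helper's definition and inline it.
        helper();
    }
}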

CAD97 commented 1 year ago

I think we actually agree on semantics here, and just disagree on how exactly to communicate them. I'm going to attempt to restate the definition such that it satisfies both Ralf and Connor's intuitions:

Rust-Rust FFI

This is a special case of Rust-Unknown FFI, documented purely as an expository measure. There is no specialization of behavior.

Rust-Unknown FFI


Cross-language and/or link-time optimization is necessarily done over the domain of some shared IR that both sides lower to. Optimizations that preserve observable behavior in the shared IR necessarily preserve behavior of both higher-level languages, as a correctly lowered translation is at least as constrained w.r.t. behavior as the pre-lowering representation (and usually more constrained).

To note, though, optimizing at the level of target bytecode/assembly is difficult, and requires some assumptions beyond those provided purely by the target semantics, because practically everything about assembly is theoretically observable. At a bare minimum for a von Neumann architecture target, you need to assume that the code segment is never accessed as data. Potentially the only safe transformation is relocation.
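
For example (a hedged sketch; this assumes a target where the text segment is mapped readable, as on typical x86_64 Linux):

fn observed() -> u32 {
    42
}

fn main() {
    let f: fn() -> u32 = observed;
    // Read the first byte of the function's machine code as data. Any
    // transformation that moves or rewrites `observed` can change this
    // value, so without the "code is never read as data" assumption even
    // such a rewrite counts as observable.
    let first_byte = unsafe { *(f as usize as *const u8) };
    println!("first instruction byte: {first_byte:#04x}");
}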

I think that addresses how FFI to raw asm (whether inline or not) is different from FFI to another higher level language. Code segments implemented in raw assembly can't be optimized without a proof that there's absolutely no possible way on the target machine to differentiate between the written and optimized versions. (The exact same as for high level languages, just in a domain where much more is potentially observable.) Any cross-language optimization that doesn't uphold that proof burden is simply incorrect.

You can of course claw back some opportunity for optimization by defining the target behavior (and thus the domain where optimizations are done) as being less constrained than "what the hardware does," but any such redefinition of course applies to any developer-authored assembly as well.

Saying "it's implementation-defined" is basically useless as far as I am concerned

Even for solely a single rustc, "implementation defined" has some useful meaning, though perhaps it would be better called "target defined." There are definitely things which behave in a reliable manner per target, but in a different way on different targets. Anything related to how the Rust AM gets lowered is "implementation defined," even though we absolutely do still guarantee some properties after lowering (e.g. that C-compatible FFI operates in a C-compatible fashion).
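
A small sketch of that last point (the names Pair and sum_pair are made up; the concrete layout and calling convention are target defined, but Rust does guarantee they match what a C compiler on the same target would use):

#[repr(C)]
pub struct Pair {
    pub a: u32,
    pub b: u32,
}

// How `Pair` is laid out and how it is passed here depends on the target,
// but it is guaranteed to be C-compatible on every target.
#[no_mangle]
pub extern "C" fn sum_pair(p: Pair) -> u64 {
    p.a as u64 + p.b as u64
}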

chorman0773 commented 1 year ago

Potentially the only safe transformation is relocation.

Note that there are also link relaxations, which are a form of optimization. These are also done by the linker, though, hence the name.

RalfJung commented 1 year ago

Ah sure, we'll have to say something about linker name resolution I guess. I'll stay out of that discussion since I know basically nothing about this subject. I would expect that we specify something based on what rustc does; having other implementations use a different algorithm here seems quite problematic -- but this is so far outside my realm I really can't contribute much here.

I was mostly concerned with the abstract/theoretical code reasoning aspects of FFI: linking with foreign code, potentially code written in a different language. And here I think @CAD97 captured my intent quite well: we shouldn't have to say anything special like "assembly code cannot be optimized by the compiler" or "cross-lang inlining is permitted"; this should all fall out of higher-level reasoning principles such as "each part of the program is executed according to the spec of the language it is written in". I expect the most tricky part here will be ironing out just which part of the "Rust AM / assembly" correspondence implemented by rustc (i.e., the relational invariant maintained throughout the execution of any Rust program which always tracks which Rust AM state corresponds to the current physical machine state) we want to make any stable promises about. This will be heavily target-dependent and includes things like how the stack is laid out and which registers have special roles (such as a stack pointer). Saying much here is challenging since things like inlining and outlining can make the low-level target stack look quite different from the Rust AM stack.

chorman0773 commented 1 year ago

I would expect that we specify something based on what rustc does; having other implementations use a different algorithm here seems quite problematic -- but this is so far outside my realm I really can't contribute much here.

Name resolution rules at a minimum will depend on the object format. If the algorithm is specified too precisely, then you'd effectively lock things to ELF+COFF+WASM. However, there are definitely rust-level decisions that can affect this - for example, whether #[no_mangle] functions in dylibs get default or protected visibility. I highly doubt that there's a good reason to limit choice in this regard either way. Name mangling is also implicated, because users could reverse engineer mangled names from the compiler (or read documentation on the matter) and those symbols will be resolved differently between compilers that use different algorithms. I'm sure there are other subtle ways there can be a difference, beyond the obvious one of where names are looked up in the first place depending on what libraries are linked, and different implementations may pull in different libraries for different reasons.

There are also subtle differences between both static and dynamic link editors, and even between configurations of the same linkers. Linking with or without an option that generates a DT_BIND_NOW tag, or using the LD_BIND_NOW environment variable, determines whether libraries loaded by dlopen throughout a program can affect name resolution. Specifying every single possible combination of behaviours across the already long list of link editors is likely a non-starter.

chorman0773 commented 1 year ago

As a note about how different rust-level implementations might differ in symbol lookup, lccc's abi trivially endorses the calls in the following program as semantically equivalent (and can even do so operationally to an extent, as it does not leave the "rust domain" to perform the calls):

// crate foo: compile without any `-C metadata` so as to get the crate name as the stem.
// Note that this behaviour is not guaranteed by the abi, but the lccc rust frontend does
// not add a stem without any `-C metadata` arguments.

// No inline attribute, this is defined in the crate somewhere
pub fn bar(x: u32) -> i32 {
    x as i32
}

// crate bar

extern "Rust" {
    fn _ZN3foo3barEu(x: u32) -> i32;
}

fn main() {
    assert_eq!(foo::bar(u32::MAX), unsafe { _ZN3foo3barEu(u32::MAX) });
}

In fact, a rule used by lccc for that implementation-defined choice might be to say: "Every function defined in Rust that does not have the #[no_mangle] or #[inline] attribute (other than #[inline(never)]) and is either declared pub or is accessible to the current module may be called within that module via its external name as given by https://lccc.lcdev.xyz/lcrust/abi/v0#name-mangling" (the lccc standard library in fact relies on this rule in several places, notably the implementation of Location::caller()).

This would be a different rule for name resolution than rustc has, and the program is more than likely going to create a link error on rustc. Thus, at the very least, the set of symbols exported from rust and their link names are unspecified.

RalfJung commented 1 year ago

Isn't that the kind of situation we want to avoid -- programs working with some compilers but not others? That seems like a compatibility / ecosystem split hazard.

chorman0773 commented 1 year ago

Isn't that the kind of situation we want to avoid -- programs working with some compilers but not others? That seems like a compatibility / ecosystem split hazard.

It's also a fundamental result of having different abis. I think it would be a far worse scenario if the functions mangled the same but were quite different in call abi. There's not a real difference with rustc here, except that lccc is more explicit about allowing this kind of name "guessing" (or computation). If I pull the discriminator from a crate compiled with v0 mangling, I can write such calls against it as well.

There are going to be other differences also, brought about by different linking requirements. The main lccc driver, for example, will likely link its 4 main runtime support libraries whenever it can find them, so it's very much possible to call e.g. __fadd_posit_32e2 without an explicit argument. This occurs so that programs written in XIR or in languages with language or extension support that use features requiring those libraries can be linked into rust programs w/o issue. In some cases, parts of lccc itself get linked in (proc-macro uses libxlang-interface so it can do things like allocate memory it can pass to the host).

In the mangled name case, this would be fairly difficult to do by accident, imo, and it's already trivial to write a program that would work on one, but not another:

#![feature(link_llvm_intrinsics)]

extern "C" {
    #[link_name = "llvm.debugtrap"]
    fn debug_trap();
}

fn main() {
    unsafe { debug_trap() }
}

Works with rustc-llvm and no other compiler I'm aware of, including other codegens of rustc.

RalfJung commented 1 year ago

Why would lccc allow such kind of name guessing to begin with? If rustc forbids it, IMO other implementations should also forbid it (i.e., it falls in the same bucket as repr(Rust) layout guessing).

chorman0773 commented 1 year ago

Why would lccc allow such kind of name guessing to begin with?

It doesn't allow it insofar as it doesn't write down an explicit rule that says you're allowed to "guess" the name. However, it does write down quite extensively its rules for computing the name, which other implementations can use to opt in as compatible. Following that, a programmer could follow that document just as well as a later version of lccc or a compatible Rust implementation could, and compute the name manually. In fact, as I mentioned, part of lccc's standard library uses this to implement obtaining the caller location w/o an intrinsic, by exploiting that knowledge of name mangling as well as the abi assigned to certain functions.

As long as you have the information needed, you as a rust programmer can do just about anything a rust compiler can in terms of linkage.

If rustc forbids it, IMO other implementations should also forbid it (i.e., it falls in the same bucket as repr(Rust) layout guessing).

To forbid it in this way would require active hostility, including towards compiler-generated code that links to code produced by lccc (in the extreme, it would require variance between versions, which is explicitly an anti-goal of the project). Rustc does not even actively forbid it, per se; rather it doesn't endorse it either explicitly or implicitly. Even more so than guessing, though, I could pull the crate metadata and export table, find the name of an exported function myself, and call it. As long as I never recompile the rlib/dylib, it's functionally identical to an explicit rule giving me a stable name (and indeed it is fairly similar, because upgrading lccc can break mangling/abi just as upgrading rustc can; the difference is that, as the lccc developer, I would choose not to do that as much as possible).

RalfJung commented 1 year ago

However, it does write down quite extensively its rules for computing the name, which other implementations can use to opt in as compatible.

It can always say that those rules are subject to change any time, and may not be exploited by programmers.

In fact, as I mentioned, part of lccc's standard library uses this to implement obtaining the caller location w/o an intrinsic, by exploiting that knowledge of name mangling as well as the abi assigned to certain functions.

Sure, the standard library can be tied to a particular compiler version.

To forbid it in this way would require active hostility, including towards compiler-generated code that links to code produced by lccc (in the extreme, it would require variance between versions, which is explicitly an anti-goal of the project). Rustc does not even actively forbid it, per se; rather it doesn't endorse it either explicitly or implicitly.

I am not suggesting to actively make this not work; I am suggesting to specify it as not guaranteed. That's exactly what we do with several other aspects of the language, and I don't see any reason why this would be treated any differently.

chorman0773 commented 1 year ago

It can always say that those rules are subject to change any time, and may not be exploited by programmers.

It does indeed:

For all purposes, you can rely on this ABI or a future version when compiling using lccc. Other implementations of rust may adopt this specification as well, at their option.

"or a future version" is the key point here. During an update of the lccc rust frontend, a new version of the abi can be published and the frontend switched to use the new version by default, and pretty much the whole document is "up for grabs" in terms of changes (including mangling, but also layout and call abi). Of course, if you know which version of the abi is in effect (and there are several ways to both check this and to require it via a #[feature]) you as a programmer can rely on that version. In fact, this is done in the host part's of lccc, namely xlang-abi, which will switch out much of its dyn Trait-related code with transmutes knowing the abi to be correct.

I would also note that my example does require a specific scenario, but could be generalized in a similar manner to what you could do with v0 mangling (and, to an extent, rustc-legacy mangling).

In the general case, the mangled_name of a crate is nominally unspecified, but you can reach into the crate header in the manifest file and gather a lot of information about it: the abi version, as well as the mangled_name field. I could then programmatically construct the correct name to call ::foo::bar with. On rustc, you could do the same by pulling the crate discriminator (or pulling the information needed to compute it; I don't know exactly how it's done, but I imagine it's at least nominally stable for the same version, to support linking to rlibs and dylibs) and go from there. The rules are just as well-defined for v0 mangling as for lcrust's itanium-extended mangling.
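
For instance, a hedged sketch of what such a computed call could look like under v0 mangling (the disambiguator component Cs1234567890abc_ is a placeholder; the real value depends on -C metadata and is not stable):

extern "Rust" {
    // Hypothetical v0-mangled name for `foo::bar`; only the structure
    // (_R, nested path in the value namespace, crate root, identifiers)
    // is meant to be illustrative.
    #[link_name = "_RNvCs1234567890abc_3foo3bar"]
    fn bar_via_mangled_name(x: u32) -> i32;
}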

chorman0773 commented 1 year ago

In any case, I could spend a month's worth of T-opsem meetings enumerating known differences between rustc and lccc in linking. I could also spend two months' worth of T-opsem meetings enumerating differences between x86_64-pc-linux-gnu-rustc-llvm-bfd and x86_64-pc-linux-musl-rustc-llvm-lld, as well as every environment variable, flag, and config file the user can modify to alter static and dynamic link editor behaviour. So we could spec every degree of freedom necessary to make all of that valid, but I think the easier solution is to simply say that symbol resolution is "implementation-defined".

RalfJung commented 1 year ago

you can rely on this ABI or a future version

If the intention of this statement is "you cannot rely on this ABI (since it can arbitrarily change in the future)", then that is a strange way to express that.

I think the easier solution is to simply say that symbol resolution is "implementation-defined".

I think the easier solution is to say that it is unspecified, except for a list of guarantees -- just like what we do with layout.

"implementation-defined" inherently implies code that is fully well-defined and guaranteed to work with some implementations but not with others. We want to avoid that, so I don't think we should ever make anything in Rust implementation-defined. gcc-rs has been pretty clear that they don't intend to officially support anything that is not also officially allowed in rustc; I think we should expect all implementations to follow that.

chorman0773 commented 1 year ago

If the intention of this statement is "you cannot rely on this ABI (since it can arbitrarily change in the future)", then that is a strange way to express that.

The intent is to express that if you, the user, have all of the information that the toolchain does, lccc does not try to pull a fast one by laying things out in a way the abi version doesn't allow. If you read the crate manifest of something compiled, read the abi_version field of the crate header, and see 0, 1, 42, or 65535, you know that you can pull the documentation on that version and that the compiler does what that version prescribes, instead of it actually referring to a completely different series of documents with a completely different numbering scheme. This goes to a philosophy of mine that the user (of the compiler) is no different from any other part of the toolchain, and thus can rely on the same guarantees that the toolchain relies on, provided the same information. If you don't have the version, you can't rely on it just because you use a version of the lccc rust frontend, but if you know the version, it is WYSIWYG.

I think the easier solution is to say that it is unspecified, except for a list of guarantees -- just like what we do with layout.

I think this is probably a difference in terminology. I read "unspecified" as "nondeterministic choice of the AM" and "implementation-defined" as "parameter to instantiate the AM". Although I suppose, with dynamic linking being the way it is on ELF platforms, nondeterministic is more correct.

RalfJung commented 1 year ago

The point with implementation-defined is that the implementation defines what it does -- as in, puts out a document describing this. At least that is my understanding of the distinction between implementation-defined and unspecified.

chorman0773 commented 1 year ago

The point with implementation-defined is that the implementation defines what it does -- as in, puts out a document describing this. At least that is my understanding of the distinction between implementation-defined and unspecified.

That is part of it, but, at least from my C++ background (which is far better formulated than the C Standard, if imperfect), the distinction is also between parameters vs. non-deterministic choices. And indeed [intro.compliance] notes that what the implementation documents is the instance of the abstract machine it implements, not specifically its choices for the parameters introduced by implementation-defined constructs.