Cranelift: make CLIF behavior platform-independent w.r.t. endianness

cfallin commented 3 years ago

Currently, CLIF has three kinds of endianness for memory loads and stores: big, little, and native. The meaning of a native-endian operation depends on the platform on which the CLIF executes.
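To illustrate why the "native" option is the odd one out, here is a minimal sketch in plain Rust. The `Endianness` enum and `load_u32` helper are hypothetical names made up for this example and are not Cranelift's actual types; the only real APIs used are the standard library's `u32::from_le_bytes`, `from_be_bytes`, and `from_ne_bytes`.

```rust
/// Hypothetical model of the three endianness options on a memory op
/// (an assumption for illustration, not Cranelift's real data structures).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Endianness {
    Little,
    Big,
    Native,
}

/// Interpret four bytes of memory as a `u32` under the given option.
fn load_u32(bytes: [u8; 4], endianness: Endianness) -> u32 {
    match endianness {
        // These two arms give the same answer on every host.
        Endianness::Little => u32::from_le_bytes(bytes),
        Endianness::Big => u32::from_be_bytes(bytes),
        // This arm's answer depends on the machine running the code.
        Endianness::Native => u32::from_ne_bytes(bytes),
    }
}
```

For the bytes `[0x11, 0x22, 0x33, 0x44]`, the little-endian load yields `0x44332211` and the big-endian load yields `0x11223344` everywhere, while the native load yields one or the other depending on the host.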

The purpose of this three-option design, as we discussed in #2124, was to allow for convenience at the CLIF producer side: loads and stores that are meant to access platform-native values (such as pointers in a stack frame or data passed to and from code produced by other compilers) can simply use the "native" option, and the CLIF becomes parametric in endianness, working correctly on platforms of either endianness.

It appears that, in the discussion in #2124, we were initially leaning (comment, comment) toward a strict two-option (big/little), always-explicit endianness flag on memory ops, but it then became apparent that this would require some more plumbing to know the endianness up front.

The new forcing function that we have, however, is the CLIF interpreter. Because we now have an interpreter that is platform-independent, it becomes important to define what result a given CLIF execution should provide. It seems very important that this should be the same result regardless of the platform we happen to be running on. Otherwise, if a CLIF program can have multiple results depending on platform, then many other endianness issues could occur at higher levels of the system.

In essence, we're late-binding endianness, after the CLIF is produced. In contrast, other compilers, such as LLVM, use a form of early-binding: e.g., the data layout that is a part of a program in LLVM IR specifies the endianness assumed by the IR.
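As a rough illustration of early binding, an LLVM data layout string is a list of `-`-separated specifications in which `e` means little-endian and `E` means big-endian. The sketch below pulls that one property out of such a string. The function name, the `Endianness` enum, and the shortened layout strings in the usage note are assumptions for this example, not LLVM or Cranelift APIs.

```rust
/// Byte order as fixed by the IR itself (hypothetical type for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Endianness {
    Little,
    Big,
}

/// Find the byte-order specification in an LLVM-style data layout string.
/// Returns `None` when the string does not state a byte order explicitly.
fn datalayout_endianness(layout: &str) -> Option<Endianness> {
    layout.split('-').find_map(|spec| match spec {
        "e" => Some(Endianness::Little),
        "E" => Some(Endianness::Big),
        // Other specifications (pointer sizes, alignments, ...) are ignored.
        _ => None,
    })
}
```

With this helper, `datalayout_endianness("e-p:64:64")` reports little-endian and `datalayout_endianness("E-p:64:64")` reports big-endian, so anything consuming the IR knows the byte order before looking at a single load or store.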

In this issue I'm suggesting that we consider doing the same. It would give CLIF well-defined semantics and shouldn't hurt the ergonomics of most CLIF producers: they would supply a bit more information when creating a builder (the target platform), and the builder would then use the target's endianness wherever "native" would have been used before.

One alternative is to disallow (i.e., declare to be undefined behavior) any CLIF that has a native-endian load/store interact with another access in a way that exposes endian-dependent behavior, but that seems much more problematic, because many real programs do this (e.g., Rust compiled via cg_clif can perfectly legally store a u32 to memory and load its first byte). Another alternative is to bias the interpreter toward one endianness or another (e.g., the interpreter always behaves like a little-endian machine), but then the results differ between interpretation and native execution on opposite-endianness machines (e.g. big-endian), which is also undesirable.
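The "store a u32 and load its first byte" case can be sketched in a few lines of plain Rust. The helper below is hypothetical and only models a four-byte memory; it is meant to show that the observable result depends on the byte order of the store.

```rust
/// Store `value` at address 0 using the given byte order, then load the
/// single byte at address 0. (Hypothetical helper for illustration.)
fn first_byte_after_store(value: u32, big_endian: bool) -> u8 {
    let memory: [u8; 4] = if big_endian {
        value.to_be_bytes()
    } else {
        value.to_le_bytes()
    };
    memory[0]
}
```

Storing `0x11223344` and reading back the first byte gives `0x44` with a little-endian store and `0x11` with a big-endian store, so a native-endian store makes the program's result platform-dependent.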

This is a continuation of the discussion in #3329; cc @uweigand @afonso360 @fitzgen and others. Thoughts?

cfallin commented 3 years ago

To make the proposal a bit more concrete, this would involve two changes:

1. Require either a specific target, or at least an endianness, when creating a function or instruction builder. (Probably "full target" rather than "endianness" as the latter is a low-level detail to most users; and probably when creating the function rather than a particular builder.)
2. Make the load/store metadata support only the two endianness options.
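A minimal sketch of how these two changes might fit together, assuming a simplified builder. `FuncBuilder`, `LoadInst`, and `load_target_endian` are hypothetical names invented for this example and do not correspond to Cranelift's real builder API.

```rust
/// Only two options remain on memory ops (second change).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Endianness {
    Little,
    Big,
}

/// A load whose byte order is always explicit in the IR.
#[derive(Debug, PartialEq, Eq)]
struct LoadInst {
    offset: i32,
    endianness: Endianness,
}

/// Hypothetical builder that must be told the target's byte order up front
/// (first change; a full target description would work the same way).
struct FuncBuilder {
    target_endianness: Endianness,
    insts: Vec<LoadInst>,
}

impl FuncBuilder {
    fn new(target_endianness: Endianness) -> Self {
        Self { target_endianness, insts: Vec::new() }
    }

    /// Emit a load with an explicitly chosen byte order.
    fn load(&mut self, offset: i32, endianness: Endianness) {
        self.insts.push(LoadInst { offset, endianness });
    }

    /// Convenience for what used to be a "native" load: the builder resolves
    /// it to the target's byte order at construction time.
    fn load_target_endian(&mut self, offset: i32) {
        let endianness = self.target_endianness;
        self.load(offset, endianness);
    }
}
```

A producer targeting a big-endian platform would call `FuncBuilder::new(Endianness::Big)` once and keep using `load_target_endian` where it used "native" before; the emitted instructions carry `Big` explicitly, so an interpreter on any host sees the same semantics.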

fitzgen commented 3 years ago

Those two changes sound great to me! :+1:

bjorn3 commented 3 years ago

The CLIF IR has to be target-dependent one way or another, as you have to use the right pointer size.

uweigand commented 3 years ago

@cfallin maybe we can both make architecture features like byte order (or pointer size?) explicit in the IR and reduce the amount of changes to be introduced, by declaring global architecture properties just once in the IR, along the lines of how LLVM IR has a datalayout statement just once per file?

cfallin commented 3 years ago

@bjorn3:

> The CLIF IR has to be target-dependent one way or another, as you have to use the right pointer size.

Yes, that would probably be a part of the information too, like LLVM's DataLayout. (That said, varying pointer width is a different sort of nondeterminism concern than endianness: changing endianness directly alters the semantics of loads/stores, whereas changing pointer width just means that code with baked-in 32-bit-layout assumptions may overflow, and the semantics of each individual instruction are still well-defined. So from a "can the interpreter arrive at the one correct answer according to the semantics" perspective, it's not quite the same.)

@uweigand:

> declaring global architecture properties just once in the IR, along the lines of how LLVM IR has a datalayout statement just once per file?

Maybe, though I do like the aspect of CLIF that all attributes are per-function currently (which I suppose arose from the parallel-compilation-compatible design of keeping all IR data per function). Purely to keep to that principle I think it might be simpler to have a big_endian / little_endian attribute and a pointer32 / pointer64 attribute (among others?) on the function itself.
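To make the per-function idea a bit more concrete, here is a sketch of what such attributes could look like and how an interpreter might consult them. `FunctionAttrs`, `PointerWidth`, and `load_pointer` are hypothetical names for this example only; nothing here is existing Cranelift API.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Endianness {
    Little,
    Big,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum PointerWidth {
    P32,
    P64,
}

/// Hypothetical attributes carried by each function, so no global
/// per-file state is needed.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct FunctionAttrs {
    endianness: Endianness,
    pointer_width: PointerWidth,
}

/// Load a pointer-sized value from the start of `mem`, consulting only the
/// function's attributes and never the host's byte order or pointer size.
/// Returns `None` if `mem` is too short.
fn load_pointer(mem: &[u8], attrs: FunctionAttrs) -> Option<u64> {
    let width = match attrs.pointer_width {
        PointerWidth::P32 => 4,
        PointerWidth::P64 => 8,
    };
    let bytes = mem.get(..width)?;
    let mut value: u64 = 0;
    match attrs.endianness {
        // Most significant byte comes first in memory.
        Endianness::Big => {
            for &b in bytes {
                value = (value << 8) | u64::from(b);
            }
        }
        // Most significant byte comes last in memory.
        Endianness::Little => {
            for &b in bytes.iter().rev() {
                value = (value << 8) | u64::from(b);
            }
        }
    }
    Some(value)
}
```

Because the result is a function of the memory contents and the attributes alone, the same CLIF function would produce the same value whether it is interpreted on a little-endian or a big-endian host.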

bytecodealliance/wasmtime #3369