Open SpexGuy opened 3 years ago
When implementing this, note that Clang's way to handle something like (intptr_t)&x + 1 is to replace the cast with a char* cast and then do a ptrtoint cast at the end:
ptrtoint (i8* getelementptr (i8, i8* bitcast (i32* @x to i8*), i64 1) to i64)
This feels like a different type of memory address. Would something like *virtual T signify this?
This issue contains a description of how memory and pointers should work when aliased at compile time. As a task, this issue can be closed once stage 2 implements this behavior.
Definitions
Aggregate Types
Types in the following families are considered "Aggregate Types":
All other types are not Aggregate Types. If a type is not an Aggregate Type, then it is a Primitive Type.
Defined Representation
The following types have Defined Representation:
For any given target, instances of these types have a well-defined layout in memory. They can be passed between compilation units safely, without fear of misinterpretation. Additionally, their layout is known and well-defined at compile time. This list may look familiar: it is also the list of types which are allowed in an extern struct.
The following types do not have Defined Representation:
Top Level Variable
A "Top Level Variable" is a variable which is not part of something larger. In compiler parlance, it has its own provenance. The following are top level variables:
Everything else is not a top level variable:
Memory Islands
As described in #7770, memory at compile time is split up into separate "islands". A pointer to one island can only ever point to data in that same island. There are two types of memory islands: Literal and Virtual. Virtual memory islands may have child islands, which can be Literal or Virtual. Literal memory islands have Defined Representation.
Memory Islands are arranged into trees. Each Top Level Variable creates one tree based on its type. The Memory Island Tree for a type is constructed as follows:
If the type has Defined Representation, use a Literal memory island matching the size of the type. Otherwise, use a Virtual memory island. If the type is an Aggregate Type, add child islands for each member using this algorithm.
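The construction rule can be sketched as a small Python model. This is an illustration, not compiler code: TypeDesc, Island, and build_island_tree are hypothetical names, and whether a type has Defined Representation is passed in as a flag by the caller. The sketch also assumes one reading of the rule, namely that only Virtual islands receive child islands (matching the statement that Virtual islands may have children).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TypeDesc:
    """Hypothetical type description; the flags are supplied by the caller."""
    name: str
    defined_repr: bool                 # does the type have Defined Representation?
    size: Optional[int] = None         # byte size, only meaningful with defined_repr
    members: List["TypeDesc"] = field(default_factory=list)  # non-empty => Aggregate Type


@dataclass
class Island:
    kind: str                          # "literal" or "virtual"
    type_name: str
    size: Optional[int] = None         # literal islands match the size of the type
    children: List["Island"] = field(default_factory=list)


def build_island_tree(ty: TypeDesc) -> Island:
    # Defined Representation => one Literal island covering the whole value.
    if ty.defined_repr:
        return Island("literal", ty.name, ty.size)
    # Otherwise a Virtual island; aggregates get one child island per member.
    island = Island("virtual", ty.name)
    for member in ty.members:
        island.children.append(build_island_tree(member))
    return island


# Usage: "NoLayout" is a made-up type without Defined Representation.
u32 = TypeDesc("u32", defined_repr=True, size=4)
no_layout = TypeDesc("NoLayout", defined_repr=False)
mixed = TypeDesc("Mixed", defined_repr=False, members=[u32, no_layout])
tree = build_island_tree(mixed)
```

Here tree is a Virtual island with two children: a 4-byte Literal island for the u32 member and a childless Virtual island for the NoLayout member.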
Let's look at some examples to make this clearer:
Virtual Islands
Virtual islands are highly constrained. Pointers to virtual islands may be reinterpreted and added to at compile time, but attempting to read from a virtual island pointer that has been offset is a compile error.
When offsetting a pointer to a Virtual island, only the following values are allowed:
Attempting to offset a pointer to a Virtual island with any other value (e.g. offset of a field in an unrelated struct, or size of an unrelated frame) is a compile error.
@fieldParentPtr can be used to obtain the parent island. If @fieldParentPtr is used to obtain a pointer to a virtual island, and the actual parent is nonexistent or has a different type, it triggers an immediate compile error.
Literal Islands
In contrast, Literal islands behave like runtime memory, and can be fully type punned. You can reinterpret the memory as you see fit.
When offsetting a pointer to a Literal island, only the following values are allowed:
Comptime memory may contain undefined bits. Undefined can be specified at bit granularity, using packed structs. However, this can only be done in memory itself. If you attempt to load a primitive value (like u32), the entire value is undefined if any bit is undefined. For example:
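The load rule can be modeled in Python as follows. This is a sketch under stated assumptions, not the proposal's own example: a literal island is represented as a list of bits where None marks an undefined bit, and to_bits, store_bits, and load_uint are hypothetical helper names. Bit 0 is taken to be the least significant bit.

```python
from typing import List, Optional

Bit = Optional[int]  # 0, 1, or None for an undefined bit


def to_bits(value: int, bit_count: int) -> List[Bit]:
    """Split an unsigned integer into bits, least significant first."""
    return [(value >> i) & 1 for i in range(bit_count)]


def store_bits(memory: List[Bit], bit_offset: int, bits: List[Bit]) -> None:
    """Overwrite a bit range in place, as a packed-struct store might."""
    if bit_offset < 0 or bit_offset + len(bits) > len(memory):
        raise IndexError("store outside the island")
    memory[bit_offset:bit_offset + len(bits)] = bits


def load_uint(memory: List[Bit], bit_offset: int, bit_count: int) -> Optional[int]:
    """Load a primitive; the whole result is undefined (None) if any bit is."""
    bits = memory[bit_offset:bit_offset + bit_count]
    if any(b is None for b in bits):
        return None
    return sum(b << i for i, b in enumerate(bits))


island = to_bits(0x12345678, 32)
store_bits(island, 10, [None] * 3)   # make bits 10..12 undefined

whole = load_uint(island, 0, 32)     # None: one undefined bit poisons the u32
low_byte = load_uint(island, 0, 8)   # 0x78: this byte has no undefined bits
```

The memory itself still tracks which individual bits are undefined; only the loaded primitive collapses to a single undefined value.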
An additional consideration is that some values (like pointers and offsets) have Defined Representation, but do not have fully comptime known values. I'll refer to these as Deferred values. In many ways, Deferred values behave like undefined at compile time. You can't do math on them or otherwise inspect their value, but you can copy them from place to place.
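A rough Python model of this behavior: a Deferred value is an opaque token that can be stored and copied but refuses arithmetic and integer conversion. The class name and its label field are hypothetical, and Python's TypeError stands in for a compile error.

```python
class Deferred:
    """Opaque stand-in for a value that is only known later,
    e.g. an address resolved at link or load time."""

    def __init__(self, label: str, size: int):
        self.label = label   # debugging aid only, not the value itself
        self.size = size     # size in bytes is known even though the value is not

    def _refuse(self, *args):
        raise TypeError(
            f"cannot inspect deferred value '{self.label}' at compile time"
        )

    # Any attempt to do math on the value or turn it into a number is rejected.
    __add__ = __radd__ = __sub__ = __rsub__ = _refuse
    __mul__ = __rmul__ = __int__ = _refuse


# Copying from place to place is fine: the token is moved, never inspected.
addr_of_x = Deferred("&x", size=8)
slots = {"a": addr_of_x}
slots["b"] = slots["a"]
```

Evaluating addr_of_x + 1 or int(addr_of_x) raises in this model, while the copy into slots["b"] succeeds.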
Let's consider an example:
For a 64 bit target, the literal island backing a comptime-known Example instance might look like this:
By using type punning with a packed struct, I could overwrite some bits in the middle of this region:
This operation has created one byte that is partially undefined, and split the deferred value into two pieces. This ability to partially overwrite a deferred value has some pretty cool benefits:
Implementation
This section describes a simple but memory-inefficient implementation of the above, to show that all the offset rules can be implemented without too much work. Contributors, please note that the actual implementation probably will (and should) look very different.
The data structure described above can implement all parts of this proposal, but is not very efficient. An important optimization here comes from realizing that the expensive cases (bit spans, type punning) are actually exceedingly rare in comptime code. We can avoid a lot of work by using more optimized representations for more common cases. For example, we can keep all memory islands virtual up until a write through a type punned pointer is performed. At that point, the island needs to be converted to a literal region. But if the memory is never type punned, it can remain in a faster format for its whole lifetime.
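The lazy conversion idea can be sketched in Python. This is only an illustration of the optimization, with hypothetical names (LazyIsland, write_punned) and an assumed little-endian byte order: the island keeps a plain integer until the first type-punned write, then switches to raw bytes for the rest of its lifetime.

```python
class LazyIsland:
    """Holds an unsigned integer in a fast form until a type-punned
    write forces conversion to raw bytes."""

    def __init__(self, value: int, size: int = 4):
        self.size = size
        self._value = value   # fast representation
        self._bytes = None    # raw representation, created on demand

    @property
    def is_literal(self) -> bool:
        return self._bytes is not None

    def _materialize(self) -> None:
        # One-way conversion to the byte-level form (little-endian assumed).
        if self._bytes is None:
            self._bytes = bytearray(self._value.to_bytes(self.size, "little"))
            self._value = None

    def write_typed(self, value: int) -> None:
        if self._bytes is None:
            self._value = value
        else:
            self._bytes[:] = value.to_bytes(self.size, "little")

    def write_punned(self, offset: int, data: bytes) -> None:
        if offset < 0 or offset + len(data) > self.size:
            raise IndexError("write outside the island")
        self._materialize()
        self._bytes[offset:offset + len(data)] = data

    def read_typed(self) -> int:
        if self._bytes is None:
            return self._value
        return int.from_bytes(self._bytes, "little")


island = LazyIsland(0x11223344)
before = island.is_literal        # False: nothing has been type punned yet
island.write_punned(0, b"\xff")   # overwrite the lowest byte through a punned view
after = island.is_literal         # True: the island was converted
```

Typed reads and writes never trigger the conversion, so code that does no type punning stays on the cheap path.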
Open Questions
Linking Very Late Values
Pointers are sometimes resolved when the program is loaded, and never known by any part of the compiler. In order to have lazy bit slices of deferred pointer values, we need a way for the program loader to set this up. We need to make sure all loaders can do this.
An alternative, if some loaders can't, is to generate a simplified start code stub. This stub would only copy data, and would not be user programmable. Since it's entirely understood by the compiler, the compiler can check for circular references and issue a compile error in those cases.
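One way the circular-reference check could work is a topological sort over the stub's copy operations. The Python sketch below assumes a simplified model that is not spelled out in the proposal: each copy is a (destination, source) pair, and a destination can only be filled once its source is ready. order_copies is a hypothetical name; graphlib is the Python standard library module (3.9+).

```python
from graphlib import CycleError, TopologicalSorter


def order_copies(copies):
    """copies: iterable of (dst, src) pairs meaning dst is filled from src.

    Returns the destinations in an order where every source is ready
    before it is read. Raises ValueError for a circular reference,
    which stands in for the compile error."""
    graph = {}
    for dst, src in copies:
        graph.setdefault(dst, set()).add(src)   # dst depends on src
    try:
        order = list(TopologicalSorter(graph).static_order())
    except CycleError as err:
        # err.args[1] holds the nodes that form the cycle
        raise ValueError(
            f"circular reference in start-up copies: {err.args[1]}"
        ) from None
    return [name for name in order if name in graph]


plan = order_copies([("b", "a"), ("c", "b")])   # a is plain data, b and c are copied
```

A pair such as ("a", "b") together with ("b", "a") makes order_copies raise, since neither region can be filled first.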
Offset Order Dependency
As specified above, there's a peculiar ordering needed for field offsets. Suppose struct A contains B, which contains C, and I take an int pointer to an instance of A. I can offset that pointer by the offset of B, and then by the offset of C, to get a pointer to the innermost struct. But if I offset by the offset of C first, that's a compile error, because the compiler doesn't see how C could be related to A. This is true even if I intend to offset by B next. This might be fine, since adding offsets to each other at compile time is an error. But there might be a way to represent this properly without infinitely expanding memory. If so, a follow-up proposal for this is welcome.
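The ordering constraint can be illustrated with a small Python model. Everything here is hypothetical: the layouts, the byte offsets, and the names FieldOffset, IntPtr, and offset_by. The sketch assumes the compiler remembers which type an integer pointer currently refers to and only accepts a field offset whose parent is that type.

```python
from dataclasses import dataclass


class CompileError(Exception):
    """Stand-in for a compile error."""


# Made-up layouts: type name -> {field name: (byte offset, field type)}
LAYOUTS = {
    "A": {"b": (8, "B")},
    "B": {"c": (4, "C")},
    "C": {},
}


@dataclass(frozen=True)
class FieldOffset:
    parent: str   # type that owns the field
    name: str     # field name within that type


@dataclass(frozen=True)
class IntPtr:
    pointee: str          # type the compiler believes this address refers to
    byte_offset: int = 0  # distance from the start of the top level variable


def offset_by(ptr: IntPtr, off: FieldOffset) -> IntPtr:
    # The offset is only accepted if it belongs to the type currently pointed at.
    if off.parent != ptr.pointee:
        raise CompileError(
            f"offset of {off.parent}.{off.name} is unrelated to {ptr.pointee}"
        )
    delta, field_type = LAYOUTS[off.parent][off.name]
    return IntPtr(field_type, ptr.byte_offset + delta)


a_ptr = IntPtr("A")
# Outer field first, then inner field: accepted.
c_ptr = offset_by(offset_by(a_ptr, FieldOffset("A", "b")), FieldOffset("B", "c"))
```

Calling offset_by(a_ptr, FieldOffset("B", "c")) first raises CompileError in this model, even though the same two offsets applied in the other order reach the same address.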