Open andrewrk opened 5 years ago
This could be done with "butterfly" data stored before the pointed-to address, with the allocator's help.
Memory layout: the pointer address (`v`) sits between the two regions, with other data before it and the user data starting at it.
This is used in V8 (JS runtime) to allow fast indirection with attached metadata (infrequently accessed).
Basically, you store type info and undefined-ness on the left side of the butterfly.
`@intToPtr`: interop with C may break.
What if the user only defines a struct partially? Would there be an undefined bit for each field?
Existing `@ptrCast` with a C union will break (since the type info doesn't match).
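A minimal sketch of the butterfly idea, to make the layout concrete. The `Header` fields, the `Boxed` wrapper, and `headerOf` are hypothetical names for illustration, not part of the proposal, and the code assumes the builtin names of Zig 0.11 or later (`@intFromPtr`, `@ptrFromInt`) rather than the `@intToPtr` of the time:

```zig
const std = @import("std");

// Hypothetical metadata stored to the left of the user data.
const Header = extern struct {
    type_id: u32,
    is_undefined: u32,
};

// Backing storage: the header is immediately followed by the payload.
// `extern` gives a well-defined layout, so `payload` sits at offset 8.
const Boxed = extern struct {
    header: Header,
    payload: u64,
};

// Given only the user-visible pointer, step back over the header.
fn headerOf(payload: *u64) *Header {
    return @ptrFromInt(@intFromPtr(payload) - @sizeOf(Header));
}
```

User code only ever sees `*u64`; the metadata costs nothing on the hot path and is reachable with one subtraction when a check is needed.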
It's better to have generation + memory address (to prevent use-after-free with memory reuse).
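One way to read "generation + memory address" is a generational handle. The following is only a sketch under that assumption; `Handle`, `Slot`, and `Pool` are hypothetical names, the fixed-size pool stands in for a real allocator, and the syntax assumes Zig 0.11 or later:

```zig
const std = @import("std");

// A handle pairs a slot index (standing in for the address) with a generation.
const Handle = struct { index: u32, generation: u32 };

const Slot = struct { generation: u32 = 0, live: bool = false, value: i32 = 0 };

const Pool = struct {
    slots: [4]Slot = [_]Slot{.{}} ** 4,

    fn alloc(self: *Pool, value: i32) ?Handle {
        for (&self.slots, 0..) |*slot, i| {
            if (!slot.live) {
                slot.live = true;
                slot.value = value;
                return Handle{ .index = @intCast(i), .generation = slot.generation };
            }
        }
        return null; // pool exhausted
    }

    fn free(self: *Pool, h: Handle) void {
        const slot = &self.slots[h.index];
        if (slot.live and slot.generation == h.generation) {
            slot.live = false;
            slot.generation +%= 1; // invalidates every outstanding handle
        }
    }

    // Returns null for stale handles, even if the slot has been reused.
    fn get(self: *Pool, h: Handle) ?*i32 {
        const slot = &self.slots[h.index];
        if (!slot.live or slot.generation != h.generation) return null;
        return &slot.value;
    }
};
```

A type id alone cannot catch a stale pointer to memory that was freed and reallocated with the same type; the generation counter can.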
What if I want to abuse pointer casting to temporarily cast to a wrong type (for efficiency's sake)?
E.g. imagine a sentinel-based doubly linked list (like the intrusive lists in Boost). Something like:
```zig
const Item = struct {
    data1: Data1,
    data2: Data2,
    link: Link,
};
const Link = struct {
    next: *Item,
    prev: *Item,
};
const List = struct {
    sentinel: Link,
};
```
(In reality `List` and `Link` would be generic structs, of course.)
Instead of setting the `next` and `prev` pointers to null at the ends of the list (as Zig's `std` implementation does), they would point to the sentinel (the main benefit is that this avoids branching in list modification operations, compared to using nulls).
However, the sentinel is only a `Link`, not an `Item`. Strictly speaking, `next` and `prev` must have type `*Link` in order to be able to point to the sentinel, but that would generate extra, unnecessary pointer arithmetic during list iteration and traversal, since one would need to convert from `*Link` to `*Item` in order to return `*Item` to the caller. So it is potentially more efficient to pretend that `sentinel` is part of a larger imagined `Item` object. This also drastically simplifies list inspection in a debugger, as one can easily see the entire contents of the list items (instead of only the link data) by following the pointers.
This proposal seems to invalidate the respective implementation. The "workaround" function `@ptrCastUndef` also doesn't help here at all. Would one need to give up on tricks like that?
Edit: actually, maybe one doesn't need `@ptrCast` here; the implementation would mostly rely on `@field` and `@fieldParentPtr`, and I'm not sure whether those two would also be subject to runtime checks. One might, however, need to use `allowzero` pointers for `Item`s, as the formal `Item` pointer obtained for the sentinel might be zero, in which case some `@ptrCast`s would be necessary.
Also, what if I want to cast without knowing in advance whether the memory is undefined or already contains precious data? What if I want a pointer to memory containing garbage (e.g. returned by an allocator), which is not equal to `undefined`?
I'm excited about this one. This connects a lot of dots and is part of the unofficial Make The Safe Build Modes More Safe project (#2301).
Here are some of the features of Zig this depends on:
`undefined` is OK)

The proposal is to add a secret safety field to types which have no well-defined in-memory layout, similar to how unions have a secret safety tag field. The secret safety field has an integer which denotes the type id. A unique integer id will be generated for every type across an entire compilation.

Next, augment the rules about undefined values (see #1947) with this: in safe build modes, the bit pattern of `undefined` shall be `0xaa` (repeating) across the store size of the type, and for types which have no well-defined in-memory layout, the bit pattern `0xaa` repeated across the store size shall not match a valid state.

This makes it possible to add safety checks to `@ptrCast`, `@intToPtr`, and `@fieldParentPtr`. It will be detectable illegal behavior (see #2402) if the actual element type does not match the target type specified in the cast, or if the memory has an undefined value.

Sometimes it is desired to `@ptrCast` or `@intToPtr` when you know the memory is undefined. For these cases we introduce `@ptrCastUndef` and `@intToPtrUndef`, which simultaneously cast and assign `undefined` to the memory. These functions allow the programmer to change the type of memory in a legal way.