Open SpexGuy opened 3 years ago
From your description I get the pinned
attribute propagates "upwards" trough the container types, I guess the same is valid for optional types (even though those are not containers).
One more use-case to take into account, given a pinned value y: ?T
doing if (y) |x|
becomes a compile error due to the implicit copy into x
.
I like this addition a lot. Knowing that some structs will stay where I put them is going to be great in a lot of places.
However, I must note that this addition contributes to the vaugeness of the =
operator. There is now an even wider distinction between computing a value before assigning it versus assigning it before computing it, and knowledge thereof is becomming more and more neccesary to properly understand what's going on. I'm not sure if this is enough to warrant a new syntax, but I think that we should at the very least make a clear and logical distinction between the two in the docs that applies whenever this is revelant.
Should "pinned" be an attribute on the type or an attribute on the field?
Would we really a new CV-qualifier to enable per-value pinning? It seems that can be easily covered if pinned
was not just a modifier on anonymous types, but an operator on any applicable type:
const AlwaysPinned = pinned struct {
a: usize
};
const MaybePinned = struct {
a: usize
};
var x: AlwaysPinned;
var y: MaybePinned;
var z: pinned MaybePinned; // pinned just like x
attribute on the field
This would require CV qualifiers. Here's the case that is protected on types but not on fields:
fn foo(a: Allocator) void;
foo(std.heap.page_allocator.*);
type operator
What is the relationship between the value types T
and pinned T
? Coercion requires a copy, so you could add pinned but not remove it. Similarly with pointers, we would allow coercions that add pinned but not those that remove it. But now, under this system, if you ever use the type *Allocator
, you have introduced a bug. Your library is now incompatible with most allocators, because they tend to be pinned. Being pinned is part of the Allocator interface, even though some implementations may not need it.
Your library is now incompatible with most allocators, because they tend to be pinned.
For convenience, the standard library should of course make definitions such as pub const Allocator = pinned MovableAllocator
. Still no reason for qualifiers, as far as I understand.
given a pinned value
y: ?T
doingif (y) |x|
becomes a compile error due to the implicit copy intox
.
Couldn't x
just alias y
(if T is a pointer) or y.value
(otherwise)?
If I may suggest, my two cents ...
I think that the pinned
keyword fits well from the "memory" perspective; From the "programmer" perspective, something like ref
or byref
would be more self-explanatory.
Example:
//introduced pinned modifier on struct
const MyStruct = pinned struct {
//...
};
I think that this special kind of struct
should behave just like a regular one, except it cannot be copied; It works much like opaque
types that are allowed only as a reference, except that for pinned struct
it's ok to have its value stored somewhere during the initialization.
Example:
// Wrong, declaring by value must result in compile error
fn doSomething(arg: MyStruct) void {}
// Wrong, returning by value must result in compile error
fn returnSomething() MyStruct {}
// Corret, allocating on the heap, returning by reference
fn init(allocator: *Allocator) *MyStruct {
var ret = allocator.create(MyStruct);
ret.* = .{
// initialization ...
};
return ret;
}
const MyContainer = struct {
// It is ok to hold a pinned struct's value as a field
child: MyStruct,
};
fn main() void {
// parent holds a pinned field
var parent = MyContainer { ... };
// Correct, must be by ref
var child = &parent.child;
// Wrong, trying to copy, compile error
var bad = parent.child;
// Corret, storing on the stack, but referring as a pointer
// this pointer will not live after the function's end.
var myValue = &MyStruct {
// initialization ...
};
}
Question:
Should a parent struct holding a pinned struct
be pinned
too?
If yes, should it be implicit pinned (like in Rust), or have a compile error if it is not marked as pinned?
Example:
// This struct must be pinned too?
const MyContainer = struct {
// Holds a pinned struct
child: MyStruct,
};
// It's ok for this struct not be pinned
const AnotherContainer = struct {
// Holds just a reference
child: *MyStruct,
};
And finally, the built-in function @fieldParentPtr
must accept only pinned structs
as an argument, avoiding invalid references.
pub const Allocator = pinned MovableAllocator
Sure, that makes sense, but what is the use case for MovableAllocator? Why would anyone ever need to use it? Why does it need to coerce to Allocator? Introducing a modifier on all types (including pointers and others) is a pretty invasive change, arguably more complicated than adding a new CV qualifier. It needs solid use cases to back it up.
If I may suggest, my two cents ...
This is pretty much the behavior as described above. Pinned is specifically about memory, not the programmer perspective, so IMO that name fits better. There is one example that's different though:
// Wrong, returning by value must result in compile error
fn returnSomething() MyStruct {}
Zig uses result location semantics (#287, #2765, #2761) for struct return values, so this does not generate a copy and is allowed. Similarly, the semantic model for value parameters allows the compiler to pass them as pointers. We may consider in the future an extension that forces the compiler to pass pinned structs as constrained pointers, which would allow that example to compile as well.
Should a parent struct holding a
pinned struct
bepinned
too?
Yes, as noted in the proposal, structs containing pinned fields are implicitly also pinned.
the built-in function
@fieldParentPtr
must accept only pinned structs as an argument
I'm not convinced of this, I think there are valid uses for @fieldParentPtr
that do not require pinned types.
what is the use case for MovableAllocator?
as you said,
some implementations may not need [to be pinned]
Otherwise, it would not need to be public.
Introducing a modifier on all types [is] arguably more complicated than adding a new CV qualifier
I don't know very much about the compiler internals, but there seem to be around 70 base types and three qualifiers, which led me to the assumption that the former set would be easier to extend.
A disadvantage of making pinned
a type would be that nested pointers could not be coerced; I do not know whether this matters in practical use cases.
Just reread this proposal. One thing kind of jumps out from the motivation section:
A common pattern in Zig is to embed a struct in a larger struct, pass pointers to the inner struct around, and then use @fieldParentPtr to retrieve the outer struct. This is central to our current favored approach to runtime polymorphism. However, it's easy to accidentally make a value copy of inner structs like this, causing hard to debug memory corruption problems or segfaults.
A similar problem exists with types that may contain pointers to data within themselves. As an example, consider an array with an inline stack buffer. Copying this value would yield a broken object, with a pointer pointing into the previous instance.
Finally, at the extreme end of this is async frames. Async frames may contain pointers into themselves, and so cannot safely be copied.
I highlighted the parts that I, finally, noticed. From the programmer perspective, this is about copying values. Shouldn't it be called no_copy
or something similar? This seems a lot simpler to understand. Then all the rest seems to fall out naturally. If the compiler can prove RLS will work, then it is not a copy.
The capture case that @LemonBoy brought up is a good one.
It is a minor nit, but it seems to make it more clear what you are trying to do: stop anyone/anything from making a copy. What am I missing here? Is there some other aspect to pinned
that will be used later?
While I agree that nocopy
is a better description of how the attribute behaves, I think it conveys more poorly where this attribute should be used. For example, looking at the name alone, you may be tempted to put the nocopy
attribute on a large struct, because copying it may be expensive. Or you might want to put it on a data structure which in other languages would be move-only. However this would be a misuse of the feature. Personally I think pinned
better conveys the actual intent of the feature - it should only be used when the address of an instance is a meaningful part of its value.
However I do agree that this is not a clear-cut decision, I think @andrewrk will need to weigh in to decide this.
pinned
-> do not move something which is meaningful in Rust. But does not really say anything about copying in the common usage of the term.
no copy
-> clearly means do not/can not copy. But does not really say anything about moving (which would also need to be forbidden).
These seem orthogonal to me.
But this is devolving into bikeshedding so I will stop on that topic.
The proposal as a whole seems fairly good. I think there are a few things that are not clear to me or seem to make this a bit trickier than apparent:
pinned
be required for any type that includes another pinned type would be useful for later code reading. That helps with the above problem.One of the things I really like about Zig's philosophy is that the intent is to guide the programmer into doing the right thing without forcing guard rails. In this case, it makes me wonder if the Allocator
interface is actually exposing a different problem: maybe it is not the right way to do things? For example, there is a proposal about partial structs, #7969, that would also allow similar ease of coding and use IIUC. It also solves a number of other problems as noted in the #7969 proposal. I am not saying that the other proposal should be done, but it does solve at least one of the motivating problems of this proposal.
While I agree that
nocopy
is a better description of how the attribute behaves, I think it conveys more poorly where this attribute should be used.
I agree with @SpexGuy: pinned
is less likely to be used by accident or with the wrong intentions.
Let me ask something:
Yes, as noted in the proposal, structs containing pinned fields are implicitly also pinned.
Why do we need to enforce pinned
across all hierarchy?
I mean, for example:
var gpa1 = std.heap.GeneralPurposeAllocator(.{}) {};
var allocator1 = &gpa1.allocator;
// OK here
var mem1 = allocator1.alloc(u8, 32);
// Copy the parent struct,
var gpa2 = gpa1;
var allocator2 = &gpa2.allocator;
// But it's OK, allocator2 still valid
var mem2 = allocator2.alloc(u8, 32);
// Wrong, copy the pinned type, segfault here!
var allocator3 = gpa1.allocator;
var mem3 = allocator3.alloc(u8, 32);
Why not leave to the programmer set pinned
just where needed?
I see that Rust has a very strict requirement for lifetimes checks, but I think that is not Zig's case.
Here's a quick reaction after reading through this: Since the polymorphism/@fieldParentPtr use case is the immediate problem, perhaps a pinned
modifier for struct fields only, would be sufficient as a starting point. That way the struct with the fn that uses @fieldParentPtr is also responsible for pinning the corresponding Allocator field, and different uses of Allocator would not be impacted. If I've missed something crucial about why this won't work, then please just ignore this and no need for a detailed reply.
@jumpnbrownweasel pinned memory area are also important when you store "up-pointers" from children to its parents or similar.
@jumpnbrownweasel pinned memory area are also important when you store "up-pointers" from children to its parents or similar.
Yes, I understand. I got the impression reading this that there are some unresolved and complex design issues for the general case, and I was just suggesting that a pinned modifier for fields is all that is necessary for the @fieldParentPtr case, if a partial solution might make sense initially.
EDIT, added: For a struct type that is pinned because it contains self-referencing pointers, it makes sense that containing structs would also be pinned. But this restriction isn't necessary when pinning is to prevent the problem with @fieldParentPtr. These seem to be two different problems, which is also a reason I thought that @fieldParentPtr could be solved with a field modifier instead.
What about types that are "conditionally" pinned? For example, ?T
is also pinned if T
is pinned, but it's always OK to copy a null
. Same applies to tagged union, where only some tag payloads are pinned. It may even be legal to copy a pinned type if the value is undefined
. So maybe we need something like @unpin()
that casts away the pin
qualifier as an escape hatch?
So maybe we need something like @unpin() that casts away the pin qualifier as an escape hatch?
Interesting. I'd extend the question to a union type where some of the cases are pinned and some aren't. Here's the behavior I would expect:
const Variant = union(enum) {
allocator: Allocator, // this is pinned
num: usize, // this is not pinned
// ...
};
var variant = Variant { ... };
const v2 = variant; // compile error, variant as a whole is pinned because allocator is pinned
const n = variant.num; // ok
const a = variant.allocator; // compile error, cannot copy a pinned Allocator
switch (variant) {
.allocator => ..., // adding a capture |allocator| would be a compile error because it's a copy, but |*allocator| is ok
.num => |x| ..., // this is ok
}
I don't really see any problematic cases here though. I also couldn't come up with a problematic case for ?T
. @Hadron67 are you able to come up with any?
@marler8997 A potential problematic use case I can come up with is when working with container that does reallocation, like ArrayList
or HashMap
. Since reallocation may involves copy so will invalidate previous elements, calling insertion function will give a compile error if the value is pinned. But it should be ok if you append null
, undefined
, or union tag with non-pin payload to the container until you initialized or modified any of the elements to pinned value. Although this can be solved with initCapacity
for simple containers such as ArrayList
or HashMap
, it's not the case for more complicated containers, like StringHashMap(StringHashMap(V))
.
However I actually can't think of a satisfying workaround for this case other than not marking the value type as pinned (Edit: can be partially solved using [@sizeOf(V)] align(@alignOf(V)) u8
), and I don't know how something like @unpin
could help. Given that C++ has similiar problems with e.g. std::vector<std::atomic<int>>
, I think this may be a tradeoff of struct pinning.
I feel like there is no good way to implement this, since one may want to copy the parent struct. It's probably better for interface types to contain a pointer to the concrete type.
const Named = struct {
inner: *const c_void,
getNameFn: fn (*const c_void) []const u8,
pub fn getName(self: @This()) []const u8 {
return self.getNameFn(self.inner);
}
};
const Dog = struct {
name: []const u8,
fn asNamed(self: *const Dog) Named {
return .{ .inner = @ptrCast(*const c_void, self), .getNameFn = Dog_Named_getName };
}
};
fn Dog_Named_getName(self: *const c_void) []const u8 {
return @ptrCast(*const Dog, @alignCast(@alignOf(Dog), self)).name;
}
pub fn main() !void {
const std = @import("std");
var obj = Dog{ .name = "woof" };
const named = obj.asNamed();
std.log.info("{s}", .{named.getName()});
}
It's probably better for interface types to contain a pointer to the concrete type.
I think so too, especially in light of the discovery (https://github.com/ziglang/zig/issues/10037#issuecomment-952205513) that this method seems to be friendlier to optimization (devirtualization). Not to mention the other advantage of this approach: that the implementation type doesn't need to embed or "know about" the interface.
Is it time to stop calling fieldParentPtr the "favored approach to runtime polymorphism"?
This would be useful when dealing with platform specific contexts (ie. ucontext_t
). For example, on macos:
pub const ucontext_t = extern struct {
onstack: c_int,
sigmask: sigset_t,
stack: stack_t,
link: ?*ucontext_t,
mcsize: u64,
mcontext: *mcontext_t,
__mcontext_data: mcontext_t,
};
It's a footgun since you can trivially copy the linux version of this structure, but on macos the mcontext
pointer has to point to wherever __mcontext_data
is. I discovered this when I accidentally relocated one of these structures, and was trampling over my stack elsewhere.
This proposal supersedes #3803 and #3804 as the plan going forward for pinned structs, as discussed in the design meetings.
Motivation
A common pattern in Zig is to embed a struct in a larger struct, pass pointers to the inner struct around, and then use
@fieldParentPtr
to retrieve the outer struct. This is central to our current favored approach to runtime polymorphism. However, it's easy to accidentally make a value copy of inner structs like this, causing hard to debug memory corruption problems or segfaults.A similar problem exists with types that may contain pointers to data within themselves. As an example, consider an array with an inline stack buffer. Copying this value would yield a broken object, with a pointer pointing into the previous instance.
Finally, at the extreme end of this is async frames. Async frames may contain pointers into themselves, and so cannot safely be copied.
Where possible, the compiler should give an error if you try to copy any of these types. This is also important for the current plan for Result Location Semantics (see #2765), which is very implicit. In the case of these types, it's extremely important to know if RLS is working (and to fail the compile if not).
Proposal
Introduce the
pinned
keyword. This keyword can appear as a modifier to an anonymous type declaration, similar topacked
andextern
.pinned
may coexist withpacked
orextern
, and is compatible withstruct
andunion
types. It is not allowed (or ever needed) onopaque
, because opaque values cannot be copied. It is also not allowed onenum
orerror
, because those types cannot contain self pointers.If a type is
pinned
, it means that the address of the type is part of its value, and the compiler is not allowed to copy an instance to a new location. But braced initialization is still allowed, and the address may be forwarded through result location semantics.Any composite type which contains a field which is pinned is also considered pinned, and the same restrictions apply. This includes error unions and optionals, so #2761 will also be important for working with these types.
Frame types will also be considered pinned.
It's important to note that, despite some superficial similarities,
pinned
is not related to RAII. Pinned will not enforce that your deinitialization code is run, nor will it prevent you from overwriting fields that needs to be cleaned up, or even whole structs with braced initializers. It only prevents you from accidentally relocating an existing object.Examples
Other Notes
Though we have chosen here to put
pinned
on type declarations, for some uses it is actually a property of values. For example, an allocator with no attached state would not need to be pinned. However, introducingpinned
as a new CV-qualifier seems like a much more invasive change, and it's not clear that it's worth it.