Julep: setfield! for mutable references to immutables

vtjnash commented 8 years ago

We can only call setfield! on a mutable, so calling it on an immutable has been an error. This makes it hard to efficiently construct immutable objects incrementally. To fix this, we propose making it possible to have setfield! modify fields inside of immutable objects that are wrapped in mutable objects. As will be shown later, this wouldn't alter existing semantics. This proposal is also an implementation of the concept that the object nursery is editable and that the object should become immutable once construction is done.

To support this proposal, the setfield! function will get a multi-arg form with the following behavior: setfield!(x, a, b, c, value) mutates the rightmost mutable object on the path, changing the value of its field so that the result is equivalent to copying the immutable objects below it and updating the referenced field.

Then in the front-end, we will lower the syntax form x.a.b.c = value to the multi-arg setfield! (instead of the current lowering that mixes setfield! and getfield). In the mutable case, this will not change any behavior since it still assigns to the right-most field. In the immutable case, it would now be possible to assign through a mutable object, resulting in a transparent/fused copy/assignment that maintains the semantics of both the update of a mutable field with a new copy, and the fixedness of the immutable value. This lowering change has precedent, since it would be semantically similar to the way that setindex! is handled. Oscar tells me that Inference already has the logic to ensure this lowers efficiently to avoid allocating extra values, and codegen will also be able to handle this efficiently.

tl;dr The syntax:

x.a.b.c = 3

would now be valid, as long as at least one of the referenced fields is mutable.
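A minimal sketch of the intended semantics, written in current Julia syntax (`struct` / `mutable struct` rather than the `immutable` keyword of the time). The type names `Inner`, `Mid` and `Outer` are made up for illustration. The nested assignment is rejected by present-day Julia, and the last statement spells out by hand the copy-and-update that the proposed lowering would perform on the rightmost mutable object:

```julia
# Hypothetical types: only Outer is mutable.
struct Inner
    c::Int
end

struct Mid
    b::Inner
end

mutable struct Outer
    a::Mid
end

x = Outer(Mid(Inner(1)))

# Today this fails, because it lowers to a field assignment on the immutable x.a.b.
threw = try
    x.a.b.c = 3
    false
catch
    true
end

# What the proposal would make `x.a.b.c = 3` mean: rebuild the immutable
# part of the path and store it into the rightmost mutable object (x).
x.a = Mid(Inner(3))
```

After the last line, `x.a.b.c` reads back as the new value, while no immutable value that existed before was changed in place.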

(opened at the insistence of @carnaval, to summarize recent discussions)

yuyichao commented 8 years ago

https://github.com/JuliaLang/julia/issues/11902 ?

carnaval commented 8 years ago

yep, just more general. could generalize to arrays if we folded get/setfield and arrayref/set

Keno commented 8 years ago

Yes, this is #11902 + extra syntax. Do note the discussion on constraints currently enforced by immutable constructors though.

Keno commented 8 years ago

Of course since that was posted, I've pretty much come around that there is a distinct difference between regular immutables and those that enforce extra constraints (e.g. Rational). I think it would be fine to allow this in general and have a separate syntax (@sealed immutable) to forbid this and only allow construction by the constructor.

yuyichao commented 8 years ago

For the purpose of atomic operations, I was thinking about generalizing the Ref type to do this. The setfield/arrayset/atomics can be implemented using intrinsics/builtins that operate on the pointer to the slot and the owner object (for wb).

vtjnash commented 8 years ago

Note also that unlike #11902, this doesn't use setindex!, but instead takes advantage of the fact that accessing the fields of an object through . is undefined behavior for program semantics.

StefanKarpinski commented 8 years ago

Nice. a[i].x = v would also be allowable when a is indexable and mutable (e.g. a vector), as would a.b[i] = v even when a.b is something indexable but immutable like a tuple, as long as a is mutable. It's a little harder to know what the lowering should be in those cases.

yuyichao commented 8 years ago

That's exactly why I was thinking about using a Ref to do this.

E.g. a[i].x = v could be rewritten to store!(RefField(Ref(a, i), :x), v). This might be a much bigger change though so it would be good to have the a.b.c = x case handled first.
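A rough sketch of how such a reference chain could behave, in current Julia. Everything here is an assumption made for illustration: `IndexRef` stands in for `Ref(a, i)`, `RefField`, `load`, `store!` and the helper `withfield` are hypothetical names, and the real proposal would use builtins operating on the slot pointer rather than rebuilding values through constructors:

```julia
# Hypothetical reference to slot `index` of a mutable, indexable container.
struct IndexRef{A,I}
    parent::A
    index::I
end

# Hypothetical reference to field `name` of the value behind another reference.
struct RefField{R}
    parent::R
    name::Symbol
end

# Copy of an immutable with one field replaced (relies on the default constructor).
function withfield(obj::T, name::Symbol, v) where {T}
    vals = ntuple(i -> fieldname(T, i) === name ? v : getfield(obj, i), fieldcount(T))
    return T(vals...)
end

load(r::IndexRef) = r.parent[r.index]
load(r::RefField) = getfield(load(r.parent), r.name)

# Storing into an array slot is a plain mutation of the array.
store!(r::IndexRef, v) = (r.parent[r.index] = v; nothing)

# Storing into a field of an immutable: copy, update, store back into the parent slot.
store!(r::RefField, v) = store!(r.parent, withfield(load(r.parent), r.name, v))

struct P
    x::Int
    y::Int
end

a = [P(1, 2), P(3, 4)]
store!(RefField(IndexRef(a, 2), :x), 10)   # stands in for `a[2].x = 10`
```

The array `a` is the only object mutated; the element in slot 2 is replaced by a copy with `x` updated and `y` unchanged.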

toivoh commented 8 years ago

I like this idea, but would like to note that it makes immutables behave semantically different than mutables, not just being more restricted as they are right now. Example: assign the same immutable to two fields, then modify it through one of them.

toivoh commented 8 years ago

Would immutables bound to non-const local variables be eligible for mutation as well?

rfourquet commented 8 years ago

@toivoh Probably not according to this question and Jeff's answer below it.

toivoh commented 8 years ago

Well, I would say that Jeff's answer (quoted from #5333)

Yes, we dislike that idea since it means x.baz = 3 may or may not be a mutating operation depending on the type of x. That code will silently do subtly different things for different types of x.

applies equally to the whole idea of this Julep, not just the case of immutables stored in local variables?

toivoh commented 8 years ago

To get rid of the subtly different behavior depending on the runtime type of x, we could require marking the thing (semantically) being mutated with special syntax. E.g.

x (.a.b.c)= 3  # x, a and b are immutable; the binding for x is changed
x.a (.b.c)= 3  # a and b are immutable; x.a is changed
x.a.b (.c)= 3  # b is immutable; x.a.b is changed

This could be parsed into

x = setfield(x, :a, :b, :c, 3)  # x, a and b are immutable; the binding for x is changed
setfield!(x, :a, :b, :c, 3)     # a and b are immutable; x.a is changed
setfield!(x.a, :b, :c, 3)       # b is immutable; x.a.b is changed

etc. where setfield! would always mutate its first argument directly, and not recurse into mutable fields. This would be semantically equivalent to

x = setfield(x, :a, :b, :c, 3)  # x, a and b are immutable; the binding for x is changed
x.a = setfield(x.a, :b, :c, 3)  # a and b are immutable; x.a is changed
x.a.b = setfield(x.a.b, :c, 3)  # b is immutable; x.a.b is changed

where setfield (without trailing !) would return a copy of the immutable with the given field changed.

It's not so easy to find an appealing syntax for this though, which is lightweight, makes it clear what happens, and is available.

Another thing that the example highlights is that the proposal so far is a bit like adding += without introducing the + operator. Should there be a syntax for what I call setfield(x, :a, :b, :c, 3) above, i.e. copying x and changing a (sub-) field? Could it be made consistent with the mutating syntax, in a similar manner that += is consistent with +?
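For concreteness, the non-mutating operation could be sketched in current Julia as below. The name `setfield_copy` (chosen to avoid suggesting that a `setfield` function exists in Base), the helper `withfield`, and the types `A1`, `B1`, `C1` are all hypothetical, and the sketch assumes every type on the path has its default constructor:

```julia
# Copy of an immutable with one field replaced (relies on the default constructor).
function withfield(obj::T, name::Symbol, v) where {T}
    vals = ntuple(i -> fieldname(T, i) === name ? v : getfield(obj, i), fieldcount(T))
    return T(vals...)
end

# setfield_copy(x, :a, :b, :c, 3): return a copy of x in which x.a.b.c is 3.
# `args` is either (value,) or (more_field_names..., value).
function setfield_copy(obj, name::Symbol, args...)
    inner = length(args) == 1 ? args[1] : setfield_copy(getfield(obj, name), args...)
    return withfield(obj, name, inner)
end

struct C1
    c::Int
    d::Int
end

struct B1
    b::C1
end

struct A1
    a::B1
end

x = A1(B1(C1(1, 2)))
y = setfield_copy(x, :a, :b, :c, 3)   # x itself is left untouched
```

With this in hand, the three mutating forms above reduce to an ordinary assignment (to a binding or to a mutable field) whose right-hand side is a `setfield_copy` call, which is the same relationship `+=` has to `+`.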

andyferris commented 8 years ago

I like this idea a lot!

Couldn't immutables call their constructor whenever they are copied to the stack (i.e. directly assigned to a variable, as an immutable)? One simple way would be to have the user define a constructor from the Ref of the type:

"`Pos` stores a positive number"
immutable Pos{T}
    val::T
    Pos(x) = x >= 0 ? new(x) : error("must be positive")
    Pos(r::RefValue{Pos{T}}) = r[].val >= 0 ? new(r[].val) : error("must be positive")
end

There would be a default, costless constructor. If the user defines any constructor, and they want to enable copying from the heap (where someone might have used a pointer to mutate the values) then they would need to have explicitly defined such a constructor.

That way it is free in the usual case, and invariants are still defined when the variable is a "direct" immutable (or a field of an immutable, etc). Users could add an extra "unsafe" constructor, for efficiency in certain circumstances. Trying to worry about whether it follows invariants on the heap seems utterly hopeless, but e.g. whenever f(p::Pos) is called, then f can be sure that the invariant holds.
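A sketch of that idea in current Julia syntax, under stated assumptions: `load_checked` is a hypothetical hook standing in for "the constructor that runs when the value is copied out of a heap slot", and the raw pointer write only simulates code that bypasses the constructor (it relies on `Pos{Int}` being stored inline in the `Ref`):

```julia
struct Pos{T}
    val::T
    # Checking inner constructor: the only sanctioned way to build a Pos.
    Pos{T}(x) where {T} = x >= 0 ? new{T}(x) : error("must be non-negative")
end
Pos(x::T) where {T} = Pos{T}(x)

# Hypothetical hook: re-run the checking constructor when loading from a heap slot.
load_checked(r::Ref{Pos{T}}) where {T} = Pos{T}(r[].val)

r = Ref(Pos(1))
ok = load_checked(r)          # invariant holds, value passes through

# Simulate someone mutating the slot behind the constructor's back.
GC.@preserve r unsafe_store!(Ptr{Int}(pointer_from_objref(r)), -1)

# The invalid heap contents are caught at the point of copying out.
rejected = try
    load_checked(r)
    false
catch
    true
end
```

The heap slot may temporarily hold a value that violates the invariant, but any function receiving a `Pos` obtained through `load_checked` can rely on it.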

stevengj commented 8 years ago

@toivoh, I don't think Jeff's comment from #5333 applies here. x.y.z = 7 would always be a mutating operation in this proposal. It may not be defined if x.y is not a mutable reference, but that's no different from saying that foo(x) may not be defined depending on the type of x.

toivoh commented 8 years ago

I think that Jeff's comment still applies. Even if we restrict ourselves to the cases when x.y.z = 7 is a mutating operation, it would mutate different things depending on the circumstances: x or x.y.

The effect is very similar to the one that Jeff talks about in the comment: in some cases the mutation will be visible in another variable that was, e.g., previously initialized as v = x.y, and in others (when x.y is immutable) it will not.

StefanKarpinski commented 8 years ago

@toivoh: I think your point is fairly subtle (I'm having trouble following it) and would be clearer if you can spell out a case where the difference in behavior is externally visible.

yuyichao commented 8 years ago

I think the point is that

v = x.y
x.y.z = t

may or may not mutate v. I personally don't feel like this is a big issue though. The same thing can already be achieved with today's model, just much less efficiently.
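The two outcomes can be reproduced in current Julia. The type names are hypothetical, and since the proposed lowering does not exist, the immutable case is written out by hand as the copy-and-update it would be equivalent to:

```julia
struct PointI            # immutable
    z::Int
end

mutable struct PointM    # mutable
    z::Int
end

mutable struct Holder{T}
    y::T
end

# x.y is mutable: v aliases the same object, so the update is visible through v.
xm = Holder(PointM(1))
vm = xm.y
xm.y.z = 5

# x.y is immutable: emulate the proposed `xi.y.z = 5` as a store of an updated copy.
xi = Holder(PointI(1))
vi = xi.y
xi.y = PointI(5)
```

After running this, `vm.z` has followed the update while `vi.z` still holds the old value, even though the two assignments would be spelled identically under the proposal.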

vtjnash commented 8 years ago

Right. Another equivalent case is:

x = y # this is a (lazy) copy for immutables but a duplicate reference for mutables
y.z += 1 # the difference in `=` above is reflected in, or because of, the behavior of this
# did x get modified? did y get modified? did it throw an error?

Julep: setfield! for mutable references to immutables #17115