General call assignment

dgkf / R

An experimental reimagining of R

GNU General Public License v3.0


Call assignments such as environment(f) <- new.env() are currently not supported. There are two questions:

Do we want this at all?
How to implement this?

Regarding the first question, I think the syntax actually reads quite nicely, and I see no reason not to keep it from R.

Regarding the second question, I think this needs a change to how mutability is handled. In the PR that implements value semantics for lists, the call_mut case is currently not handled for general function calls: https://github.com/sebffischer/R/blob/3e3499117e5dd7b2c0a31084ec37fb0a94eff9bf/src/lang.rs#L950.

I think it might be nice if we could just assign to any function call on the lhs. I imagine this like below:

nas = fn(x) {
  x[is.na(x)]
}
x = c(NA, 1, 2, 3, NA)
nas(x) = -99
x
#> [1] -99 1 2 3 -99

However, this would (I believe) require some more changes to how mutability is handled. When we call nas(x) = -99, the lhs is evaluated mutably, which means that when nas evaluates its argument x, it receives a mutable view of the variable.

The resulting behavior can then be surprising if the function called on the lhs of the assignment modifies its argument in place:

add_99 = fn(x) {
  x[1:length(x)] = 99
  x[1]
}
x = 1:2
add_99(x) = 10
x
#> [1] 10 99

While one might put this off as a weird example, I still think we should protect from such weirdness.
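
To make the hazard concrete, here is a minimal Rust sketch of the same situation, with a `&mut` slice standing in for the mutable view. This is only an analogy under that assumption, not how the interpreter represents views; `add_99` mirrors the example above and `assign_through_add_99` is an illustrative name.

```rust
/// Stand-in for `add_99`: overwrites every element in place, then
/// returns a mutable view of the first element.
fn add_99(x: &mut [i32]) -> &mut i32 {
    for v in x.iter_mut() {
        *v = 99;
    }
    &mut x[0]
}

/// Models `x = 1:2; add_99(x) = 10; x`.
fn assign_through_add_99() -> Vec<i32> {
    let mut x = vec![1, 2];
    // The callee's in-place write leaks into `x` before the
    // assignment through the returned view takes effect.
    *add_99(&mut x) = 10;
    x
}
```

Calling `assign_through_add_99()` yields `[10, 99]`: the second element was changed even though the assignment only targeted the first.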

I think the contract that a function called on the lhs should fulfill is that it does not mutate any of the arguments that are part of its return value. (To simplify this in the beginning, we could just say that such functions are not allowed to mutate any of their arguments. That avoids having to infer how return values are related to arguments, which might be pretty tricky.) To implement this, we could annotate such arguments as being immutable, e.g. like below:

first = fn(x: immut) {
  x[1]
}
x = 1:2
first(x) = 10
x 
#> [1] 10 2

Maybe this would also serve as a good, simple use case for adding annotations to function arguments, which we might later extend to proper (optional) type annotations.

From an implementation perspective, I think all this needs is to add a mutable field to the ObjCow, which determines whether the object can be written to, which should be relatively straightforward.
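
As a rough illustration of that idea, the sketch below attaches a `mutable` flag to a shared handle and rejects writes when the flag is off. Everything here is hypothetical: `CowSketch`, `as_immutable`, `set` and `to_vec` are made-up names, and the real `ObjCow` may be structured quite differently.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for a copy-on-write handle with a write flag.
#[derive(Debug, Clone)]
struct CowSketch {
    data: Rc<RefCell<Vec<f64>>>,
    /// Whether writes through this handle are permitted.
    mutable: bool,
}

impl CowSketch {
    fn new(values: Vec<f64>) -> Self {
        CowSketch { data: Rc::new(RefCell::new(values)), mutable: true }
    }

    /// A handle to the same data that refuses writes, as an
    /// `immut` argument might receive.
    fn as_immutable(&self) -> Self {
        CowSketch { data: Rc::clone(&self.data), mutable: false }
    }

    /// Writes `value` at `index`, unless this handle is immutable.
    fn set(&self, index: usize, value: f64) -> Result<(), String> {
        if !self.mutable {
            return Err("cannot write through an immutable view".to_string());
        }
        let mut data = self.data.borrow_mut();
        match data.get_mut(index) {
            Some(slot) => {
                *slot = value;
                Ok(())
            }
            None => Err(format!("index {} out of bounds", index)),
        }
    }

    /// Copies out the current contents.
    fn to_vec(&self) -> Vec<f64> {
        self.data.borrow().clone()
    }
}
```

In this sketch, a function body holding the result of `as_immutable()` cannot modify the data, while the assignment itself still writes through the original handle and the change is visible through both.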

On `R`'s `fn<-`

I've always found this call-assign feature of R to be super weird. I don't think it comes with any performance gains in R, and it adds a really weird API that requires you to know precisely which functions provide this capability.

To use a call-assign function, you have to know that:

<- does not always get parsed as (<-, lhs, rhs)
When the left side of assignment is a function, it gets parsed as a new function (fn<-, ..)
Specifically, the call gets restructured as (fn<-, arg1, args2..n, value = rhs)
And the first argument to fn<- is "mutated" (automatically reassigned by the result)
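
The rewrite described in these points can be sketched as a small AST transformation. This is a simplified model, not the parser of either R implementation: the `Expr` type and `desugar_call_assign` are hypothetical, named arguments are not modeled (GNU R passes the replacement value as the final argument, named `value`), and nested targets such as `names(x)[2] <- "b"` are not handled.

```rust
/// Hypothetical, heavily reduced expression type.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Symbol(String),
    Number(f64),
    Call(String, Vec<Expr>),
    Assign(Box<Expr>, Box<Expr>),
}

/// Rewrites `f(x, args...) <- rhs` into `x <- f<-(x, args..., rhs)`,
/// leaving every other expression untouched.
fn desugar_call_assign(expr: Expr) -> Expr {
    match expr {
        Expr::Assign(lhs, rhs) => match *lhs {
            Expr::Call(name, mut args) => {
                if args.is_empty() {
                    // Nothing to reassign; leave `f() <- rhs` as written.
                    return Expr::Assign(Box::new(Expr::Call(name, args)), rhs);
                }
                // The first argument is the target that gets reassigned.
                let target = args[0].clone();
                // The replacement value goes last.
                args.push(*rhs);
                Expr::Assign(
                    Box::new(target),
                    Box::new(Expr::Call(format!("{}<-", name), args)),
                )
            }
            // Plain `x <- rhs` and anything else stays as it is.
            other => Expr::Assign(Box::new(other), rhs),
        },
        other => other,
    }
}
```

Under this model, `names(x) <- 1` becomes `x <- names<-(x, 1)`, which makes the implicit reassignment of the first argument explicit.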

For me this expects too many leaps of logic to be useful for what amounts to a shorthand for x <- fn(x, ...). At best it lumps a related getter and setter together (e.g. names() vs names<-), and at worst it changes the function's behavior (e.g. regmatches() vs regmatches<-).

On Generalized Call Assignment

Now, all of those concerns hold for situations where fn() and fn<- are distinct behaviors. You raise a very interesting case for fn<- behaviors that are exactly (<-, fn(), rhs) because fn() just returns a mutable reference.

As we've been exploring the mutable view idea, I've imagined it as something that users should not have to be concerned about. As far as users are concerned, all data is always copied and the mutable views only exist to avoid these copies when they're unnecessary.

But this fn<- context puts it in a slightly different light where we could consider the lhs of assignment as opting-in to mutation.

On `fn<-(x)` vs `x[fn()] <-`

Before committing to different fundamental behaviors for different sides of assignment, I'd like to see some compelling use cases that couldn't be implemented any other way. In all examples that I can think of, leaving mutability to only indexing operators still seems sufficient. It just means rewriting those examples to return indices instead of a mutable reference:

nas = fn(x) x[is.na(x)]
nas(x) = -99

# could be easily rewritten as
nas = fn(x) is.na(x)
x[nas(x)] <- -99

This example is a bit trivial, but if it was some more sophisticated way of constructing a view of x, is there a situation where you couldn't just tuck the function into the indexing operator to produce that view?

Nothing is jumping to mind for me.
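
The same separation can be sketched in Rust, where the helper only computes indices and the single mutation happens at the indexing step. This is an analogy rather than interpreter code: `Option<f64>` stands in for values that may be NA, and `assign_at` is a hypothetical name for the `x[idx] <- value` step.

```rust
/// Indices of the missing (`None`) entries, playing the role of `is.na(x)`.
fn nas(x: &[Option<f64>]) -> Vec<usize> {
    x.iter()
        .enumerate()
        .filter(|(_, v)| v.is_none())
        .map(|(i, _)| i)
        .collect()
}

/// Models `x[idx] <- value`: the only place where `x` is mutated.
fn assign_at(x: &mut [Option<f64>], idx: &[usize], value: f64) {
    for &i in idx {
        x[i] = Some(value);
    }
}
```

Because `nas` only borrows `x` immutably, it cannot modify the vector as a side effect, which rules out the surprising in-place behavior by construction.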
