Metaprogramming bounds syntax

dgkf commented 11 months ago

One thing that has been a central theme of Jan Vitek's work is that R's meta-programming facilities come at a pretty steep performance cost, even though they are used in a small subset of functions. Even though performance isn't a goal, I think the future of an R-alike language should probably consider these ideas and I'm interested in mocking them up.

Argument Passing

R's defaults are quite nice. Arguments are passed as promises that aren't actually evaluated until they're needed. Even before they're evaluated, their expressions can be rearranged and evaluated in different contexts. This is the central feature of R's meta-programming, but it means that most functions carry forward the machinery for meta-programming even if it would have little impact had the arguments all been eagerly evaluated.
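As a small illustration of this behavior in current R (the helper names `first_only` and `capture` are made up for this sketch), an argument that is never used is never evaluated, and `substitute()` can recover an argument's expression so that it can be evaluated somewhere else:

```r
# Hypothetical helper: `y` is never touched, so its promise is never forced
# and the stop() call never runs.
first_only <- function(x, y) x
first_only(1, stop("never evaluated"))
#> [1] 1

# Hypothetical helper: return the unevaluated expression behind the promise.
capture <- function(expr) substitute(expr)
e <- capture(a + b)

# Evaluate the captured expression in a different context, here a list
# that supplies values for `a` and `b`.
eval(e, list(a = 1, b = 2))
#> [1] 3
```

Both functions pay for promise creation on every call, even though only `capture` does anything that eager evaluation could not.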

For this purpose, I'm considering a default of eager evaluation, with a syntax for lazy evaluation. The exact syntax is very much up for debate, but the crux is that individual arguments can be flagged as lazy:

Example using . "context" syntax
not_null_else <- function(a, .b) {
  # a eagerly evaluated in parent frame
  if (!is.null(a)) a
  else b  # b not evaluated until here as it is just a promise
}

not_null_else(
  loot_chest(1, 2, 3, 4, 5), 
  stop("that password was incorrect")
)
Here the . syntax is borrowed from this proposal, which uses the . to mean something like "in this context". Although not a direct mapping of the concept, it evokes a sense of contextual ambiguity at the interface of the calling frame and evaluation frame.
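For comparison, roughly the same semantics can be approximated in current R by forcing the eager argument by hand. This is only a sketch of the intended behavior, not the proposed syntax, and plain values stand in for `loot_chest()`:

```r
not_null_else <- function(a, b) {
  force(a)            # emulate eager evaluation of `a`
  if (!is.null(a)) a
  else b              # `b` stays an unevaluated promise until this line
}

# `a` is not NULL, so the stop() call in `b` never runs.
not_null_else(1, stop("that password was incorrect"))
#> [1] 1

# `a` is NULL, so `b` is finally evaluated and the error surfaces.
tryCatch(
  not_null_else(NULL, stop("that password was incorrect")),
  error = function(e) conditionMessage(e)
)
#> [1] "that password was incorrect"
```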

This would also put nice bounds on when tail calls are permitted. When a recursive function requires lazily evaluated arguments, a standard evaluation model can be used, while functions that take only eager arguments can leverage tail call optimizations.

Declaring a `static` function

[!NOTE]
Feedback needed: What is the right name for this behavior?

A static function could be an even more restrictive constraint, which states that a function:

- only uses parameters and variables defined in its current scope
- does not evaluate any expressions in other environments (including promises in other environments)
- only calls out to other static functions

This would allow for much more intensive and useful static code analysis and optimization. I'm a long ways off from even considering such ambitions, but I'd like to get the conversation started on whether it would be worth the cognitive overhead. This is intended to address the closing thought of Jan Vitek's R melts brains.
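To give a flavor of the kind of check this might enable, here is a rough sketch in current R using `codetools::findGlobals()` (from the codetools package that ships with R) to list the free variables of a function. It only approximates the first constraint above and says nothing about the other two; the function names are hypothetical:

```r
# Reads `factor_` from an enclosing environment: would not qualify.
scale_by <- function(x) x * factor_

# Only uses its own parameters: a candidate for the first constraint.
scale_by_static <- function(x, factor_) x * factor_

# Free (non-local) variables referenced by each function.
codetools::findGlobals(scale_by, merge = FALSE)$variables
#> [1] "factor_"
codetools::findGlobals(scale_by_static, merge = FALSE)$variables
#> character(0)
```

In today's R such a check is only advisory, since any callee could still reach into other environments. A `static` declaration would presumably let the language enforce it.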

Example using `static` keyword

f <- static function(n, if_even, if_odd) {
  if (n > 0) f(n - 2, if_even, if_odd)
  else if (n == 0) if_even
  else if_odd
}
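Because every argument of `f` would be eagerly evaluated and the recursive call sits in tail position, an implementation could in principle run it in constant stack space. As a rough sketch of what that amounts to, written by hand in current R under the hypothetical name `f_loop` (no claim that this is how an interpreter would actually do it):

```r
# Loop form of the tail-recursive `f` above: each "recursive call" just
# rebinds `n`, so no new frames or promises are created.
f_loop <- function(n, if_even, if_odd) {
  while (n > 0) n <- n - 2
  if (n == 0) if_even else if_odd
}

f_loop(1e6, "even", "odd")
#> [1] "even"
f_loop(7, "even", "odd")
#> [1] "odd"
```

The plain recursive version in current R would typically exceed the nesting limit long before reaching a depth of 500,000 calls.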

sebffischer commented 6 months ago

Lazy evaluation can also cause some undesirable behavior when the promises depend on global state that changes between promise creation and evaluation:

f = function(x) {
  force(x)
  set.seed(1)
  x + runif(1)
}

g = function(x) {
  set.seed(1)
  x + runif(1)
}

set.seed(2)
f(rnorm(1))
#> [1] -0.6314059
set.seed(2)
g(rnorm(1))
#> [1] -0.05360045

Created on 2024-03-14 with reprex v2.0.2

sebffischer commented 6 months ago

This can even cause code after a set.seed() to be nondeterministic, because evaluating the promise modifies .Random.seed after the seed has been set:

f = function(x) {
  set.seed(1)
  x
  runif(1)
}

f(rnorm(1))
#> [1] 0.5728534
f(rnorm(2))
#> [1] 0.2016819

Created on 2024-03-14 with reprex v2.0.2
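For contrast, a minimal sketch of the same function with the argument forced up front, which is roughly what eager-by-default evaluation would give for free (`f_eager` is a hypothetical name):

```r
f_eager <- function(x) {
  force(x)     # evaluate the promise before touching the seed
  set.seed(1)
  runif(1)
}

# The result no longer depends on how many random draws the argument makes.
identical(f_eager(rnorm(1)), f_eager(rnorm(2)))
#> [1] TRUE
```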
