Use named with dotty - GitHub Issues

JosiahParry commented 2 months ago

I would really love to be able to access an object's names while destructuring. I was hoping that by using names(value) I could accomplish this due to the assignment method from dotty, but it isn't yet possible.

Here is a repro use case:

library(dotty)

env_vars <- list(
  "api_key" = "fbivyb137294hgwv",
  "hello" = "world",
  "current_user" = "josiah@parry.com"
)
for (i in seq_along(env_vars)) {
  .[k = names(value), v] <- env_vars[i]
}

In an ideal world it could look something like:

for (.[k, v] in env_vars) {
   ...
}
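
For comparison, a minimal base R sketch of the same key/value iteration without destructuring. It uses only base R and the env_vars list from the repro; the `seen` vector is a name introduced here purely for illustration.

```r
env_vars <- list(
  "api_key" = "fbivyb137294hgwv",
  "hello" = "world",
  "current_user" = "josiah@parry.com"
)

# Collect "key=value" strings so the effect of the loop is visible.
seen <- character(0)

for (i in seq_along(env_vars)) {
  k <- names(env_vars)[[i]]  # the key: the element's name
  v <- env_vars[[i]]         # the value: the element itself
  seen <- c(seen, paste0(k, "=", v))
}

seen
```

This is the boilerplate that a destructuring form such as `.[k, v]` would replace.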

kevinushey commented 2 months ago

I think the best solution in these scenarios is a helper like enumerate:

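enumerate <- function(x, f, ..., FUN.VALUE = NULL) {

  # Convert environments to lists first, so that names and indices
  # are taken from the same object.
  if (is.environment(x))
    x <- as.list(x, all.names = TRUE)

  n <- names(x)
  idx <- `names<-`(seq_along(x), n)
  callback <- function(i) f(n[[i]], x[[i]], ...)

  # Use vapply() when a result template is supplied, lapply() otherwise.
  if (is.null(FUN.VALUE))
    lapply(idx, callback)
  else
    vapply(idx, callback, FUN.VALUE = FUN.VALUE)

}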

The main downside is that you lose the control flow options (break, next) normally available within a for loop.
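
A short usage sketch of the helper, assuming the definition given in this comment (repeated in condensed form so the block runs on its own). The callback and the `pairs` variable are illustrative names, not part of dotty.

```r
# Condensed copy of the enumerate() helper from this comment.
enumerate <- function(x, f, ..., FUN.VALUE = NULL) {
  if (is.environment(x))
    x <- as.list(x, all.names = TRUE)
  n <- names(x)
  idx <- `names<-`(seq_along(x), n)
  callback <- function(i) f(n[[i]], x[[i]], ...)
  if (is.null(FUN.VALUE))
    lapply(idx, callback)
  else
    vapply(idx, callback, FUN.VALUE = FUN.VALUE)
}

env_vars <- list(api_key = "fbivyb137294hgwv", hello = "world")

# The callback receives the key first and the value second.
pairs <- enumerate(
  env_vars,
  function(k, v) paste0(k, "=", v),
  FUN.VALUE = character(1)
)

pairs
```

Because the body runs inside a callback rather than a loop, there is no way to stop early from within it, which is the limitation noted above.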

What would you think of an alternate syntax like the following?

dotty_enum(env_vars, .[key, value] <- {
  # do something; behavior like a for loop
})

The syntax is a little magical looking, but it's relatively lightweight, and the magic helps make it clear that this isn't a "regular" function invocation. This seems to work as expected:

env_vars <- list(
  "api_key" = "fbivyb137294hgwv",
  "hello" = "world",
  "current_user" = "josiah@parry.com"
)

dotty_enum <- function(values, dotty) {

  # Make sure we received a dotty assignment call.
  dotty <- substitute(dotty)
  ok <-
    identical(dotty[[1L]], as.symbol("<-")) &&
    is.call(dotty[[2L]]) &&
    identical(dotty[[2L]][[1L]], as.symbol("[")) &&
    identical(dotty[[2L]][[2L]], as.symbol("."))

  if (!ok)
    stop("expected a dotty expression of the form `.[key, value] <- {}`")

  # Pull out the dotty variables and dotty expression.
  keyvar <- as.character(dotty[[2L]][[3L]])
  valvar <- as.character(dotty[[2L]][[4L]])
  expr <- dotty[[3L]]

  # Create an expression that we can evaluate in the parent frame.
  # Since we're creating variables, we'll need to be careful about
  # cleanup. We do this so that control flow constructs work as expected.
  data <- list(
    ..values.. = substitute(values),
    ..keyvar.. = keyvar,
    ..valvar.. = valvar,
    ..expr..   = expr
  )

  expr <- substitute(env = data, {

    for (..i.. in seq_along(..values..)) {
      assign(..keyvar.., names(..values..)[[..i..]])
      assign(..valvar.., ..values..[[..i..]])
      evalq(..expr..)
    }

  })

  eval(expr, envir = parent.frame())

}

dotty_enum(env_vars, .[key, value] <- {
  str(list(key, value))
  if (key == "hello")
    break
})

That gives the expected output:

> dotty_enum(env_vars, .[key, value] <- {
+   str(list(key, value))
+   if (key == "hello")
+     break
+ })
List of 2
 $ : chr "api_key"
 $ : chr "fbivyb137294hgwv"
List of 2
 $ : chr "hello"
 $ : chr "world"

kevinushey commented 2 months ago

Still, the question is whether this is more useful than purrr::imap -- the main benefit with this approach is the for-loop semantics + ability to work directly in the current frame?

JosiahParry commented 2 months ago

This is better than nothing, I agree. For one, purrr is a rather heavy dependency whereas dotty is a tiny base R dep.

There are also some cases where I feel that the ergonomics of a for loop are just better, particularly when running code for its side effects.

pkg_size_recursive("purrr")
#>          pkg    size
#> 1      total  15.27M
#> 2      vctrs    3.9M
#> 3      rlang    2.5M
#> 4        cli    2.4M
#> 5      utils    2.3M
#> 6    methods    2.1M
#> 7      purrr    836K
#> 8   magrittr    536K
#> 9  lifecycle    376K
#> 10      glue    368K
pkg_size_recursive("dotty")
#>     pkg size
#> 1 dotty  84K
#> 2 total  84K

Here is the use case that really motivated this line of enquiry:

env_vars <- list(
  "api_key" = "fbivyb137294hgwv",
  "hello" = "world",
  "current_user" = "josiah@parry.com"
)

n <- length(env_vars)
encoded_keys <- character(n)
encoded_vals <- vector("list", n)

keys <- names(env_vars)
vals <- unlist(env_vars)

for (i in seq_len(n)) {
  k <- keys[i]
  v <- vals[i]
  # note that I do rsa encryption here in my actual use case 
  encoded_keys[i] <- b64::encode(k)
  encoded_vals[[i]] <- b64::encode(v)
}

# to put in REST request
setNames(encoded_vals, encoded_keys)

Similar code in Rust feels much nicer here. This is partly because in R we can preallocate a list, but assigning by name into a preallocated list creates a new entry instead of filling one of the existing slots.

use base64::prelude::*;

fn main() {
    let env_vars = [
        ("api_key", "fbivyb137294hgwv"),
        ("hello", "world"),
        ("current_user", "josiah@parry.com"),
    ];

    let mut encrypted = Vec::with_capacity(env_vars.len());

    for (k, v) in env_vars {
        let ke = BASE64_STANDARD.encode(k);
        let ve = BASE64_STANDARD.encode(v);
        encrypted.push((ke, ve))
    }
}
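
The preallocation behaviour described above can be seen in a small base R sketch. The variable names `out` and `filled` are illustrative only.

```r
# Preallocate three unnamed slots.
out <- vector("list", 3)

# Assigning by a name that does not exist yet appends a fourth element
# rather than filling one of the three empty slots.
out[["api_key"]] <- "abc"
length(out)
names(out)

# Assigning by position fills the preallocated slot, so the length is unchanged.
filled <- vector("list", 3)
filled[[1]] <- "abc"
length(filled)
```

This is why the R version above tracks keys and values in two separate preallocated vectors and only combines them with setNames() at the end.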

kevinushey commented 2 months ago

What do you think of the proposed syntax?

.enum(object, .(key, val) -> {
  # do stuff with key, val; for-loop semantics apply
})

JosiahParry commented 2 months ago

I quite like it! I'm curious, though: how could it also handle an enumeration with both keys and an index?

For example, in Rust (sorry, my second most comfortable language) I might write for (i, (k, v)) in x.into_iter().enumerate()

kevinushey commented 2 months ago

We could allow for an optional third parameter, e.g.

.enum(object, .(key, value, idx) -> { ... })

I'd want to make it the third parameter here just because I think key + value would almost always be useful, but idx may or may not be.

JosiahParry commented 2 months ago

Absolutely. And would this behave similarly to imap / iwalk where, if there are no names, the element index is provided as the key?

kevinushey commented 2 months ago

Yeah, I think that's sensible.
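
A minimal sketch of the fallback agreed on here, written in plain base R. The `enum_keys` helper is hypothetical and is not part of dotty; it only illustrates "use the names if present, otherwise the element index", with the key, value, and index all available in an ordinary for loop.

```r
# Hypothetical helper: names if the object has any, otherwise positions.
enum_keys <- function(x) {
  keys <- names(x)
  if (is.null(keys)) seq_along(x) else keys
}

named   <- list(first = "a", second = "b")
unnamed <- list("a", "b")

enum_keys(named)    # the names
enum_keys(unnamed)  # the element indices

# Key, value, and index together, with regular for-loop semantics.
keys <- enum_keys(unnamed)
for (idx in seq_along(unnamed)) {
  key <- keys[[idx]]
  value <- unnamed[[idx]]
  if (value == "b")
    break
}
```

This sketch treats a partially named object (some names equal to "") as named; whether .enum should do the same is left open in the discussion.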

kevinushey / dotty

Use named with dotty #2