Open JosiahParry opened 2 months ago
I think the best solution in these scenarios is a helper like enumerate:
enumerate <- function(x, f, ..., FUN.VALUE = NULL) {
  # Convert environments first so names and values come from the same list.
  if (is.environment(x))
    x <- as.list(x, all.names = TRUE)
  n <- names(x)
  idx <- `names<-`(seq_along(x), n)
  callback <- function(i) f(n[[i]], x[[i]], ...)
  if (is.null(FUN.VALUE))
    lapply(idx, callback)
  else
    vapply(idx, callback, FUN.VALUE = FUN.VALUE)
}
The main downside is that you lose out on the control flow options normally available within a for loop.
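To make that trade-off concrete, here is a small usage sketch. It repeats a version of the helper so the snippet runs on its own; the `settings` list and the callback are made up for illustration.

```r
# A version of the helper from above, repeated so this snippet is self-contained.
enumerate <- function(x, f, ..., FUN.VALUE = NULL) {
  if (is.environment(x))
    x <- as.list(x, all.names = TRUE)
  n <- names(x)
  idx <- `names<-`(seq_along(x), n)
  callback <- function(i) f(n[[i]], x[[i]], ...)
  if (is.null(FUN.VALUE))
    lapply(idx, callback)
  else
    vapply(idx, callback, FUN.VALUE = FUN.VALUE)
}

# Illustrative input (not taken from a real configuration).
settings <- list(host = "localhost", port = "8080")

# The callback receives the name first, then the value.
pairs <- enumerate(
  settings,
  function(key, value) paste0(key, "=", value),
  FUN.VALUE = character(1)
)

# `pairs` is a named character vector with one element per entry.
print(pairs)

# Because the callback is an ordinary function rather than a loop body,
# `break` and `next` cannot be used inside it to stop or skip the iteration.
```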
What would you think of an alternate syntax like the following?
dotty_enum(env_vars, .[key, value] <- {
  # do something; behavior like a for loop
})
The syntax is a little magical looking, but it's relatively lightweight, and the magic helps make it clear that this isn't a "regular" function invocation. This seems to work as expected:
env_vars <- list(
  "api_key" = "fbivyb137294hgwv",
  "hello" = "world",
  "current_user" = "josiah@parry.com"
)

dotty_enum <- function(values, dotty) {

  # Make sure we received a dotty assignment call.
  dotty <- substitute(dotty)
  ok <-
    identical(dotty[[1L]], as.symbol("<-")) &&
    is.call(dotty[[2L]]) &&
    identical(dotty[[2L]][[1L]], as.symbol("[")) &&
    identical(dotty[[2L]][[2L]], as.symbol("."))

  if (!ok)
    stop("expected a dotty expression of the form `.[key, value] <- {}`")

  # Pull out the dotty variables and dotty expression.
  keyvar <- as.character(dotty[[2L]][[3L]])
  valvar <- as.character(dotty[[2L]][[4L]])
  expr <- dotty[[3L]]

  # Create an expression that we can evaluate in the parent frame.
  # Since we're creating variables, we'll need to be careful about
  # cleanup. We do this so that control flow constructs work as expected.
  data <- list(
    ..values.. = substitute(values),
    ..keyvar.. = keyvar,
    ..valvar.. = valvar,
    ..expr.. = expr
  )

  expr <- substitute(env = data, {
    for (..i.. in seq_along(..values..)) {
      assign(..keyvar.., names(..values..)[[..i..]])
      assign(..valvar.., ..values..[[..i..]])
      evalq(..expr..)
    }
  })

  eval(expr, envir = parent.frame())

}

dotty_enum(env_vars, .[key, value] <- {
  str(list(key, value))
  if (key == "hello")
    break
})
That gives the expected:
> dotty_enum(env_vars, .[key, value] <- {
+   str(list(key, value))
+   if (key == "hello")
+     break
+ })
List of 2
 $ : chr "api_key"
 $ : chr "fbivyb137294hgwv"
List of 2
 $ : chr "hello"
 $ : chr "world"
Still, the question is whether this is more useful than purrr::imap -- the main benefit with this approach is the for-loop semantics + ability to work directly in the current frame?
This is better than nothing, I agree. For one, purrr is a rather heavy dependency, whereas dotty is tiny and depends only on base R.
There are also some cases where I feel that the ergonomics of a for loop are just better, particularly when running code for its side effects.
pkg_size_recursive("purrr")
#> pkg size
#> 1 total 15.27M
#> 2 vctrs 3.9M
#> 3 rlang 2.5M
#> 4 cli 2.4M
#> 5 utils 2.3M
#> 6 methods 2.1M
#> 7 purrr 836K
#> 8 magrittr 536K
#> 9 lifecycle 376K
#> 10 glue 368K
pkg_size_recursive("dotty")
#> pkg size
#> 1 dotty 84K
#> 2 total 84K
Here is the use case that really motivated this line of enquiry:
env_vars <- list(
  "api_key" = "fbivyb137294hgwv",
  "hello" = "world",
  "current_user" = "josiah@parry.com"
)

n <- length(env_vars)
encoded_keys <- character(n)
encoded_vals <- vector("list", n)
keys <- names(env_vars)
vals <- unlist(env_vars)

# seq_len() avoids iterating over c(1, 0) when the list is empty
for (i in seq_len(n)) {
  k <- keys[i]
  v <- vals[i]
  # note that I do rsa encryption here in my actual use case
  encoded_keys[i] <- b64::encode(k)
  encoded_vals[[i]] <- b64::encode(v)
}

# to put in REST request
setNames(encoded_vals, encoded_keys)
Similar code in Rust feels much nicer here. This is partly because in R we can preallocate a list, but assigning by name into a preallocated list creates a new entry instead of filling one of the existing slots.
use base64::prelude::*;

fn main() {
    let env_vars = [
        ("api_key", "fbivyb137294hgwv"),
        ("hello", "world"),
        ("current_user", "josiah@parry.com"),
    ];

    let mut encrypted = Vec::with_capacity(env_vars.len());

    for (k, v) in env_vars {
        let ke = BASE64_STANDARD.encode(k);
        let ve = BASE64_STANDARD.encode(v);
        encrypted.push((ke, ve));
    }
}
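For comparison, the R behaviour mentioned before the Rust snippet (assigning by name into a preallocated list) can be seen in a short base R sketch. The values are placeholders, not real encoded output.

```r
# Preallocate an unnamed list with three empty slots.
out <- vector("list", 3)

# Assigning by a name that does not exist yet appends a new element
# instead of filling one of the preallocated slots.
out[["api_key"]] <- "encoded-value"
print(length(out))   # 4

# Filling by position keeps the length fixed; names are attached afterwards,
# which is why the loop above tracks keys and values separately.
out2 <- vector("list", 3)
out2[[1]] <- "encoded-value"
names(out2) <- c("api_key", "hello", "current_user")
print(length(out2))  # 3
```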
What do you think of the proposed syntax?
.enum(object, .(key, val) -> {
  # do stuff with key, val; for-loop semantics apply
})
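One detail that matters for any implementation of this syntax: R's parser stores `a -> b` as `b <- a`, so the braces end up on the left-hand side of the captured call. The sketch below only inspects a quoted expression; it does not show how dotty itself would implement `.enum`.

```r
# Hypothetical spec, captured unevaluated.
spec <- quote(.(key, val) -> { print(key) })

# The parser stores `a -> b` as `b <- a`.
is_assign <- identical(spec[[1L]], as.symbol("<-"))

# So the braced body is the second element and `.(key, val)` is the third.
body <- spec[[2L]]
vars <- vapply(as.list(spec[[3L]])[-1L], as.character, character(1))

print(is_assign)
print(vars)
```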
I quite like it! I'm curious though, how it could also handle an enumeration with keys and index?
For example, in Rust (sorry, my second most comfortable language) I might write for (i, (k, v)) in x.into_iter().enumerate().
We could allow for an optional third parameter, e.g.
.enum(object, .(key, value, idx) -> { ... })
I'd want to make it the third parameter here just because I think key + value would almost always be useful, but idx may or may not be.
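A rough sketch of how an optional third variable could work is below. The name `enum_sketch` and the internal `..i..` counter are hypothetical, and this is not dotty's implementation. It builds a plain for loop with `bquote()` and evaluates it in the caller's frame, so `break` and `next` behave as usual. Like any loop, it leaves the loop variables behind in that frame.

```r
# Hypothetical helper; `spec` is written as `.(key, value[, idx]) -> { body }`.
enum_sketch <- function(object, spec) {
  spec <- substitute(spec)
  # `a -> b` is parsed as `b <- a`, so the body is element 2, the variables element 3.
  stopifnot(identical(spec[[1L]], as.symbol("<-")))
  body <- spec[[2L]]
  vars <- as.list(spec[[3L]])[-1L]
  stopifnot(length(vars) %in% 2:3)
  obj <- substitute(object)

  # Only bind the index when a third variable was requested (NULL otherwise).
  bind_idx <- if (length(vars) == 3L) bquote(.(vars[[3L]]) <- ..i..)

  loop <- bquote(
    for (..i.. in seq_along(.(obj))) {
      .(vars[[1L]]) <- names(.(obj))[[..i..]]
      .(vars[[2L]]) <- .(obj)[[..i..]]
      .(bind_idx)
      .(body)
    }
  )
  eval(loop, parent.frame())
}

seen <- character()
enum_sketch(list(a = 1, b = 2, c = 3), .(k, v, i) -> {
  if (i == 3L) break
  seen <- c(seen, paste0(i, ":", k, "=", v))
})
print(seen)
```

Keeping idx last means the two-variable form stays the common case and the index is opt-in.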
Absolutely. And would this behave similarly to imap / iwalk, where if there are no names the element index is provided as the key?
Yeah, I think that's sensible.
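That fallback is small enough to sketch in base R. The helper name `enum_keys` is hypothetical. It mirrors the documented imap rule of using names(x) when x has names and seq_along(x) otherwise, and it does not handle partially named inputs.

```r
# Hypothetical helper: use names as keys when present, positions otherwise.
enum_keys <- function(x) {
  nms <- names(x)
  if (is.null(nms)) seq_along(x) else nms
}

named_keys <- enum_keys(list(a = 1, b = 2))
unnamed_keys <- enum_keys(list(10, 20, 30))

print(named_keys)
print(unnamed_keys)
```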
I would really love to be able to access an object's names while destructuring. I was hoping that using names(value) would accomplish this via dotty's assignment method, but it isn't yet possible. Here is a repro use case:
In an ideal world it could look something like