r-lib / vctrs

Generic programming with typed R vectors
https://vctrs.r-lib.org
Other
287 stars 66 forks source link

Review `vec_is_list()` check on `vec_proxy()` #1208

Open DavisVaughan opened 4 years ago

DavisVaughan commented 4 years ago

@lionel- and I discussed the concept of a list-rcrd in more detail. Essentially this is a list that has some vectorized attributes that can get sliced alongside the list. We determined that one possible implementation of this idea is (in my own words):

This would currently work in vctrs, except for the fact that vec_is_list() has the following check: If x inherits explicitly from "list", then vec_proxy(x) must also be a list. This should currently fail here (although I think it doesn't because of a bug).

I don't remember exactly why we have that restriction, but it seems reasonable that we might try to remove it to allow these list-rcrds to work.

One possible implementation follows, but as @lionel- mentioned to me, the important part is that this doesn't require any vctrs specific helper, and can live completely outside vctrs.

library(vctrs)

new_list_rcrd <- function(x, vectorized, ..., class = character()) {
  structure(
    x, 
    `vctrs:::vectorized` = vectorized, 
    ..., 
    class = c(class, "vctrs_list_rcrd", "list")
  )
}

vec_proxy.vctrs_list_rcrd <- function(x, ...) {
  vectorized <- attr(x, "vctrs:::vectorized")
  x <- unclass(x)
  cols <- list(`vctrs:::data` = x)
  cols <- c(cols, vectorized)
  new_data_frame(cols)
}

vec_restore.vctrs_list_rcrd <- function(x, to, ...) {
  data <- x[["vctrs:::data"]]
  x[["vctrs:::data"]] <- NULL
  new_list_rcrd(data, x)
}

x <- new_list_rcrd(
  x = list(a = 1:3, b = 2:5), 
  vectorized = list(
    special1 = c(TRUE, FALSE),
    special2 = c("x", "y")
  )
)

x
#> $a
#> [1] 1 2 3
#> 
#> $b
#> [1] 2 3 4 5
#> 
#> attr(,"vctrs:::vectorized")
#> attr(,"vctrs:::vectorized")$special1
#> [1]  TRUE FALSE
#> 
#> attr(,"vctrs:::vectorized")$special2
#> [1] "x" "y"
#> 
#> attr(,"class")
#> [1] "vctrs_list_rcrd" "list"

vec_slice(x, c(1, 1, 2))
#> $a
#> [1] 1 2 3
#> 
#> $a
#> [1] 1 2 3
#> 
#> $b
#> [1] 2 3 4 5
#> 
#> attr(,"vctrs:::vectorized")
#> attr(,"vctrs:::vectorized")$special1
#> [1]  TRUE  TRUE FALSE
#> 
#> attr(,"vctrs:::vectorized")$special2
#> [1] "x" "x" "y"
#> 
#> attr(,"class")
#> [1] "vctrs_list_rcrd" "list"

Created on 2020-08-05 by the reprex package (v0.3.0)

lionel- commented 4 years ago

Probably best to use fields instead of vectorised so it's consistent with new_rcrd().

lionel- commented 4 years ago

Summary of the main ideas:

lionel- commented 4 years ago

Regarding how to implement [[<- and $<- operations in vctrs we could do it in the following way.

For x[[1]] <- y with x a list, we just do vec_slice(x, 1) <- list(y):

It's going to be inefficient to update lists elements by elements, so it's always preferable to vectorise the assignment.

lionel- commented 4 years ago

Regarding a helper class for record lists and record data frames, I think vctrs::new_rcrd() should gain a data = NULL argument. When supplied, the fields would be stored in an attribute.

lionel- commented 4 years ago

1226 adds a list record helper class for unit tests.

1228 implements [[ and [[<- as outlined in https://github.com/r-lib/vctrs/issues/1208#issuecomment-669367413.