tidyverse / funs

Collection of low-level functions for working with vctrs

Atomic constructors #45

Closed DavisVaughan closed 3 years ago

DavisVaughan commented 4 years ago

I'm not sure if {funs} is the right place for them, but it seems like the vec() and dbl() constructors, at the very least, could live here. I'm less sure about flatten_vec() and as_double().

I think we are all mainly on the same page about what dbl() should do, but I wanted to outline implementations for it, and how it would connect to map(). Essentially:

map_dbl() == as_double(map())

flat_map_dbl() == dbl(map())

I had implemented a rough draft of a new flatten() here, but I've since realized it is essentially rlang::flatten() in 99% of the cases, so I've used that below instead.

The semantics of dbl() here seem to be exactly the same as with rlang::dbl(), but it goes through vctrs.

Two issues need to be fixed first, one in {rlang} and one in {vctrs}. I've added them at the end.

library(purrr)
library(vctrs)
library(rlang, warn.conflicts = FALSE)

as_vector <- function(x, ptype) {
  vec_cast(x, ptype)
}

as_double <- function(x) {
  as_vector(x, double())
}

flatten_vec <- function(x, ptype = NULL) {
  # rlang is attached after purrr, so this resolves to rlang::flatten()
  x <- flatten(x)
  vec_c(!!! x, .ptype = ptype)
}

vec <- function(..., .ptype = NULL) {
  x <- list2(...)
  flatten_vec(x, .ptype)
}

dbl <- function(...) {
  vec(..., .ptype = double())
}

as_double(c(1L, 2L))
#> [1] 1 2

as_double(list(1, 2, 3))
#> [1] 1 2 3

as_double(list(1:2, 3))
#> Error: Lossy cast from <list> to <double>.
#> * Locations: 1

dbl(1:2, 3)
#> [1] 1 2 3

dbl(list(1, 2, 3))
#> [1] 1 2 3

dbl(list(1:2, 3))
#> [1] 1 2 3

# map_dbl() is map() + as_double()
as_double(map(1:5, ~.x))
#> [1] 1 2 3 4 5

# it is strict, elements must be size 1
as_double(map(1:5, ~c(.x, .x)))
#> Error: Lossy cast from <list> to <double>.
#> * Locations: 1, 2, 3, 4, 5

# flat_map_dbl() is map() + dbl()
# it is less strict about the element size constraint
dbl(map(1:5, ~c(.x, .x)))
#>  [1] 1 1 2 2 3 3 4 4 5 5

# This will be disallowed by:
# https://github.com/r-lib/rlang/issues/885
flatten_vec(data.frame(x = 1), integer())
#> x 
#> 1

# This will be disallowed by:
# https://github.com/r-lib/vctrs/issues/738
# We only want 1 layer of list auto-splicing
dbl(1, list(list(1)))
#> [1] 1 1

Created on 2020-01-09 by the reprex package (v0.3.0.9000)

lionel- commented 4 years ago

One difference between dbl() and as_double() is that the former should use restricted coercion via the .ptype argument, and the latter should use unrestricted conversions, possibly via a new vec_force() generic.
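
To make the distinction concrete, here is a minimal sketch that contrasts restricted coercion through `.ptype` with an unrestricted conversion. Base `as.double()` and `as.integer()` stand in for the unrestricted side, because `vec_force()` is only proposed in this thread and does not exist; the variable names are illustrative.

```r
library(vctrs)

# Restricted: integer and double inputs share a common type with the
# requested prototype, so this succeeds.
restricted_ok <- vec_c(1L, 2.5, .ptype = double())

# Restricted: a character input has no cast to double in vctrs, so it is
# rejected rather than parsed.
restricted_chr <- tryCatch(
  vec_c("1", .ptype = double()),
  error = function(e) "error"
)

# Restricted: a fractional double cannot be cast to integer without loss.
restricted_lossy <- tryCatch(
  vec_c(1.5, .ptype = integer()),
  error = function(e) "error"
)

# Unrestricted (base R stand-in): both conversions go through,
# parsing the string and truncating the fraction.
forced_chr <- as.double("1")
forced_lossy <- as.integer(1.5)
```

Under this reading, dbl() would behave like the restricted calls and as_double() like the base ones.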

lionel- commented 4 years ago

It's unclear whether we really need as_double() and vec_force() instead of relying on single dispatch through as.double().
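
For reference, the single-dispatch alternative could look like the sketch below. The `percent` class and its constructor are hypothetical and exist only for illustration; the one thing relied on is that base `as.double()` is an internal generic, so defining an S3 method is enough to customise the conversion.

```r
# Hypothetical S3 class, used only to illustrate dispatch
new_percent <- function(x) {
  structure(x, class = "percent")
}

# S3 method for the base generic: convert a percentage to a proportion
as.double.percent <- function(x, ...) {
  unclass(x) / 100
}

half <- as.double(new_percent(50))
```

The trade-off is that such a method dispatches on the input only, whereas a vctrs-style cast also takes the target type into account.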

lionel- commented 4 years ago

I think we need a new flatten() with the same semantics as in rlang, but generic over S3 lists (is_list(x) && vec_is(x)) and with a name-spec argument.
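
A rough sketch of what such a function might look like, not an actual implementation from rlang or vctrs: `flatten_once()` and its `name_spec` argument are hypothetical names, and `vec_is_list()` is used as a stand-in for the predicate above (newer vctrs releases call it `obj_is_list()`). It removes exactly one layer of list nesting and forwards the name spec to `vec_c()`.

```r
library(vctrs)

# Hypothetical helper: flatten one level of a list, vctrs-style
flatten_once <- function(x, name_spec = NULL) {
  stopifnot(vec_is_list(x))

  # Wrap non-list elements so that every piece is itself a list;
  # list elements contribute their own elements, one level only.
  pieces <- lapply(x, function(el) {
    if (vec_is_list(el)) el else list(el)
  })

  # Concatenate the pieces as lists; .name_spec controls how outer
  # and inner names are merged.
  vec_c(!!!pieces, .ptype = list(), .name_spec = name_spec)
}

# Only one layer is removed: the inner list(4) is kept as an element
flat <- flatten_once(list(1, list(2, 3), list(list(4))))

# With outer names, a name spec describes how to combine them
named <- flatten_once(
  list(a = 1, b = list(2, 3)),
  name_spec = "{outer}_{inner}"
)
```

Because `vec_is_list()` returns FALSE for data frames, a data frame element is kept whole rather than split into its columns.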

hadley commented 3 years ago

To me, this feels out of scope for funs, since it doesn't seem like something you'd commonly use during a data analysis.