Closed mmuurr closed 3 years ago
I don't have any feedback on the underlying problem but I noticed that the types are
<data.frame
<bar: tbl_df<
bar1: double
bar2: double
>>
>
vs
<tbl_df<
bar1: double
bar2: double
>>
The correct way to specify this is to put your ptype in an extra `tibble(bar = ...)`. See below:
```r
library(tidyr)

x <- tibble::tibble(
  foo = "foo",
  bar = list(
    tibble::tibble(bar1 = as.double(1:3), bar2 = as.double(3:1))
  )
)

unnest(x, bar)
#> # A tibble: 3 x 3
#>   foo    bar1  bar2
#>   <chr> <dbl> <dbl>
#> 1 foo       1     3
#> 2 foo       2     2
#> 3 foo       3     1

unnest(x, bar, ptype = tibble(bar = tibble(bar1 = double(0), bar2 = double(0))))
#> # A tibble: 3 x 3
#>   foo    bar1  bar2
#>   <chr> <dbl> <dbl>
#> 1 foo       1     3
#> 2 foo       2     2
#> 3 foo       3     1
```

Created on 2020-12-08 by the reprex package (v0.3.0)
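To avoid writing the inner prototype by hand, it can probably be derived from an existing element of the list-column. The sketch below assumes `vctrs::vec_ptype()` is suitable for this; the names `inner` and `wrapped` are only illustrative. It builds the wrapped prototype in the shape suggested above and stops there, because this thread only shows that shape working with tidyr 1.1.0.

```r
library(tibble)
library(vctrs)

x <- tibble(
  foo = "foo",
  bar = list(
    tibble(bar1 = as.double(1:3), bar2 = as.double(3:1))
  )
)

# Zero-row prototype of the first element of the list-column.
inner <- vec_ptype(x$bar[[1]])

# Wrap it in an outer tibble named after the column being unnested,
# matching the `tibble(bar = ...)` shape used in the reprex above.
wrapped <- tibble(bar = inner)
```

`wrapped` could then be passed as `ptype` in the `unnest()` call from the reprex.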
The extra `tibble()` is quite confusing. On the other hand, one could argue it is useful when unnesting multiple columns that need `names_repair`:
```r
library(tidyr)

df <- tibble(
  y = list(
    tibble(a = 1),
    tibble(a = 2)
  ),
  z = list(
    tibble(a = "a"),
    tibble(a = "b")
  )
)

df %>% unnest(c(y, z), names_repair = "unique")
#> New names:
#> * a -> a...1
#> * a -> a...2
#> # A tibble: 2 x 2
#>   a...1 a...2
#>   <dbl> <chr>
#> 1     1 a
#> 2     2 b

df %>% unnest(
  c(y, z),
  names_repair = "unique",
  ptype = tibble(
    y = tibble(a = integer()),
    z = tibble(a = character())
  )
)
#> New names:
#> * a -> a...1
#> * a -> a...2
#> # A tibble: 2 x 2
#>   a...1 a...2
#>   <int> <chr>
#> 1     1 a
#> 2     2 b
```

Created on 2020-12-08 by the reprex package (v0.3.0)
Or, more generally, that one can specify via `ptype` which list-column each unnested column should come from.

Note that one has to specify all the columns that should occur when unnesting. I am not sure whether this is intended or not.
```r
library(tidyr)

df <- tibble(
  y = list(
    tibble(a = 1, b = 1),
    tibble(a = 2)
  )
)

df %>% unnest(
  y,
  names_repair = "unique",
  ptype = tibble(
    y = tibble(a = integer())
  )
)
#> Error: Can't convert from <tbl_df<
#>   a: double
#>   b: double
#> >> to <tbl_df<a:integer>> due to loss of precision.
#> Dropped variables: `b`
```

Created on 2020-12-08 by the reprex package (v0.3.0)
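The error above looks like a plain vctrs cast failure, so it can presumably be reproduced without tidyr. The sketch below assumes `unnest()` ends up casting each element to the supplied prototype with something like `vctrs::vec_cast()`. That is an inference from the error message, not something confirmed in this thread. The names `full`, `res`, and `ok` are only illustrative.

```r
library(tibble)
library(vctrs)

full <- tibble(a = 1, b = 1)

# Casting to a prototype that lacks `b` would drop a column,
# which vctrs treats as a lossy cast and reports as an error.
res <- tryCatch(
  vec_cast(full, tibble(a = integer())),
  error = function(e) e
)
inherits(res, "error")

# Listing every column in the prototype lets the cast go through;
# the doubles are whole numbers, so converting to integer is lossless.
ok <- vec_cast(full, tibble(a = integer(), b = integer()))
```

If that assumption holds, the "loss of precision" wording refers to the dropped column `b` rather than to the double-to-integer conversion of `a`.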
I don't know if this is an issue with tidyr or with vctrs, or simply an issue with not using the `ptype` argument correctly in various rectangling operations. But it appears from the error message below that a cast from `double` to `double` is failing due to precision loss, which is confusing to me. Either that, or the lossiness error is due to the containing data frame/tibble, but in the example below I've tried to construct the `ptype` argument to match the `x$bar` object as closely as possible.

Session info
```R
sessionInfo()
#> R version 3.6.3 (2020-02-29)
#> Platform: x86_64-apple-darwin19.4.0 (64-bit)
#> Running under: macOS Catalina 10.15.3
#>
#> Matrix products: default
#> BLAS/LAPACK: /usr/local/Cellar/openblas/0.3.10_1/lib/libopenblasp-r0.3.10.dylib
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> attached base packages:
#> [1] stats     graphics  grDevices datasets  utils     methods   base
#>
#> other attached packages:
#> [1] vctrs_0.3.1 tidyr_1.1.0
#>
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_1.0.5       knitr_1.29       magrittr_1.5     tidyselect_1.1.0
#>  [5] R6_2.4.1         rlang_0.4.7      fansi_0.4.1      stringr_1.4.0
#>  [9] highr_0.8        dplyr_1.0.0      tools_3.6.3      xfun_0.15
#> [13] utf8_1.1.4       cli_2.0.2        htmltools_0.5.0  ellipsis_0.3.1
#> [17] assertthat_0.2.1 yaml_2.2.1       digest_0.6.25    tibble_3.0.3
#> [21] lifecycle_0.2.0  crayon_1.3.4     purrr_0.3.4      glue_1.4.1
#> [25] evaluate_0.14    rmarkdown_2.3    stringi_1.4.6    compiler_3.6.3
#> [29] pillar_1.4.6     generics_0.0.2   renv_0.11.0      pkgconfig_2.0.3
```