leungi closed this issue 6 years ago
This is a special corner case caused by how row_number() is evaluated. You get an error because it is evaluated as if it belonged to the outer mutate() applied to the nested data frame, which is indeed 3 rows long.
Either this is on purpose, or something is off in tidy evaluation and scoping.
Either way, there are currently workarounds to achieve what you want.
library(tidyverse)
# One row per cyl group; the data column holds each group's rows
nest_data <- mtcars %>%
  group_by(cyl) %>%
  nest()
Using a named, user-defined function works as expected.
add_row_number <- function(x) {x %>% mutate(row_id = row_number())}
nest_data %>%
mutate(data = map(data, add_row_number)) %>%
{.[[1,2]]}
#> # A tibble: 7 x 11
#> mpg disp hp drat wt qsec vs am gear carb row_id
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
#> 1 21 160 110 3.9 2.62 16.5 0 1 4 4 1
#> 2 21 160 110 3.9 2.88 17.0 0 1 4 4 2
#> 3 21.4 258 110 3.08 3.22 19.4 1 0 3 1 3
#> 4 18.1 225 105 2.76 3.46 20.2 1 0 3 1 4
#> 5 19.2 168. 123 3.92 3.44 18.3 1 0 4 4 5
#> 6 17.8 168. 123 3.92 3.44 18.9 1 0 4 4 6
#> 7 19.7 145 175 3.62 2.77 15.5 0 1 5 6 7
In your example you defined an anonymous function, which is different: tidyeval deals with the whole anonymous function call, whereas here it deals only with the name of the defined function and then evaluates it, so row_number() is evaluated in the correct context.
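To make the two contexts concrete, here is a minimal sketch (not from the original reprex) comparing what row_number() sees in the outer data frame with the sizes of the nested tibbles. The names outer_ids and inner_sizes are illustrative, and the ungroup() call is an assumption added so the outer mutate() runs over all three rows regardless of the tidyr version.

```r
library(dplyr)
library(purrr)
library(tidyr)

nest_data <- mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  ungroup()

# Outer context: one row per cyl group, so row_number() yields one id per group
outer_ids <- nest_data %>%
  mutate(outer_id = row_number()) %>%
  pull(outer_id)

# Inner context: each nested tibble has its own number of rows
inner_sizes <- map_int(nest_data$data, nrow)

print(outer_ids)
print(inner_sizes)
```

A length-3 vector cannot be recycled into tibbles of 7, 11, or 14 rows, which is consistent with the error described above.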
Not using row_number() is also an option. Here rownames_to_column() can be used because, when no row names are defined, the row numbers are used as row names.
nest_data %>%
mutate(data = map(data, ~.x %>% rownames_to_column("row_id"))) %>%
{.[[1,2]]}
#> # A tibble: 7 x 11
#> row_id mpg disp hp drat wt qsec vs am gear carb
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 21 160 110 3.9 2.62 16.5 0 1 4 4
#> 2 2 21 160 110 3.9 2.88 17.0 0 1 4 4
#> 3 3 21.4 258 110 3.08 3.22 19.4 1 0 3 1
#> 4 4 18.1 225 105 2.76 3.46 20.2 1 0 3 1
#> 5 5 19.2 168. 123 3.92 3.44 18.3 1 0 4 4
#> 6 6 17.8 168. 123 3.92 3.44 18.9 1 0 4 4
#> 7 7 19.7 145 175 3.62 2.77 15.5 0 1 5 6
Created on 2018-09-20 by the reprex package (v0.2.0).
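Another option in the same spirit, sketched here as an assumption rather than something proposed in the thread, is to sidestep dplyr inside the mapped function entirely and assign the ids with base R. The helper name add_row_id is hypothetical. Unlike rownames_to_column(), this keeps row_id as an integer column.

```r
library(dplyr)
library(purrr)
library(tidyr)

nest_data <- mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  ungroup()

# Hypothetical helper: base R only, so no row_number() call is involved
add_row_id <- function(df) {
  df$row_id <- seq_len(nrow(df))
  df
}

with_ids <- nest_data %>%
  mutate(data = map(data, add_row_id))
```

Each element of with_ids$data then carries a row_id running from 1 to the number of rows in that group.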
I think there may be something to fix here, but it requires digging into the tidyeval mechanism inside dplyr, which is not trivial for me, all the more because I think it is somewhere in the C++ code of mutate().
I think this is fixed with dev dplyr thanks to @romainfrancois's new hybrid-eval implementation.
Oh I forgot about that. Indeed it is working!
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(purrr)
library(tidyr)
packageVersion("dplyr")
#> [1] '0.7.99.9000'
mtcars %>%
group_by(cyl) %>%
nest() -> nest_data
nest_data %>%
mutate(data = map(data, ~.x %>%
mutate(row_id = row_number())))
#> # A tibble: 3 x 2
#> cyl data
#> <dbl> <list>
#> 1 6 <tibble [7 × 11]>
#> 2 4 <tibble [11 × 11]>
#> 3 8 <tibble [14 × 11]>
Created on 2018-09-20 by the reprex package (v0.2.0).
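For code that has to run on both older and newer dplyr, one possibility is to pick the approach based on the installed version. This is only a sketch: the 0.7.99.9000 threshold is taken from the packageVersion() output above, and treating it as the first version with the fix is an assumption. The names has_fix and result are illustrative.

```r
library(dplyr)
library(purrr)
library(tidyr)

# Threshold copied from the reprex above; assuming it marks the fix is a guess
has_fix <- packageVersion("dplyr") >= "0.7.99.9000"

# Named function from the earlier workaround
add_row_number <- function(x) {
  x %>% mutate(row_id = row_number())
}

nest_data <- mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  ungroup()

result <- if (has_fix) {
  # Inline lambda, reported to work on dev dplyr
  nest_data %>% mutate(data = map(data, ~ .x %>% mutate(row_id = row_number())))
} else {
  # Fall back to the named-function workaround
  nest_data %>% mutate(data = map(data, add_row_number))
}
```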
Thanks for the prompt reply and solution @cderv @lionel- !
Just got a chance to update dplyr to the dev version, and it works as described.
As usual, learnt another new tip from the greats :+1:
> rownames_to_column() can be used because, when no row names are defined, the row numbers are used as row names
Similar to issue #541, but now on map(); reprex below.
@cderv, I tried your previous solution of explicitly defining a function, but no luck.