Closed (eipi10 closed 3 years ago)
Additional responses on RStudio Community (one by @szimmer, who filed issue #5733) indicate that my issue is likely the same problem described in #5733, #5739, and #5765. My reprex above was run with dplyr 1.0.4. After reading that this issue appears to be fixed in the development version, I installed the development version and the problem went away.
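Since the behaviour depends on which dplyr version is installed, a quick version check can help confirm which side of the fix a session is on. The following is only a sketch: `has_min_version()` is a hypothetical helper name, and the 1.0.5 threshold is taken from the comment in this thread that the fix is expected in that release, not from release notes.

```r
# Hypothetical helper: TRUE if `pkg` is installed and at least `min_version`.
has_min_version <- function(pkg, min_version) {
  # requireNamespace() returns FALSE (rather than erroring) if pkg is missing
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(FALSE)
  }
  # packageVersion() returns a 'package_version' object, which can be
  # compared directly with a version string
  utils::packageVersion(pkg) >= min_version
}

# Assumption from this thread: the fix is expected in dplyr 1.0.5
if (has_min_version("dplyr", "1.0.5")) {
  message("dplyr is at least 1.0.5")
} else {
  message("dplyr is missing or older than 1.0.5")
}
```

Checking the version first makes it easier to tell whether a failure is the known bug or something in the function itself.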
In case it helps in understanding this bug: with dplyr 1.0.4, I ran sessionInfo() before the first time I used my function and then again afterward (see reprex below). It turns out that two additional packages, fansi and utf8, are loaded via a namespace after the function is run for the first time. In the first call to sessionInfo(), you can see that there are 49 packages loaded via a namespace. In the second call to sessionInfo(), there are now 51 packages, and the two new ones are at positions 35 and 47.
library(tidyverse)
fnc = function(data, value.vars, group.vars=NULL) {
  data %>%
    group_by(across({{group.vars}})) %>%
    summarise(n=n(), across({{value.vars}},
                            list(mean=~mean(.x, na.rm=TRUE),
                                 n.miss=~sum(is.na(.x))),
                            .names="{.fn}_{.col}"))
}
sessionInfo()
#> R version 4.0.3 (2020-10-10)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Catalina 10.15.7
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] forcats_0.5.1 stringr_1.4.0 dplyr_1.0.4 purrr_0.3.4
#> [5] readr_1.4.0 tidyr_1.1.2 tibble_3.0.6 ggplot2_3.3.3
#> [9] tidyverse_1.3.0
#>
#> loaded via a namespace (and not attached):
#> [1] Rcpp_1.0.6 cellranger_1.1.0 pillar_1.4.7 compiler_4.0.3
#> [5] dbplyr_2.1.0 highr_0.8 tools_4.0.3 digest_0.6.27
#> [9] lubridate_1.7.9.2 jsonlite_1.7.2 evaluate_0.14 lifecycle_0.2.0
#> [13] gtable_0.3.0 pkgconfig_2.0.3 rlang_0.4.10 reprex_1.0.0
#> [17] cli_2.3.0 DBI_1.1.1 yaml_2.2.1 haven_2.3.1
#> [21] xfun_0.20 withr_2.4.1 xml2_1.3.2 httr_1.4.2
#> [25] styler_1.3.2 knitr_1.31 hms_1.0.0 generics_0.1.0
#> [29] fs_1.5.0 vctrs_0.3.6 grid_4.0.3 tidyselect_1.1.0
#> [33] glue_1.4.2 R6_2.5.0 readxl_1.3.1 rmarkdown_2.6
#> [37] modelr_0.1.8 magrittr_2.0.1 backports_1.2.1 scales_1.1.1
#> [41] ellipsis_0.3.1 htmltools_0.5.1.1 rvest_0.3.6 assertthat_0.2.1
#> [45] colorspace_2.0-0 stringi_1.5.3 munsell_0.5.0 broom_0.7.4
#> [49] crayon_1.4.0
mtcars %>% fnc(mpg)
#> # A tibble: 1 x 3
#> n mean_mpg n.miss_mpg
#> <int> <dbl> <int>
#> 1 32 20.1 0
sessionInfo()
#> R version 4.0.3 (2020-10-10)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Catalina 10.15.7
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] forcats_0.5.1 stringr_1.4.0 dplyr_1.0.4 purrr_0.3.4
#> [5] readr_1.4.0 tidyr_1.1.2 tibble_3.0.6 ggplot2_3.3.3
#> [9] tidyverse_1.3.0
#>
#> loaded via a namespace (and not attached):
#> [1] Rcpp_1.0.6 cellranger_1.1.0 pillar_1.4.7 compiler_4.0.3
#> [5] dbplyr_2.1.0 highr_0.8 tools_4.0.3 digest_0.6.27
#> [9] lubridate_1.7.9.2 jsonlite_1.7.2 evaluate_0.14 lifecycle_0.2.0
#> [13] gtable_0.3.0 pkgconfig_2.0.3 rlang_0.4.10 reprex_1.0.0
#> [17] cli_2.3.0 DBI_1.1.1 yaml_2.2.1 haven_2.3.1
#> [21] xfun_0.20 withr_2.4.1 xml2_1.3.2 httr_1.4.2
#> [25] styler_1.3.2 knitr_1.31 hms_1.0.0 generics_0.1.0
#> [29] fs_1.5.0 vctrs_0.3.6 grid_4.0.3 tidyselect_1.1.0
#> [33] glue_1.4.2 R6_2.5.0 fansi_0.4.2 readxl_1.3.1
#> [37] rmarkdown_2.6 modelr_0.1.8 magrittr_2.0.1 backports_1.2.1
#> [41] scales_1.1.1 ellipsis_0.3.1 htmltools_0.5.1.1 rvest_0.3.6
#> [45] assertthat_0.2.1 colorspace_2.0-0 utf8_1.1.4 stringi_1.5.3
#> [49] munsell_0.5.0 broom_0.7.4 crayon_1.4.0
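The comparison of the two sessionInfo() listings above can also be done programmatically instead of by eye. This is a minimal sketch using base R's loadedNamespaces(). The helper name `new_namespaces()` is hypothetical and not part of any package. It relies on R's lazy evaluation, so the expression is only run after the "before" snapshot is taken.

```r
# Hypothetical helper: report namespaces that get loaded while `expr` runs.
new_namespaces <- function(expr) {
  before <- loadedNamespaces()  # snapshot taken before expr is evaluated
  force(expr)                   # the promise is evaluated here, not earlier
  setdiff(loadedNamespaces(), before)
}

# Self-contained demonstration: loading a namespace that ships with R.
# The result depends on whether 'splines' was already loaded in the session.
print(new_namespaces(invisible(loadNamespace("splines"))))

# With the function from the reprex (requires tidyverse to be attached):
# new_namespaces(mtcars %>% fnc(mpg))
```

Under the setup described above, the commented-out call would be expected to report the same two namespaces that differ between the sessionInfo() listings, though the result will vary with package versions and with what is already loaded.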
I believe this was already fixed in #5765, as part of the soon-to-be-released 1.0.5. I'm now getting:
library(tidyverse)
fnc = function(data, value.vars, group.vars=NULL) {
  data %>%
    group_by(across({{group.vars}})) %>%
    summarise(n=n(), across({{value.vars}},
                            list(mean=~mean(.x, na.rm=TRUE),
                                 n.miss=~sum(is.na(.x))),
                            .names="{.fn}_{.col}"))
}
mtcars %>% fnc(mpg)
#> # A tibble: 1 x 3
#> n mean_mpg n.miss_mpg
#> <int> <dbl> <int>
#> 1 32 20.1 0
iris %>% fnc(c(Petal.Width, Sepal.Width), Species)
#> # A tibble: 3 x 6
#> Species n mean_Petal.Width n.miss_Petal.Wi… mean_Sepal.Width
#> <fct> <int> <dbl> <int> <dbl>
#> 1 setosa 50 0.246 0 3.43
#> 2 versic… 50 1.33 0 2.77
#> 3 virgin… 50 2.03 0 2.97
#> # … with 1 more variable: n.miss_Sepal.Width <int>
diamonds %>% fnc(c(x,y), c(cut, color))
#> `summarise()` has grouped output by 'cut'. You can override using the `.groups` argument.
#> # A tibble: 35 x 7
#> # Groups: cut [5]
#> cut color n mean_x n.miss_x mean_y n.miss_y
#> <ord> <ord> <int> <dbl> <int> <dbl> <int>
#> 1 Fair D 163 6.02 0 5.96 0
#> 2 Fair E 224 5.91 0 5.86 0
#> 3 Fair F 312 5.99 0 5.93 0
#> 4 Fair G 314 6.17 0 6.11 0
#> 5 Fair H 303 6.58 0 6.50 0
#> 6 Fair I 175 6.56 0 6.49 0
#> 7 Fair J 119 6.75 0 6.68 0
#> 8 Good D 662 5.62 0 5.63 0
#> 9 Good E 933 5.62 0 5.63 0
#> 10 Good F 909 5.69 0 5.71 0
#> # … with 25 more rows
iris %>% fnc(c(Petal.Width, Sepal.Width), Species)
#> # A tibble: 3 x 6
#> Species n mean_Petal.Width n.miss_Petal.Wi… mean_Sepal.Width
#> <fct> <int> <dbl> <int> <dbl>
#> 1 setosa 50 0.246 0 3.43
#> 2 versic… 50 1.33 0 2.77
#> 3 virgin… 50 2.03 0 2.97
#> # … with 1 more variable: n.miss_Sepal.Width <int>
diamonds %>% fnc(c(x,y))
#> # A tibble: 1 x 5
#> n mean_x n.miss_x mean_y n.miss_y
#> <int> <dbl> <int> <dbl> <int>
#> 1 53940 5.73 0 5.73 0
Created on 2021-03-04 by the reprex package (v0.3.0)
I'm having a bizarre problem in which a tidyeval function I wrote works fine the first time I run it with a particular data frame, but usually produces an error on subsequent attempts. I've provided two reprexes below, just to show a couple of different failure modes. This seems like a bug, but maybe there's a problem with my function.
I posted this as a question on RStudio Community. The lone responder thought he remembered a GitHub issue about this, but I haven't been able to find one.
Created on 2021-02-18 by the reprex package (v1.0.0)