tidyverse / dplyr

dplyr: A grammar of data manipulation
https://dplyr.tidyverse.org/

Rowwise function behaving abnormally #7094

Closed: jaymicro closed this issue 1 hour ago

jaymicro commented 4 hours ago

I am trying to compute a row-wise mean across multiple columns. The code below works fine:

df <- tibble(x = runif(6), y = runif(6), z = runif(6))
df %>% rowwise() %>% mutate(m = mean(c(x, y, z)))

However, when I use a colon to specify the range of columns, the result is not the row mean: m just contains the value from the first column (x).

df %>% rowwise() %>% mutate(m = mean(c(x:z)))

Below is a reprex for the issue:

library(tidyverse)
#> Warning: package 'tidyverse' was built under R version 4.4.1
#> Warning: package 'ggplot2' was built under R version 4.4.1
#> Warning: package 'tibble' was built under R version 4.4.1
#> Warning: package 'tidyr' was built under R version 4.4.1
#> Warning: package 'readr' was built under R version 4.4.1
#> Warning: package 'purrr' was built under R version 4.4.1
#> Warning: package 'dplyr' was built under R version 4.4.1
#> Warning: package 'stringr' was built under R version 4.4.1
#> Warning: package 'forcats' was built under R version 4.4.1
#> Warning: package 'lubridate' was built under R version 4.4.1
set.seed(111)
df <- tibble(x = runif(6), y = runif(6), z = runif(6))
df %>% rowwise() %>% mutate(m = mean(c(x, y, z)))
#> # A tibble: 6 × 4
#> # Rowwise: 
#>       x      y      z     m
#>   <dbl>  <dbl>  <dbl> <dbl>
#> 1 0.593 0.0107 0.0671 0.224
#> 2 0.726 0.532  0.0475 0.435
#> 3 0.370 0.432  0.156  0.320
#> 4 0.515 0.0937 0.446  0.352
#> 5 0.378 0.556  0.171  0.368
#> 6 0.418 0.590  0.967  0.658
df %>% rowwise() %>% mutate(m = mean(c(x:z)))
#> # A tibble: 6 × 4
#> # Rowwise: 
#>       x      y      z     m
#>   <dbl>  <dbl>  <dbl> <dbl>
#> 1 0.593 0.0107 0.0671 0.593
#> 2 0.726 0.532  0.0475 0.726
#> 3 0.370 0.432  0.156  0.370
#> 4 0.515 0.0937 0.446  0.515
#> 5 0.378 0.556  0.171  0.378
#> 6 0.418 0.590  0.967  0.418

Created on 2024-10-10 with reprex v2.1.1

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.0 (2024-04-24 ucrt)
#>  os       Windows 10 x64 (build 19045)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  English_Canada.utf8
#>  ctype    English_Canada.utf8
#>  tz       America/Vancouver
#>  date     2024-10-10
#>  pandoc   3.1.11 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  cli           3.6.3   2024-06-21 [1] CRAN (R 4.4.1)
#>  colorspace    2.1-1   2024-07-26 [1] CRAN (R 4.4.1)
#>  digest        0.6.36  2024-06-23 [1] CRAN (R 4.4.1)
#>  dplyr       * 1.1.4   2023-11-17 [1] CRAN (R 4.4.1)
#>  evaluate      0.24.0  2024-06-10 [1] CRAN (R 4.4.1)
#>  fansi         1.0.6   2023-12-08 [1] CRAN (R 4.4.1)
#>  fastmap       1.2.0   2024-05-15 [1] CRAN (R 4.4.1)
#>  forcats     * 1.0.0   2023-01-29 [1] CRAN (R 4.4.1)
#>  fs            1.6.4   2024-04-25 [1] CRAN (R 4.4.1)
#>  generics      0.1.3   2022-07-05 [1] CRAN (R 4.4.1)
#>  ggplot2     * 3.5.1   2024-04-23 [1] CRAN (R 4.4.1)
#>  glue          1.7.0   2024-01-09 [1] CRAN (R 4.4.1)
#>  gtable        0.3.5   2024-04-22 [1] CRAN (R 4.4.1)
#>  hms           1.1.3   2023-03-21 [1] CRAN (R 4.4.1)
#>  htmltools     0.5.8.1 2024-04-04 [1] CRAN (R 4.4.1)
#>  knitr         1.48    2024-07-07 [1] CRAN (R 4.4.1)
#>  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.4.1)
#>  lubridate   * 1.9.3   2023-09-27 [1] CRAN (R 4.4.1)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.4.1)
#>  munsell       0.5.1   2024-04-01 [1] CRAN (R 4.4.1)
#>  pillar        1.9.0   2023-03-22 [1] CRAN (R 4.4.1)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.4.1)
#>  purrr       * 1.0.2   2023-08-10 [1] CRAN (R 4.4.1)
#>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.4.1)
#>  readr       * 2.1.5   2024-01-10 [1] CRAN (R 4.4.1)
#>  reprex        2.1.1   2024-07-06 [1] CRAN (R 4.4.1)
#>  rlang         1.1.4   2024-06-04 [1] CRAN (R 4.4.1)
#>  rmarkdown     2.27    2024-05-17 [1] CRAN (R 4.4.1)
#>  rstudioapi    0.16.0  2024-03-24 [1] CRAN (R 4.4.1)
#>  scales        1.3.0   2023-11-28 [1] CRAN (R 4.4.1)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.4.1)
#>  stringi       1.8.4   2024-05-06 [1] CRAN (R 4.4.0)
#>  stringr     * 1.5.1   2023-11-14 [1] CRAN (R 4.4.1)
#>  tibble      * 3.2.1   2023-03-20 [1] CRAN (R 4.4.1)
#>  tidyr       * 1.3.1   2024-01-24 [1] CRAN (R 4.4.1)
#>  tidyselect    1.2.1   2024-03-11 [1] CRAN (R 4.4.1)
#>  tidyverse   * 2.0.0   2023-02-22 [1] CRAN (R 4.4.1)
#>  timechange    0.3.0   2024-01-18 [1] CRAN (R 4.4.1)
#>  tzdb          0.4.0   2023-05-12 [1] CRAN (R 4.4.1)
#>  utf8          1.2.4   2023-10-22 [1] CRAN (R 4.4.1)
#>  vctrs         0.6.5   2023-12-01 [1] CRAN (R 4.4.1)
#>  withr         3.0.1   2024-07-31 [1] CRAN (R 4.4.1)
#>  xfun          0.46    2024-07-18 [1] CRAN (R 4.4.1)
#>  yaml          2.3.10  2024-07-26 [1] CRAN (R 4.4.1)
#>
#>  [1] C:/Users/jsingh/Documents/R/win-library/4.4
#>  [2] C:/Program Files/R/R-4.4.0/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
```

joranE commented 2 hours ago

I believe the intent is for people to use c_across in those circumstances when you want the tidyselect semantics:

df %>% rowwise() %>% mutate(m = mean(c_across(x:z)))
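
For anyone wondering why the colon version returns the first column: inside mutate() on a rowwise data frame, x and z are each a single number for the current row, so x:z is evaluated as base R's sequence operator, not as a column range. With values between 0 and 1, the sequence that starts at x and steps by 1 toward z contains only x itself, so the mean is just x. The sketch below reuses the df from the reprex; the column names m_colon, m_across and m_manual are illustrative only.

```r
library(dplyr)

set.seed(111)
df <- tibble(x = runif(6), y = runif(6), z = runif(6))

# Base R `:` on two scalars builds a sequence in steps of 1.
# Both endpoints are in [0, 1), so only the starting value survives.
0.593:0.0671
#> [1] 0.593

res <- df %>%
  rowwise() %>%
  mutate(
    m_colon  = mean(c(x:z)),         # base R sequence: length 1, equals x
    m_across = mean(c_across(x:z)),  # tidyselect range: columns x, y, z
    m_manual = mean(c(x, y, z))      # explicit columns, for comparison
  ) %>%
  ungroup()

all.equal(res$m_across, res$m_manual)  # c_across matches the explicit version
all.equal(res$m_colon, res$x)          # the colon version just reproduces x
```

Both comparisons should come out TRUE for this data, which matches the output in the reprex above.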
jaymicro commented 1 hour ago

That makes sense. Thank you!
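
A possible follow-up for larger data: when the row-wise summary is a plain mean, a vectorised alternative may avoid rowwise() entirely. This is only a sketch, assuming dplyr >= 1.1.0 (for pick()) and the same df as in the reprex; the names fast and slow are illustrative.

```r
library(dplyr)

set.seed(111)
df <- tibble(x = runif(6), y = runif(6), z = runif(6))

# pick() returns the selected columns as a data frame, which base R's
# rowMeans() reduces in a single vectorised call.
fast <- df %>%
  mutate(m = unname(rowMeans(pick(x:z))))

# The rowwise() + c_across() version from the thread, for comparison.
slow <- df %>%
  rowwise() %>%
  mutate(m = mean(c_across(x:z))) %>%
  ungroup()

all.equal(fast$m, slow$m)
```

The two approaches should agree; rowwise() with c_across() remains the more general option when the per-row function has no vectorised equivalent.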