markfairbanks / tidytable

Tidy interface to 'data.table'
https://markfairbanks.github.io/tidytable/

Error "mutate_rowwise()" several columns. #805

Closed (coforfe closed this issue 6 months ago)

coforfe commented 6 months ago

Hello Mark,

I am trying to sum several columns row-wise, and I found some unexpected behaviour. It works when I list every column explicitly, but not when I use : between the first and the last column:

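library(tidytable)
iris %>%
  # every column listed explicitly: gives the expected row sums
  mutate_rowwise(sumall = sum(c(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width))) %>%
  # first and last column joined with `:` : gives different, unexpected values
  mutate_rowwise(sumtwo = sum(c(Sepal.Length:Petal.Width))) %>%
  mutate_rowwise(sumtre = sum(Sepal.Length:Petal.Width))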

Which produces this output:

# A tidytable: 150 × 8
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species sumall sumtwo sumtre
          <dbl>       <dbl>        <dbl>       <dbl> <fct>    <dbl>  <dbl>  <dbl>
 1          5.1         3.5          1.4         0.2 setosa    10.2   15.5   15.5
 2          4.9         3            1.4         0.2 setosa     9.5   14.5   14.5
 3          4.7         3.2          1.3         0.2 setosa     9.4   13.5   13.5
 4          4.6         3.1          1.5         0.2 setosa     9.4   13     13  
 5          5           3.6          1.4         0.2 setosa    10.2   15     15  
 6          5.4         3.9          1.7         0.4 setosa    11.4   17.4   17.4
 7          4.6         3.4          1.4         0.3 setosa     9.7   13     13  
 8          5           3.4          1.5         0.2 setosa    10.1   15     15  
 9          4.4         2.9          1.4         0.2 setosa     8.9   12     12  
10          4.9         3.1          1.5         0.1 setosa     9.6   14.5   14.5
# ℹ 140 more rows
# ℹ Use `print(n = ...)` to see more rows

So, it seems that : is not valid...

Thanks, Carlos.

markfairbanks commented 6 months ago

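You're trying to use the colon : in a column selection context (like select(col1:col4)). Inside mutate_rowwise() the expression is evaluated as ordinary R code, so Sepal.Length:Petal.Width builds a numeric sequence from one value to the other instead of selecting columns (for the first row, 5.1:0.2 gives 5.1, 4.1, 3.1, 2.1, 1.1, which sums to 15.5). To get column selection inside of a mutate() you need to use pick() or across().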
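The values in the sumtwo and sumtre columns can be reproduced in base R without tidytable. This is only a minimal sketch: the variable names are hypothetical, and the numbers are copied from the first two rows of the output posted above.

```r
# Sepal.Length and Petal.Width for rows 1 and 2 of iris, as plain scalars
# (hypothetical variable names, values taken from the printed output)
sepal_length_row1 <- 5.1
petal_width_row1  <- 0.2
sepal_length_row2 <- 4.9
petal_width_row2  <- 0.2

# With two numbers, `:` is the sequence operator: it steps by 1 from the
# left value toward the right value. It does not select columns here.
seq_row1 <- sepal_length_row1:petal_width_row1
seq_row2 <- sepal_length_row2:petal_width_row2

print(seq_row1)
print(sum(seq_row1))  # matches sumtwo/sumtre in row 1 (15.5)
print(sum(seq_row2))  # matches sumtwo/sumtre in row 2 (14.5)
```

Each sequence has five elements because it counts down in steps of 1 until the next step would pass the right-hand value.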

Also note that using rowSums() will be much faster in this case than a mutate_rowwise() call.

pacman::p_load(tidytable)

iris %>%
  mutate_rowwise(sumall = sum(c(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)),
                 sumall2 = sum(pick(Sepal.Length:Petal.Width))) %>%
  mutate(sumall3 = rowSums(pick(Sepal.Length:Petal.Width))) %>%
  select(starts_with("sum"))
#> # A tidytable: 150 × 3
#>    sumall sumall2 sumall3
#>     <dbl>   <dbl>   <dbl>
#>  1   10.2    10.2    10.2
#>  2    9.5     9.5     9.5
#>  3    9.4     9.4     9.4
#>  4    9.4     9.4     9.4
#>  5   10.2    10.2    10.2
#>  6   11.4    11.4    11.4
#>  7    9.7     9.7     9.7
#>  8   10.1    10.1    10.1
#>  9    8.9     8.9     8.9
#> 10    9.6     9.6     9.6
#> # ℹ 140 more rows
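As a rough base R illustration of the rowSums() point, the sketch below compares one vectorised call with a sum computed separately for each row. It assumes only the built-in iris data. The object names are hypothetical, and apply() stands in for the row-by-row approach rather than for what mutate_rowwise() does internally.

```r
# The four numeric columns of the built-in iris data set
num_cols <- iris[, c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")]

# Vectorised: a single call handles all rows at once
vectorised <- rowSums(num_cols)

# Row by row: sum() is called once per row (150 calls here)
row_by_row <- apply(num_cols, 1, sum)

# Both approaches give the same totals
print(all.equal(unname(vectorised), unname(row_by_row)))
print(unname(head(vectorised, 3)))
```

Both give the same totals. The difference is that rowSums() does the work in a single call, while the row-by-row version calls sum() once per row.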

Hope this helps - if you have any questions let me know.

coforfe commented 6 months ago

Thanks, Mark!

I was not aware of pick().

Thanks again, Carlos