markfairbanks / tidytable

Tidy interface to 'data.table'
https://markfairbanks.github.io/tidytable/

class is not respected in a grouped mutate #578

Closed bnicenboim closed 2 years ago

bnicenboim commented 2 years ago

Hi, I actually know why this is happening, but I'm not sure whether it's fixable (without slowing down the grouped mutate). If it isn't, it's worth emitting a message or a warning when it happens.

suppressMessages(library(tidytable))
df <- data.frame(a= 1:10, b= 1:10, c= 1:10)
# class is not respected here
df %>% 
  mutate.(a = a > 5, .by = "c")
#> # A tidytable: 10 × 3
#>        a     b     c
#>    <int> <int> <int>
#>  1     0     1     1
#>  2     0     2     2
#>  3     0     3     3
#>  4     0     4     4
#>  5     0     5     5
#>  6     1     6     6
#>  7     1     7     7
#>  8     1     8     8
#>  9     1     9     9
#> 10     1    10    10
# class is respected here:
df %>% 
  mutate.(a = a > 5)
#> # A tidytable: 10 × 3
#>    a         b     c
#>    <lgl> <int> <int>
#>  1 FALSE     1     1
#>  2 FALSE     2     2
#>  3 FALSE     3     3
#>  4 FALSE     4     4
#>  5 FALSE     5     5
#>  6 TRUE      6     6
#>  7 TRUE      7     7
#>  8 TRUE      8     8
#>  9 TRUE      9     9
#> 10 TRUE     10    10

Created on 2022-08-31 with reprex v2.0.2

I do get a warning if I do this (which also doesn't work):

df %>% 
  mutate.(a = as.character(a), .by = "c")

markfairbanks commented 2 years ago

Unfortunately I can't fix this in tidytable, but if you want you could open an issue in the data.table repo. The actual computation is done by data.table, so tidytable behaves however data.table handles it.
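
For readers who want to see this at the data.table level, here is a minimal sketch. It assumes the grouped `mutate.()` ends up as a grouped `:=` assignment, which this thread does not show; the names `dt_plain` and `dt_grouped` are made up for illustration.

```r
library(data.table)

# Same data as the reprex above; both table names are illustrative.
dt_plain <- data.table(a = 1:10, b = 1:10, c = 1:10)
dt_grouped <- copy(dt_plain)

# Ungrouped: the whole column is replaced at once, so it takes the
# type of the new values (logical).
dt_plain[, a := a > 5]
class(dt_plain$a)

# Grouped: results are written group by group into the existing integer
# column. The reprex above reports that the column stays <int> here.
dt_grouped[, a := a > 5, by = c]
class(dt_grouped$a)
```

Comparing the two `class()` results should show whether the data.table version in use behaves the same way as in the reprex.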

bnicenboim commented 2 years ago

Hmm, yeah, but given that data.table modifies by reference, it's clearer there what's going on...

You can still do what I intended with data.table like this:

dt <- data.table::as.data.table(df)  # assumes df from the reprex above
dt[, .(a = a > 5), by = c][]

(In any case, as I said I kind of assumed this wasn't fixable...)
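
A sketch of two workarounds, for anyone landing here. The first is the summarising call from the comment above, which returns a new table holding only the grouping column and `a`. The second keeps every column by assigning to a new column first; `a_lgl` is a hypothetical temporary name, and this is one possible approach rather than something either package recommends.

```r
library(data.table)

# Same data as the reprex; `a_lgl` is a hypothetical temporary column name.
dt <- data.table(a = 1:10, b = 1:10, c = 1:10)

# Summarising form: builds a new table with columns c and a only (b is dropped).
res <- dt[, .(a = a > 5), by = c]
names(res)
class(res$a)

# Keep all columns: a new column has no existing type to be coerced to,
# so it is created as logical. Then swap it in for the old column.
dt[, a_lgl := a > 5, by = c]
dt[, a := NULL]
setnames(dt, "a_lgl", "a")
setcolorder(dt, c("a", "b", "c"))
class(dt$a)
```

The second form costs an extra column drop and rename, but it avoids writing logical results into the existing integer column.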