tidyverse / dtplyr

Data table backend for dplyr
https://dtplyr.tidyverse.org

`filter(.by = )` leads to incorrect results when `NA`s are present #474

Open markfairbanks opened 2 months ago

markfairbanks commented 2 months ago
```r
library(dtplyr)
library(dplyr)

df <- tibble(x = c(1, 2, NA), y = c("a", "a", "b"))

lazy_dt(df) %>%
  filter(x != 2, .by = y)
#> Source: local data table [2 x 2]
#> Call:   `_DT1`[`_DT1`[, .I[x != 2], by = .(y)]$V1]
#> 
#>       x y    
#>   <dbl> <chr>
#> 1     1 a    
#> 2    NA <NA> 
#> 
#> # Use as.data.table()/as.data.frame()/as_tibble() to access results

df %>%
  filter(x != 2, .by = y)
#> # A tibble: 1 × 2
#>       x y    
#>   <dbl> <chr>
#> 1     1 a
```
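
A likely explanation, sketched by running the generated call directly in data.table: in group `"b"` the comparison `x != 2` evaluates to `NA`, so `.I[x != 2]` returns an `NA` index, and subsetting a data.table by an `NA` index produces a row of `NA`s instead of dropping the row. The `which()` variant below is only a hypothetical way to drop those `NA`s, not necessarily how dtplyr should translate the call; the names `dt`, `idx`, and `idx_fixed` are illustrative.

```r
library(data.table)

dt <- data.table(x = c(1, 2, NA), y = c("a", "a", "b"))

# Same index expression as in the generated call above.
# Group "b" contributes an NA index because NA != 2 is NA.
idx <- dt[, .I[x != 2], by = .(y)]$V1
idx
#> [1]  1 NA

# Subsetting by an NA index yields an all-NA row, so two rows come back.
nrow(dt[idx])
#> [1] 2

# Hypothetical fix: which() keeps only the positions that are TRUE,
# so NA comparisons are dropped, matching dplyr::filter() semantics.
idx_fixed <- dt[, .I[which(x != 2)], by = .(y)]$V1
nrow(dt[idx_fixed])
#> [1] 1
```

With `which()`, the result has the single row `x = 1, y = "a"`, the same as the plain dplyr output.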