tidyverse / dplyr

dplyr: A grammar of data manipulation
https://dplyr.tidyverse.org/

`if_all` and `if_any` behaving oddly with `!` operator? #6619

Closed H-Mateus closed 1 year ago

H-Mateus commented 1 year ago

Apologies if this is me just being thick, but whilst trying to filter out rows where all numeric cols are 0, I noticed that using `!= 0` seems to invert the behaviour of `if_all` and `if_any`.

I'm not sure if this is the expected behaviour, but I found that negating the whole call with `!` and using `== 0` works as I assumed it should:

library(dplyr, warn.conflicts = FALSE)

df <- iris

# Make a row where all numerics are 0
df[1, 1:4] <- 0
# Make a row where one numeric is 0
df[2,1] <- 0

head(df)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          0.0         0.0          0.0         0.0  setosa
#> 2          0.0         3.0          1.4         0.2  setosa
#> 3          4.7         3.2          1.3         0.2  setosa
#> 4          4.6         3.1          1.5         0.2  setosa
#> 5          5.0         3.6          1.4         0.2  setosa
#> 6          5.4         3.9          1.7         0.4  setosa

# This should only filter out the first row of all 0s, instead it does what
# if_any should do?
df %>%
  dplyr::filter(if_all(where(is.numeric), ~ . != 0)) %>%
  head(3)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          4.7         3.2          1.3         0.2  setosa
#> 2          4.6         3.1          1.5         0.2  setosa
#> 3          5.0         3.6          1.4         0.2  setosa

# But if_any should filter out the first 2 rows, instead it does what if_all
# should do?
df %>%
  dplyr::filter(if_any(where(is.numeric), ~ . != 0)) %>%
  head(3)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          0.0         3.0          1.4         0.2  setosa
#> 2          4.7         3.2          1.3         0.2  setosa
#> 3          4.6         3.1          1.5         0.2  setosa

# the following work how I'd expect though
df %>%
  dplyr::filter(!if_all(where(is.numeric), ~ . == 0)) %>%
  head(3)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          0.0         3.0          1.4         0.2  setosa
#> 2          4.7         3.2          1.3         0.2  setosa
#> 3          4.6         3.1          1.5         0.2  setosa
df %>%
  dplyr::filter(!if_any(where(is.numeric), ~ . == 0)) %>%
  head(3)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          4.7         3.2          1.3         0.2  setosa
#> 2          4.6         3.1          1.5         0.2  setosa
#> 3          5.0         3.6          1.4         0.2  setosa

sessionInfo()
#> R version 4.2.1 (2022-06-23)
#> Platform: aarch64-apple-darwin20 (64-bit)
#> Running under: macOS Monterey 12.6
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib
#> 
#> locale:
#> [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] dplyr_1.0.10
#> 
#> loaded via a namespace (and not attached):
#>  [1] compiler_4.2.1    pillar_1.8.1      highr_0.9         R.methodsS3_1.8.2
#>  [5] R.utils_2.12.2    tools_4.2.1       digest_0.6.31     evaluate_0.19    
#>  [9] lifecycle_1.0.3   tibble_3.1.8      R.cache_0.16.0    pkgconfig_2.0.3  
#> [13] rlang_1.0.6       reprex_2.0.2      cli_3.4.1         DBI_1.1.3        
#> [17] rstudioapi_0.14   yaml_2.3.6        xfun_0.35         fastmap_1.1.0    
#> [21] withr_2.5.0       styler_1.8.1      stringr_1.5.0     knitr_1.41       
#> [25] generics_0.1.3    fs_1.5.2          vctrs_0.5.1       tidyselect_1.2.0 
#> [29] glue_1.6.2        R6_2.5.1          fansi_1.0.3       rmarkdown_2.19   
#> [33] purrr_0.3.5       magrittr_2.0.3    htmltools_0.5.4   assertthat_0.2.1 
#> [37] utf8_1.2.2        stringi_1.7.8     R.oo_1.25.0

Created on 2022-12-19 with reprex v2.0.2

hadley commented 1 year ago

`!if_any(where(is.numeric), ~ . == 0)` is equivalent to `if_all(where(is.numeric), ~ . != 0)` because of De Morgan's laws.
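
For anyone landing here later, a minimal sketch of both De Morgan equivalences, reusing the modified `iris` data from the reprex above. The names `no_zeros_*` and `not_all_zero_*` are made up for illustration, and it assumes a dplyr version that has `if_any()` and `if_all()` (1.0.4 or later).

```r
library(dplyr, warn.conflicts = FALSE)

df <- iris
df[1, 1:4] <- 0  # row 1: every numeric column is 0
df[2, 1] <- 0    # row 2: exactly one numeric column is 0

# NOT(any column == 0) is the same as ALL(columns != 0).
# Both drop every row containing at least one 0 (rows 1 and 2).
no_zeros_a <- df %>% filter(!if_any(where(is.numeric), ~ . == 0))
no_zeros_b <- df %>% filter(if_all(where(is.numeric), ~ . != 0))
stopifnot(identical(no_zeros_a, no_zeros_b))

# NOT(all columns == 0) is the same as ANY(column != 0).
# Both drop only the rows where every numeric column is 0 (row 1).
not_all_zero_a <- df %>% filter(!if_all(where(is.numeric), ~ . == 0))
not_all_zero_b <- df %>% filter(if_any(where(is.numeric), ~ . != 0))
stopifnot(identical(not_all_zero_a, not_all_zero_b))

nrow(no_zeros_a)
#> [1] 148
nrow(not_all_zero_a)
#> [1] 149
```

So to remove only the rows where all numeric columns are 0, either `if_any(where(is.numeric), ~ . != 0)` or `!if_all(where(is.numeric), ~ . == 0)` should do it.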