tidyverse / dplyr

dplyr: A grammar of data manipulation
https://dplyr.tidyverse.org/
Other
4.75k stars 2.12k forks source link

filter() with two conditions freezes R #4049

Closed prokulski closed 5 years ago

prokulski commented 5 years ago

This

iris %>% filter(Sepal.Length > 7) %>% filter(Petal.Length < 6)

works fine (gives two rows), but this one:

iris %>% filter(Sepal.Length > 7, Petal.Length < 6)

freezes R...

I'm using dplyr v0.7.99.9000, installed from github just before test.

> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.1 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] dplyr_0.7.99.9000

loaded via a namespace (and not attached):
 [1] tidyselect_0.2.5   compiler_3.5.1     magrittr_1.5       assertthat_0.2.0   R6_2.3.0           tools_3.5.1       
 [7] pillar_1.3.1.9000  glue_1.3.0         rstudioapi_0.8     tibble_1.4.99.9006 crayon_1.3.4       Rcpp_1.0.0        
[13] packrat_0.5.0      pkgconfig_2.0.2    rlang_0.3.0.9002   purrr_0.2.5.9000  
batpigandme commented 5 years ago

Shouldn't freeze R, but, FWIW, the filter() syntax uses an ampersand for multiple criteria as you're using it, I believe:

library(dplyr)
iris %>% filter(Sepal.Length > 7 & Petal.Length < 6)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#> 1          7.1           3          5.9         2.1 virginica
#> 2          7.2           3          5.8         1.6 virginica

Created on 2018-12-21 by the reprex package (v0.2.1.9000)

prokulski commented 5 years ago

ampersand works. Stragne, somedays ago comma also worked fine

krlmlr commented 5 years ago

Works for me on a fresh install of bbd84bc. Can you please reinstall and try again?

romainfrancois commented 5 years ago

Also works for me too. Can you run this through a reprex to make sure something else in the session is not interfering:

library(dplyr)

iris %>% 
  filter(Sepal.Length > 7, Petal.Length < 6)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#> 1          7.1           3          5.9         2.1 virginica
#> 2          7.2           3          5.8         1.6 virginica

iris %>% 
  filter(Sepal.Length > 7) %>% 
  filter(Petal.Length < 6)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#> 1          7.1           3          5.9         2.1 virginica
#> 2          7.2           3          5.8         1.6 virginica

Created on 2018-12-21 by the reprex package (v0.2.1.9000)

prokulski commented 5 years ago

Still the same :( Upgraded R to 3.5.2, all packages, and nothing :( Console:


> library(dplyr)

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union

> 
> iris %>% 
+   filter(Sepal.Length > 7) %>% 
+   filter(Petal.Length < 6)
  Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
1          7.1           3          5.9         2.1 virginica
2          7.2           3          5.8         1.6 virginica
> 
> iris %>% 
+   filter(Sepal.Length > 7 & Petal.Length < 6)
  Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
1          7.1           3          5.9         2.1 virginica
2          7.2           3          5.8         1.6 virginica
> 
> iris %>% 
+   filter(Sepal.Length > 7, Petal.Length < 6)

and then it stops

yutannihilation commented 5 years ago

This reproduces on my Windows with the current master (45894214a543e63c01f66e8b1fb4cc77a5978f40).

romainfrancois commented 5 years ago

I can reproduce with level rlang, @lionel-

lionel- commented 5 years ago

I'll see if that's the same bug reported by Davis.

lionel- commented 5 years ago

Confirmed that this is the same symptom as reported by @DavisVaughan. The infloop has been introduced by https://github.com/r-lib/rlang/commit/6ebe1e2296513fe123e23f6a152ae33df2422903, but is ultimately due to dplyr bypassing rlang's tidy eval API.

romainfrancois commented 5 years ago

Thanks. Will have a look in the next few days.

lionel- commented 5 years ago

I got this far: https://github.com/tidyverse/dplyr/compare/master...lionel-:fix-tidyeval-infloop

We should:

I think the overall approach is sound since that fixes the infloop and all mutate and summarise tests pass. However I get some failures in arrange() tests and elsewhere. Romain could you take a look please?

Edit: Will need to replace the Rcpp::unwindProtect() call by Rcpp::Rcpp_fast_eval(<rlang::eval_tidy(quote(<expr>), <env>>) on R < 3.5.

prokulski commented 5 years ago

I found, that R freezes in case of use top_n() also. I am guessing that the problem is on the rlang side.

lionel- commented 5 years ago

The problem is on the dplyr side, see previous message.

lock[bot] commented 5 years ago

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/