ProjectMOSAIC / mosaic

Project MOSAIC R package
http://mosaic-web.org/
93 stars 26 forks source link

Maybe a bug? #751

Closed NMarkgraf closed 4 years ago

NMarkgraf commented 4 years ago

Hi!

The following code(-example) runs perfectly under mosiac 1.5 but raises an error in moasic 1.6:

library(mosaic)
df <- data.frame(x = 1:10, y= 1:10)
df %>% cor(x ~ y)

as a workaround I found:

library(mosaic)
df <- data.frame(x = 1:10, y= 1:10)
df %>% cor(x ~ y, data=.)

Is this a bug or a feature? And if so, where is it documented? ;-)

Yours Norman

rpruim commented 4 years ago

I cannot reproduce your 1.5 behavior. Can you use reprex::reprex() and include some session info?

In any case, I don't think we ever supported or documented your hoped for use case. Your "workaround" is the correct way to do this.

library(mosaic)
df <- data.frame(x = 1:10, y= 1:10)
df %>% cor(x ~ y)
#> Error in stats::cor(x, y, ...): 'y' must be numeric

sessionInfo()
#> R version 3.6.2 (2019-12-12)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 18.04.3 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] mosaic_1.5.0      Matrix_1.2-18     mosaicData_0.17.0 ggformula_0.9.2  
#> [5] ggstance_0.3.3    ggplot2_3.2.1     lattice_0.20-38   dplyr_0.8.4      
#> 
#> loaded via a namespace (and not attached):
#>  [1] tidyselect_1.0.0  xfun_0.12         purrr_0.3.3       splines_3.6.2    
#>  [5] colorspace_1.4-1  vctrs_0.2.2       generics_0.0.2    htmltools_0.4.0  
#>  [9] yaml_2.2.1        rlang_0.4.4       pillar_1.4.2      later_1.0.0      
#> [13] glue_1.3.1        withr_2.1.2       lifecycle_0.1.0   mosaicCore_0.6.0 
#> [17] stringr_1.4.0     munsell_0.5.0     gtable_0.3.0      htmlwidgets_1.5.1
#> [21] evaluate_0.14     knitr_1.27        fastmap_1.0.1     httpuv_1.5.2     
#> [25] crosstalk_1.0.0   highr_0.8         broom_0.5.4       Rcpp_1.0.3       
#> [29] readr_1.3.1       xtable_1.8-4      scales_1.0.0      backports_1.1.5  
#> [33] promises_1.1.0    leaflet_2.0.3     mime_0.8          gridExtra_2.3    
#> [37] hms_0.5.3         digest_0.6.23     stringi_1.4.5     ggrepel_0.8.1    
#> [41] shiny_1.4.0       grid_3.6.2        tools_3.6.2       magrittr_1.5     
#> [45] lazyeval_0.2.2    tibble_2.1.3      ggdendro_0.1-20   crayon_1.3.4     
#> [49] tidyr_1.0.2       pkgconfig_2.0.2   MASS_7.3-51.5     assertthat_0.2.1 
#> [53] rmarkdown_2.1     R6_2.4.1          nlme_3.1-143      compiler_3.6.2

Created on 2020-03-07 by the reprex package (v0.3.0)

NMarkgraf commented 4 years ago

Dear Randall,

strange, very strange. -- But you are right! My code was a little bit too short!

Here is a better example, with sessenInfo etc. First with mosiac 1.5.0 installed and then with mosaic 1.6.0:

suppressMessages(library(mosaic))
df <- data.frame(x = 1:10, y = 1:10)
df %>% summarise(cor(x ~ y))
#>   cor(x ~ y)
#> 1          1

sessionInfo()
#> R version 3.6.2 (2019-12-12)
#> Platform: x86_64-apple-darwin15.6.0 (64-bit)
#> Running under: macOS High Sierra 10.13.6
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
#> 
#> locale:
#> [1] de_DE.UTF-8/de_DE.UTF-8/de_DE.UTF-8/C/de_DE.UTF-8/de_DE.UTF-8
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] mosaic_1.5.0      Matrix_1.2-18     mosaicData_0.17.0 ggformula_0.9.4  
#> [5] ggstance_0.3.3    ggplot2_3.2.1     lattice_0.20-38   dplyr_0.8.4      
#> 
#> loaded via a namespace (and not attached):
#>  [1] tidyselect_1.0.0  xfun_0.12         purrr_0.3.3       splines_3.6.2    
#>  [5] colorspace_1.4-1  vctrs_0.2.3       generics_0.0.2    htmltools_0.4.0  
#>  [9] yaml_2.2.1        rlang_0.4.4       later_1.0.0       pillar_1.4.3     
#> [13] glue_1.3.1        withr_2.1.2       tweenr_1.0.1      lifecycle_0.1.0  
#> [17] mosaicCore_0.6.0  stringr_1.4.0     munsell_0.5.0     gtable_0.3.0     
#> [21] htmlwidgets_1.5.1 evaluate_0.14     knitr_1.28        fastmap_1.0.1    
#> [25] httpuv_1.5.2      crosstalk_1.0.0   highr_0.8         broom_0.5.4      
#> [29] Rcpp_1.0.3        readr_1.3.1       xtable_1.8-4      promises_1.1.0   
#> [33] scales_1.1.0      backports_1.1.5   leaflet_2.0.3     mime_0.9         
#> [37] farver_2.0.3      gridExtra_2.3     hms_0.5.3         ggforce_0.3.1    
#> [41] digest_0.6.24     stringi_1.4.5     shiny_1.4.0       ggrepel_0.8.1    
#> [45] polyclip_1.10-0   grid_3.6.2        tools_3.6.2       magrittr_1.5     
#> [49] lazyeval_0.2.2    tibble_2.1.3      ggdendro_0.1-20   crayon_1.3.4     
#> [53] tidyr_1.0.2       pkgconfig_2.0.3   MASS_7.3-51.5     assertthat_0.2.1 
#> [57] rmarkdown_2.1     R6_2.4.1          nlme_3.1-143      compiler_3.6.2

Created on 2020-03-07 by the reprex package (v0.3.0)

And

suppressMessages(library(mosaic))
df <- data.frame(x = 1:10, y = 1:10)
df %>% summarise(cor(x ~ y))
#> Error in stats::cor(x, y, ...): supply both 'x' and 'y' or a matrix-like 'x'

sessionInfo()
#> R version 3.6.2 (2019-12-12)
#> Platform: x86_64-apple-darwin15.6.0 (64-bit)
#> Running under: macOS High Sierra 10.13.6
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
#> 
#> locale:
#> [1] de_DE.UTF-8/de_DE.UTF-8/de_DE.UTF-8/C/de_DE.UTF-8/de_DE.UTF-8
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] mosaic_1.6.0      Matrix_1.2-18     mosaicData_0.17.0 ggformula_0.9.4  
#> [5] ggstance_0.3.3    ggplot2_3.2.1     lattice_0.20-38   dplyr_0.8.4      
#> 
#> loaded via a namespace (and not attached):
#>  [1] tidyselect_1.0.0  xfun_0.12         purrr_0.3.3       splines_3.6.2    
#>  [5] colorspace_1.4-1  vctrs_0.2.3       generics_0.0.2    htmltools_0.4.0  
#>  [9] yaml_2.2.1        rlang_0.4.4       later_1.0.0       pillar_1.4.3     
#> [13] glue_1.3.1        withr_2.1.2       tweenr_1.0.1      lifecycle_0.1.0  
#> [17] mosaicCore_0.6.0  stringr_1.4.0     munsell_0.5.0     gtable_0.3.0     
#> [21] htmlwidgets_1.5.1 evaluate_0.14     knitr_1.28        fastmap_1.0.1    
#> [25] httpuv_1.5.2      crosstalk_1.0.0   highr_0.8         broom_0.5.4      
#> [29] Rcpp_1.0.3        readr_1.3.1       xtable_1.8-4      promises_1.1.0   
#> [33] scales_1.1.0      backports_1.1.5   leaflet_2.0.3     mime_0.9         
#> [37] farver_2.0.3      gridExtra_2.3     hms_0.5.3         ggforce_0.3.1    
#> [41] digest_0.6.24     stringi_1.4.5     shiny_1.4.0       ggrepel_0.8.1    
#> [45] polyclip_1.10-0   grid_3.6.2        tools_3.6.2       magrittr_1.5     
#> [49] lazyeval_0.2.2    tibble_2.1.3      ggdendro_0.1-20   crayon_1.3.4     
#> [53] tidyr_1.0.2       pkgconfig_2.0.3   MASS_7.3-51.5     assertthat_0.2.1 
#> [57] rmarkdown_2.1     R6_2.4.1          nlme_3.1-143      compiler_3.6.2

Created on 2020-03-07 by the reprex package (v0.3.0)

Sorry for the wrong example, my fault!

But as you may see now, the only difference (I see) is mosiac (1.5.0 works, 1.6.0 raises an error).

Thanks in advance Norman

rpruim commented 4 years ago

This makes more sense now and should be considered a bug.

Looks like it was introduced in the conversion from lazyeval to rlang. Apparently rlang::eval_tidy()'s default for the env argument doesn't work quite the same way that the lazyeval equivalent did.

Explicitly setting that environment fixes the problem:

suppressMessages(library(mosaic))
data.frame(x= 1:10, y = 1:10) %>% summarise(cor = cor(y ~ x))
#>   cor
#> 1   1

Created on 2020-03-07 by the reprex package (v0.3.0)

I'll have to do a bit more testing (and see if there are any other places affected by this), but this looks like it can be easily fixed. You can try it now with

devtools::install_github('ProjectMOSAIC/mosaic', ref = 'beta')
NMarkgraf commented 4 years ago

Thank you very much!

I've tried the beta and it works perfectly for me.

rpruim commented 4 years ago

I've modified the implementation (using environment determined by the formula rather than by the caller) and created the following unit test:

  expect_equivalent(
    data.frame(x = 1:10, y = 1:10) %>% 
      summarise(mean = mean(~x), cor = cor(y ~ x)),
    data.frame(x = 1:10, y = 1:10) %>% 
      summarise(mean = base::mean(x), cor = stats::cor(y, x))
  )
NMarkgraf commented 4 years ago

Perfect. - Thanks!