ProjectMOSAIC / mosaic

Project MOSAIC R package
http://mosaic-web.org/
93 stars 26 forks source link

tally() doesn't look in surrounding environment #171

Open dtkaplan opened 11 years ago

dtkaplan commented 11 years ago

The first tally statement throws an error.

mod = lm( wage ~ sector*sex+ educ + exper, data=CPS85 )
real = r.squared(mod)
real

Now go to Planet Null

s = do(1000)*lm(wage~shuffle(sector)*sex + educ + exper, data=CPS85)
densityplot(~r.squared, data=s)
tally( ~r.squared >= real, data=s )

But this will work

tally( ~r.squared >= .311, data=s )
rpruim commented 11 years ago

I think this is taken care of using parent.frame(2) instead of parent.frame() because S4 methods result in nested functions that make the parent environment one step farther away....

> mod = lm( wage ~ sector*sex+ educ + exper, data=CPS85 )
> real = r.squared(mod)
> real
[1] 0.3113079
> s = do(1000)*lm(wage~shuffle(sector)*sex + educ + exper, data=CPS85)
> densityplot(~r.squared, data=s)
> tally( ~r.squared >= real, data=s )

 TRUE FALSE Total 
    0  1000  1000 
nicholasjhorton commented 5 years ago

I would still argue that there's a bug here somewhere:

library(mosaic) x <- c(1, 1, 1) # three of a kind countmatches <- function() { x <- c(1, 1, 2) # pair plus a singleton return(max(tally(~ x))) } countmatches() # should be 2, actually returns 3 sessionInfo() R version 3.5.1 (2018-07-02) Platform: x86_64-apple-darwin15.6.0 (64-bit) Running under: macOS Sierra 10.12.6

Matrix products: default BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages: [1] stats graphics grDevices utils datasets methods base

other attached packages: [1] mosaic_1.4.0 Matrix_1.2-14 mosaicData_0.17.0 ggformula_0.9.0 ggstance_0.3.1 ggplot2_3.0.0 lattice_0.20-35
[8] dplyr_0.7.6

loaded via a namespace (and not attached): [1] Rcpp_0.12.18 pillar_1.3.0 compiler_3.5.1 plyr_1.8.4 bindr_0.1.1 tools_3.5.1 nlme_3.1-137
[8] tibble_1.4.2 gtable_0.2.0 pkgconfig_2.0.2 rlang_0.2.2 rstudioapi_0.7 ggrepel_0.8.0 ggdendro_0.1-20 [15] bindrcpp_0.2.2 gridExtra_2.3 withr_2.1.2 stringr_1.3.1 grid_3.5.1 tidyselect_0.2.4 mosaicCore_0.6.0 [22] glue_1.3.0 R6_2.2.2 purrr_0.2.5 tidyr_0.8.1 magrittr_1.5 backports_1.1.2 scales_1.0.0
[29] MASS_7.3-50 splines_3.5.1 assertthat_0.2.0 colorspace_1.3-2 stringi_1.2.4 lazyeval_0.2.1 munsell_0.5.0
[36] broom_0.5.0 crayon_1.3.4

rpruim commented 5 years ago

Wow, this has been lying dormant awhile.

This isn't really a bug -- it is doing what it is documented to do. But perhaps we want to do it differently.

When data is not provided, we could use the environment of the formula. That can still cause problems (if the formula is saved in one environment and used in another) but it provides a way to get a reasonable default and to override it. Or maybe we can leverage something from rlang (which didn't exist when this issue was opened).

Have to give this some more thought.

library(mosaic)
x <- c(1, 1, 1) # three of a kind
countmatches <- function() {
  x <- c(1, 1, 2) # pair plus a singleton
  return(
    list(
      as.list(environment(~x))$x,
      as.list(parent.frame())$x
    )
  )
  # never get here.
  return(max(tally(~ x, data = parent.frame())))
}
countmatches() 
#> [[1]]
#> [1] 1 1 2
#> 
#> [[2]]
#> [1] 1 1 1

Created on 2018-09-17 by the reprex package (v0.2.0).

rpruim commented 5 years ago

I'd also have to look to see if we are making any use of S4 classes here. I have converted some old S4 stuff to S3 when that was possible.

rpruim commented 5 years ago

Hmm. It looks like there may be a bug of a different sort. It looks like this has been converted from S4 to S3, but in a couple places parent.frame(2) didn't get converted back to parent.frame(). Making that change seems to fix @nicholasjhorton 's immediate problem. The larger question of the best default value for data remains.

library(mosaic)
x <- c(1, 1, 1) # three of a kind
countmatches <- function() {
  x <- c(1, 1, 2) # pair plus a singleton
  return(max(tally(~ x)))
}
countmatches() 
#> [1] 2

Created on 2018-09-17 by the reprex package (v0.2.0).

rpruim commented 5 years ago

Note: this function is now in mosaicCore, not mosaic.