ProjectMOSAIC / ggformula

Provides a formula interface to 'ggplot2' graphics.

gf_step(..., stat="ecdf") requires a dummy value on LHS of formula #127

Closed jonjhitchcock closed 5 years ago

jonjhitchcock commented 5 years ago

The following script does what I want it to, but as a workaround I have included a dummy value (999) on the left-hand side of the formula. Without the dummy value, it fails with "Error: Invalid formula type for gf_step." Two questions: Can a different type of formula be used? Is there a plan to create a function gf_ecdf based on stat_ecdf? [My apologies if this is not the correct way to raise an issue]

# Originally based on Schmuller (2017) pages 136 and 137 
library(MASS)
library(mosaic)
price.q <- quantile( ~ Price, data=Cars93)
gf_step(999 ~ Price, data=Cars93, stat="ecdf") %>%
  gf_labs(x="Price / $1,000", y="Cumulative Proportion") %>%
  gf_hline(yintercept=c(0, 1), linetype="dashed") %>%
  gf_vline(xintercept=price.q, linetype="dashed") %>%
  gf_refine(scale_x_continuous(breaks=price.q, labels=price.q))

nicholasjhorton commented 5 years ago

@rpruim just FYI when I run this code I get an Error:

Error: Please use data_frame() or new_data_frame() instead of data.frame() for better performance. See the vignette "ggplot2 internal programming guidelines" for details.

sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.5

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
 [1] MASS_7.3-51.4 RCurl_1.95-4.12 bitops_1.0-6 shiny_1.3.2 mosaic_1.5.0.9001
 [6] Matrix_1.2-17 mosaicData_0.17.0 ggformula_0.9.1 ggstance_0.3.1 ggplot2_3.2.0
[11] lattice_0.20-38 dplyr_0.8.1 mdsr_0.1.7.9000

loaded via a namespace (and not attached):
 [1] tidyselect_0.2.5 purrr_0.3.2 splines_3.6.0 sourcetools_0.1.7 vctrs_0.1.0
 [6] colorspace_1.4-1 generics_0.0.2 htmltools_0.3.6 yaml_2.2.0 utf8_1.1.4
[11] rlang_0.3.4 pillar_1.4.1 later_0.8.0 glue_1.3.1.9000 withr_2.1.2
[16] mosaicCore_0.6.0 stringr_1.4.0 munsell_0.5.0 gtable_0.3.0 htmlwidgets_1.3
[21] httpuv_1.5.1 crosstalk_1.0.0 fansi_0.4.0 babynames_1.0.0 broom_0.5.2
[26] Rcpp_1.0.1 xtable_1.8-4 readr_1.3.1 scales_1.0.0 backports_1.1.4
[31] promises_1.0.1 jsonlite_1.6 leaflet_2.0.2 mime_0.7 gridExtra_2.3
[36] hms_0.4.2 digest_0.6.19 stringi_1.4.3 ggrepel_0.8.1 grid_3.6.0
[41] cli_1.1.0 tools_3.6.0 magrittr_1.5 lazyeval_0.2.2 tibble_2.1.3
[46] ggdendro_0.1-20 zeallot_0.1.0 crayon_1.3.4 tidyr_0.8.3 pkgconfig_2.0.2
[51] rsconnect_0.8.13 assertthat_0.2.1 rstudioapi_0.10 R6_2.4.0 nlme_3.1-140
[56] compiler_3.6.0

rpruim commented 5 years ago

@nicholasjhorton, the error is not directly related to this issue (and has already been fixed in the GitHub version). See https://github.com/tidyverse/ggplot2/issues/3377 for the cause.

rpruim commented 5 years ago

@jonjhitchcock I'll have to give this a bit more thought. It does not typically make sense to use gf_step() without y. Your example only works because the stat computes a y vector for you.

I'll probably do a 2-step solution:

1) It is easy to create gf_ecdf(). That also avoids needing to call the stat.
2) More interesting is to see if there is a way to populate parts of what the formula describes after the stat has done its job. If that is possible, then your example would work without the placeholder in the formula.
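
To illustrate why the placeholder works at all, here is a minimal sketch in plain ggplot2 (not ggformula) showing that the ECDF stat needs only x and computes y itself. The names `p` and `ecdf_data` are just for illustration, and the Cars93 data from the original post is assumed.

```r
library(ggplot2)
library(MASS)  # provides the Cars93 data used in the original post

# Only x is mapped; stat_ecdf() computes the cumulative proportion for y.
p <- ggplot(Cars93, aes(x = Price)) +
  stat_ecdf(geom = "step")

# Inspect the data the stat produced for the layer, including the computed y.
ecdf_data <- layer_data(p)
head(ecdf_data[, c("x", "y")])
```

Because y here comes from the stat rather than from the data, any value supplied on the left-hand side of the gf_step() formula only serves to satisfy the formula check.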

rpruim commented 5 years ago

I've added gf_ecdf() and gf_ellipse().

suppressPackageStartupMessages(library(ggformula))
theme_set(theme_bw())
example(gf_ecdf, ask = FALSE)
#> 
#> gf_cdf> Data <- data.frame(
#> gf_cdf+   x = c(rnorm(100, 0, 1), rnorm(100, 0, 3), rt(100, df = 3)),
#> gf_cdf+   g = gl(3, 100, labels = c("N(0, 1)", "N(0, 3)", "T(df = 3)") )
#> gf_cdf+ )
#> 
#> gf_cdf> gf_ecdf( ~ x, data = Data)

#> 
#> gf_cdf> # Don't go to positive/negative infinity
#> gf_cdf> gf_ecdf( ~ x, data = Data, pad = FALSE)

#> 
#> gf_cdf> # Multiple ECDFs
#> gf_cdf> gf_ecdf( ~ x, data = Data, color = ~ g)

example(gf_ellipse, ask = FALSE)
#> 
#> gf_llp> gf_ellipse()
#> gf_ellipse() uses 
#>     * a formula with shape y ~ x. 
#>     * geom:  path 
#>     * stat:  ellipse 
#>     * key attributes:  alpha, color, group, type = "t", level = 0.95,
#>                    segments = 51
#> 
#> For more information, try ?gf_ellipse
#> 
#> gf_llp> gf_point(eruptions ~ waiting, data = faithful) %>%
#> gf_llp+   gf_ellipse(alpha = 0.5)

#> 
#> gf_llp> gf_point(eruptions ~ waiting, data = faithful, color = ~ (eruptions > 3)) %>%
#> gf_llp+   gf_ellipse(alpha = 0.5)

#> 
#> gf_llp> gf_point(eruptions ~ waiting, data = faithful, color = ~ (eruptions > 3)) %>%
#> gf_llp+   gf_ellipse(type = "norm", linetype = ~ "norm") %>%
#> gf_llp+   gf_ellipse(type = "t",    linetype = ~ "t")

#> 
#> gf_llp> gf_point(eruptions ~ waiting, data = faithful, color = ~ (eruptions > 3)) %>%
#> gf_llp+   gf_ellipse(type = "norm",   linetype = ~ "norm") %>%
#> gf_llp+   gf_ellipse(type = "euclid", linetype = ~ "euclid", level = 3) %>%
#> gf_llp+   gf_refine(coord_fixed())

#> 
#> gf_llp> # Use geom = "polygon" to enable fill
#> gf_llp> gf_point(eruptions ~ waiting, data = faithful, fill = ~ (eruptions > 3)) %>%
#> gf_llp+   gf_ellipse(geom = "polygon", alpha = 0.3, color = "black")

#> 
#> gf_llp> gf_point(eruptions ~ waiting, data = faithful, fill = ~ (eruptions > 3)) %>%
#> gf_llp+   gf_ellipse(geom = "polygon", alpha = 0.3) %>%
#> gf_llp+   gf_ellipse(alpha = 0.3, color = "black")

#> 
#> gf_llp> gf_ellipse(eruptions ~ waiting, data = faithful, show.legend = FALSE,
#> gf_llp+   alpha = 0.3, fill = ~ (eruptions > 3), geom = "polygon") %>%
#> gf_llp+   gf_ellipse(level = 0.68, geom = "polygon", alpha = 0.3) %>%
#> gf_llp+   gf_point(data = faithful, color = ~ (eruptions > 3), show.legend = FALSE)

Created on 2019-06-25 by the reprex package (v0.3.0)
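
With gf_ecdf() available, the script from the original post should no longer need the 999 placeholder. The following is a sketch rather than a tested fix: it assumes an installed ggformula version that includes gf_ecdf(), and that gf_ecdf() accepts a one-sided formula as in the examples above. The name `ecdf_plot` is only for illustration.

```r
library(MASS)    # Cars93 data
library(mosaic)  # attaches ggformula and the formula interface to quantile()

# Quartiles of Price, used for the vertical reference lines and axis breaks.
price.q <- quantile(~ Price, data = Cars93)

# Same plot as the original script, with a one-sided formula and no stat argument.
ecdf_plot <- gf_ecdf(~ Price, data = Cars93) %>%
  gf_labs(x = "Price / $1,000", y = "Cumulative Proportion") %>%
  gf_hline(yintercept = c(0, 1), linetype = "dashed") %>%
  gf_vline(xintercept = price.q, linetype = "dashed") %>%
  gf_refine(scale_x_continuous(breaks = price.q, labels = price.q))

ecdf_plot
```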