haleyjeppson / ggmosaic

Mosaicplots in the ggplot2 framework

Categorical labels on y-axis #4

Closed andrewheiss closed 2 weeks ago

andrewheiss commented 7 years ago

I'm so glad you're revamping productplots!

In the original productplots::prodplot function, the y-axis scale was categorical, not continuous, with breaks for each of the categories:

prodplot(data=happy, ~ sex + marital)


With ggmosaic::geom_mosaic, the y-axis is now continuous, ranging from 0–1. The only way to distinguish between categories along the y axis is to use a fill aesthetic, since there are no y-axis labels:

ggplot(data=happy) + geom_mosaic(aes(x=product(sex, marital), fill=sex))


It would be nice to be able to use category labels on the y-axis in place of the continuous scale. Is there an option in geom_mosaic() that I'm missing to enable this, or is there some way of manipulating scale_y_discrete() to do this? Or is there a way to incorporate this feature from prodplot?


andrewheiss commented 7 years ago

Using scale_y_product solves this most of the way, but it requires that breaks and labels be manually defined:

ggplot(data=happy) + 
  geom_mosaic(aes(x=product(sex, marital), fill=sex)) + 
  scale_y_product(breaks=c(0.25, 0.75), labels=c("Male", "Female"))


Here I arbitrarily chose 0.25 and 0.75. Is there a way to automatically determine those breaks and labels?
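One possible way to derive such breaks automatically, sketched in base R. It assumes the y-axis bands follow the marginal proportions of the variable and that any spacing between tiles can be ignored; each break is then the midpoint of its category's cumulative share. The `sex` vector below is made-up illustration data, not the `happy` data set.

```r
# Made-up illustration data (hypothetical, not the happy data set)
sex <- factor(c("male", "female", "female", "male", "female"))

counts <- table(sex)                        # counts per level, in level order
prop   <- as.numeric(counts) / sum(counts)  # share of each level

# Midpoint of each category's band on a 0-1 axis
breaks <- cumsum(prop) - prop / 2
labels <- names(counts)

print(data.frame(labels, breaks))
```

The resulting `breaks` and `labels` could stand in for the hand-picked 0.25 and 0.75 above. In an actual mosaic the split heights differ from column to column, so marginal midpoints are only an approximation.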

krlmlr commented 7 years ago

@andrewheiss: Came here because I was curious about categorical y-axes too, because this is what I'd naively expect in a mosaic plot. However, your code doesn't work anymore on a fresh install of ggmosaic from CRAN.

@haleyjeppson: Where can I read more about the rationale for a continuous y-scale showing proportions only? Thanks!

andrewheiss commented 7 years ago

It works with the latest version installed with devtools::install_github("haleyjeppson/ggmosaic"). It seems that scale_y_product has been replaced with scale_y_productlist, which now handles all the breaking and labeling by itself:


# Use scale_y_productlist() with default settings
ggplot(data=happy) +
  geom_mosaic(aes(x=product(sex, marital), fill=sex)) +
  scale_y_productlist()

# Add labels manually
ggplot(data=happy) +
  geom_mosaic(aes(x=product(sex, marital), fill=sex)) +
  scale_y_productlist(labels = c("Male", "Female"))

Session info

``` r
devtools::session_info()
#> Session info --------------------------------------------------------------
#>  setting  value
#>  version  R version 3.4.0 (2017-04-21)
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  tz       Etc/UTC
#>  date     2017-06-01
#> Packages ------------------------------------------------------------------
#>  package      * version  date       source
#>  assertthat     0.2.0    2017-04-11 CRAN (R 3.4.0)
#>  backports      1.0.5    2017-01-18 CRAN (R 3.4.0)
#>  bitops         1.0-6    2013-08-17 CRAN (R 3.4.0)
#>  broom          0.4.2    2017-02-13 CRAN (R 3.4.0)
#>  cellranger     1.1.0    2016-07-27 CRAN (R 3.4.0)
#>  colorspace     1.3-2    2016-12-14 CRAN (R 3.4.0)
#>  data.table     1.10.4   2017-02-01 CRAN (R 3.4.0)
#>  DBI            0.6-1    2017-04-01 CRAN (R 3.4.0)
#>  devtools       1.12.0   2016-12-05 CRAN (R 3.4.0)
#>  digest         0.6.12   2017-01-27 CRAN (R 3.4.0)
#>  dplyr        * 0.5.0    2016-06-24 CRAN (R 3.4.0)
#>  evaluate       0.10     2016-10-11 CRAN (R 3.4.0)
#>  forcats        0.2.0    2017-01-23 CRAN (R 3.4.0)
#>  foreign        0.8-68   2017-04-24 CRAN (R 3.4.0)
#>  ggmosaic     *          2017-06-01 Github (haleyjeppson/ggmosaic@cf32577)
#>  ggplot2      *          2017-06-01 Github (tidyverse/ggplot2@eedaa81)
#>  gtable         0.2.0    2016-02-26 CRAN (R 3.4.0)
#>  haven          1.0.0    2016-09-23 CRAN (R 3.4.0)
#>  hms            0.3      2016-11-22 CRAN (R 3.4.0)
#>  htmltools      0.3.5    2016-03-21 CRAN (R 3.4.0)
#>  htmlwidgets    0.8      2016-11-09 CRAN (R 3.4.0)
#>  httr           1.2.1    2016-07-03 CRAN (R 3.4.0)
#>  jsonlite       1.4      2017-04-08 CRAN (R 3.4.0)
#>  knitr          1.15.1   2016-11-22 CRAN (R 3.4.0)
#>  lattice        0.20-35  2017-03-25 CRAN (R 3.4.0)
#>  lazyeval       0.2.0    2016-06-12 CRAN (R 3.4.0)
#>  lubridate      1.6.0    2016-09-13 CRAN (R 3.4.0)
#>  magrittr       1.5      2014-11-22 CRAN (R 3.4.0)
#>  memoise        1.1.0    2017-04-21 CRAN (R 3.4.0)
#>  mnormt         1.5-5    2016-10-15 CRAN (R 3.4.0)
#>  modelr         0.1.0    2016-08-31 CRAN (R 3.4.0)
#>  munsell        0.4.3    2016-02-13 CRAN (R 3.4.0)
#>  nlme           3.1-131  2017-02-06 CRAN (R 3.4.0)
#>  plotly         4.7.0    2017-05-28 CRAN (R 3.4.0)
#>  plyr           1.8.4    2016-06-08 CRAN (R 3.4.0)
#>  productplots * 0.1.1    2016-07-02 CRAN (R 3.4.0)
#>  psych                   2017-03-22 CRAN (R 3.4.0)
#>  purrr        *          2017-05-11 cran (@
#>  R6             2.2.0    2016-10-05 CRAN (R 3.4.0)
#>  Rcpp           0.12.11  2017-05-22 cran (@0.12.11)
#>  RCurl          1.95-4.8 2016-03-01 CRAN (R 3.4.0)
#>  readr        * 1.1.0    2017-03-22 CRAN (R 3.4.0)
#>  readxl         1.0.0    2017-04-18 CRAN (R 3.4.0)
#>  reshape2       1.4.2    2016-10-22 CRAN (R 3.4.0)
#>  rlang          0.1.1    2017-05-18 cran (@0.1.1)
#>  rmarkdown      1.5      2017-04-26 CRAN (R 3.4.0)
#>  rprojroot      1.2      2017-01-16 CRAN (R 3.4.0)
#>  rvest          0.3.2    2016-06-17 CRAN (R 3.4.0)
#>  scales         0.4.1    2016-11-09 CRAN (R 3.4.0)
#>  stringi        1.1.5    2017-04-07 CRAN (R 3.4.0)
#>  stringr        1.2.0    2017-02-18 CRAN (R 3.4.0)
#>  tibble       * 1.3.3    2017-05-28 cran (@1.3.3)
#>  tidyr        * 0.6.3    2017-05-15 cran (@0.6.3)
#>  tidyverse    * 1.1.1    2017-01-27 CRAN (R 3.4.0)
#>  viridisLite    0.2.0    2017-03-24 CRAN (R 3.4.0)
#>  withr          1.0.2    2016-06-20 CRAN (R 3.4.0)
#>  XML            3.98-1.7 2017-05-03 CRAN (R 3.4.0)
#>  xml2           1.1.1    2017-01-24 CRAN (R 3.4.0)
#>  yaml           2.1.14   2016-11-12 CRAN (R 3.4.0)
```

EA-Williams commented 7 years ago

Thanks @andrewheiss , just what I was looking for.

Is there a way to reverse the order of the categories on the y axis, i.e. male then female? Changing the order in the data file didn't work and scale_y_productlist(trans = "reverse") is not supported.

EA-Williams commented 7 years ago

Answer to my question:

I renamed the categories of my y-variable to be numerical, with the category I wanted at the top as the highest number. In the example above, if I wanted 'male' to be above 'female', I would replace 'male' with '2' and 'female' with '1' (or create a dummy variable with these codes in).

andrewheiss commented 7 years ago

The best way to handle category ordering is to use an ordered factor. As with ggplot's other geoms, if a categorical variable is not explicitly ordered, the order is determined alphabetically. ggmosaic also seems to plot category orders in reverse, so you have to reverse the factor levels too.


library(dplyr)  # provides %>% and mutate()

happy <- happy %>%
  mutate(sex = factor(sex, levels=c("female", "male"), ordered=TRUE))

# Verify ordering
str(happy$sex)
#>  Ord.factor w/ 2 levels "female"<"male": 1 2 1 1 1 2 2 2 1 1 ...

# Use scale_y_productlist() with default settings
ggplot(data=happy) +
  geom_mosaic(aes(x=product(sex, marital), fill=sex)) +
  scale_y_productlist()

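If the goal is only to flip the existing order, the levels do not have to be typed out or recoded as numbers. A minimal base R sketch, using a made-up vector rather than the `happy` data set:

```r
# Made-up illustration data (hypothetical, not the happy data set)
sex <- c("male", "female", "female", "male")

# Default: levels are sorted alphabetically
sex_default <- factor(sex)
levels(sex_default)
#> [1] "female" "male"

# Reverse the level order without renaming or recoding the categories
sex_reversed <- factor(sex, levels = rev(levels(sex_default)))
levels(sex_reversed)
#> [1] "male"   "female"
```

Only the level order changes; the values themselves stay as they were, so labels and legends keep the original category names.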

EA-Williams commented 7 years ago

I think we arrived at the same conclusion at the same time!

Do you know if there's a way to emulate expand = c(0, 0), to remove the padding between the mosaic and the labels?

I currently have this:


andrewheiss commented 7 years ago

I've been tinkering with this and couldn't find a way. geom_mosaic doesn't seem to do anything with the expand argument.

EA-Williams commented 7 years ago

Thanks for your tinkering. That's a shame! I might have to use the old mosaic() function after all. Should I add this as a new issue?

econandrew commented 6 years ago

I added this issue and included a workaround using coord_cartesian in #16.
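For readers who do not want to open #16, the general idea of using coord_cartesian to drop axis padding can be sketched with plain ggplot2. This is not the code from #16, which is not reproduced here; the `tiles` data frame is a hypothetical stand-in for a mosaic, and whether the same call behaves identically with geom_mosaic depends on the installed versions.

```r
library(ggplot2)

# Two hypothetical tiles filling the unit square, standing in for a mosaic
tiles <- data.frame(
  xmin = c(0, 0.6), xmax = c(0.6, 1),
  ymin = c(0, 0),   ymax = c(1, 1),
  group = c("a", "b")
)

p <- ggplot(tiles) +
  geom_rect(aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax,
                fill = group)) +
  coord_cartesian(expand = FALSE)  # drop the default padding around the data

p
```

Setting `expand = FALSE` on the coordinate system removes the padding regardless of which scale is in use, which is why it can serve as a workaround when a scale ignores its own `expand` setting.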

heike commented 2 weeks ago

See table for compatibility of ggplot2 and ggmosaic warnings in README