haleyjeppson / ggmosaic

Mosaicplots in the ggplot2 framework
https://haleyjeppson.github.io/ggmosaic/

Categorical labels on y-axis #4

Closed andrewheiss closed 2 weeks ago

andrewheiss commented 7 years ago

I'm so glad you're revamping productplots!

In the original productplots::prodplot function, the y-axis scale was categorical, not continuous, with breaks for each of the categories:

prodplot(data=happy, ~ sex + marital)

image

With ggmosaic::geom_mosaic, the y-axis is now continuous, ranging from 0–1. The only way to distinguish between categories along the y axis is to use a fill aesthetic, since there are no y-axis labels:

ggplot(data=happy) + geom_mosaic(aes(x=product(sex, marital), fill=sex))

image

It would be nice to be able to use category labels on the y-axis in place of the continuous scale. Is there an option in geom_mosaic() that I'm missing to enable this, or is there some way of manipulating scale_y_discrete() to do this? Or is there a way to incorporate this feature from prodplot?

Thanks!

andrewheiss commented 7 years ago

Using scale_y_product solves this most of the way, but it requires that breaks and labels be manually defined:

ggplot(data=happy) + 
  geom_mosaic(aes(x=product(sex, marital), fill=sex)) + 
  scale_y_product(breaks=c(0.25, 0.75), labels=c("Male", "Female"))

image

Here I arbitrarily chose 0.25 and 0.75. Is there a way to automatically determine those breaks and labels?
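One rough way to derive such breaks is to take the midpoint of each category's band from its marginal proportion. The sketch below uses base R only and a hypothetical toy vector in place of happy$sex. It is an approximation under stated assumptions: a mosaic plot splits the y-axis conditionally within each x column and adds spacing between tiles, so marginal midpoints will not line up exactly with every tile.

```r
# Hypothetical toy data standing in for the `sex` column of `happy`.
sex <- factor(c("male", "female", "female", "male", "female"),
              levels = c("male", "female"))

# Marginal proportion of each category, in level order.
props <- as.numeric(prop.table(table(sex)))

# Upper edge of each category's band on a 0-1 axis, then its midpoint.
upper <- cumsum(props)
breaks <- upper - props / 2
labels <- levels(sex)

data.frame(labels, breaks)
```

The resulting breaks and labels vectors could then be passed to the breaks and labels arguments in place of the hand-picked 0.25 and 0.75.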

krlmlr commented 7 years ago

@andrewheiss: Came here because I was curious about categorical y-axes too, because this is what I'd naively expect in a mosaic plot. However, your code doesn't work anymore on a fresh install of ggmosaic from CRAN.

@haleyjeppson: Where can I read more about the rationale for a continuous y-scale showing proportions only? Thanks!

andrewheiss commented 7 years ago

It works with the latest version installed with devtools::install_github("haleyjeppson/ggmosaic"). It seems that scale_y_product has been replaced with scale_y_productlist, which now handles all the breaking and labeling by itself:

library(tidyverse)
library(ggmosaic)

# Use scale_y_productlist() with default settings
ggplot(data=happy) +
  geom_mosaic(aes(x=product(sex, marital), fill=sex)) +
  scale_y_productlist()

# Add labels manually
ggplot(data=happy) +
  geom_mosaic(aes(x=product(sex, marital), fill=sex)) +
  scale_y_productlist(labels = c("Male", "Female"))

Session info

``` r
devtools::session_info()
#> Session info --------------------------------------------------------------
#>  setting  value
#>  version  R version 3.4.0 (2017-04-21)
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  tz       Etc/UTC
#>  date     2017-06-01
#> Packages ------------------------------------------------------------------
#>  package      * version    date       source
#>  assertthat     0.2.0      2017-04-11 CRAN (R 3.4.0)
#>  backports      1.0.5      2017-01-18 CRAN (R 3.4.0)
#>  bitops         1.0-6      2013-08-17 CRAN (R 3.4.0)
#>  broom          0.4.2      2017-02-13 CRAN (R 3.4.0)
#>  cellranger     1.1.0      2016-07-27 CRAN (R 3.4.0)
#>  colorspace     1.3-2      2016-12-14 CRAN (R 3.4.0)
#>  data.table     1.10.4     2017-02-01 CRAN (R 3.4.0)
#>  DBI            0.6-1      2017-04-01 CRAN (R 3.4.0)
#>  devtools       1.12.0     2016-12-05 CRAN (R 3.4.0)
#>  digest         0.6.12     2017-01-27 CRAN (R 3.4.0)
#>  dplyr        * 0.5.0      2016-06-24 CRAN (R 3.4.0)
#>  evaluate       0.10       2016-10-11 CRAN (R 3.4.0)
#>  forcats        0.2.0      2017-01-23 CRAN (R 3.4.0)
#>  foreign        0.8-68     2017-04-24 CRAN (R 3.4.0)
#>  ggmosaic     * 0.1.2.9000 2017-06-01 Github (haleyjeppson/ggmosaic@cf32577)
#>  ggplot2      * 2.2.1.9000 2017-06-01 Github (tidyverse/ggplot2@eedaa81)
#>  gtable         0.2.0      2016-02-26 CRAN (R 3.4.0)
#>  haven          1.0.0      2016-09-23 CRAN (R 3.4.0)
#>  hms            0.3        2016-11-22 CRAN (R 3.4.0)
#>  htmltools      0.3.5      2016-03-21 CRAN (R 3.4.0)
#>  htmlwidgets    0.8        2016-11-09 CRAN (R 3.4.0)
#>  httr           1.2.1      2016-07-03 CRAN (R 3.4.0)
#>  jsonlite       1.4        2017-04-08 CRAN (R 3.4.0)
#>  knitr          1.15.1     2016-11-22 CRAN (R 3.4.0)
#>  lattice        0.20-35    2017-03-25 CRAN (R 3.4.0)
#>  lazyeval       0.2.0      2016-06-12 CRAN (R 3.4.0)
#>  lubridate      1.6.0      2016-09-13 CRAN (R 3.4.0)
#>  magrittr       1.5        2014-11-22 CRAN (R 3.4.0)
#>  memoise        1.1.0      2017-04-21 CRAN (R 3.4.0)
#>  mnormt         1.5-5      2016-10-15 CRAN (R 3.4.0)
#>  modelr         0.1.0      2016-08-31 CRAN (R 3.4.0)
#>  munsell        0.4.3      2016-02-13 CRAN (R 3.4.0)
#>  nlme           3.1-131    2017-02-06 CRAN (R 3.4.0)
#>  plotly         4.7.0      2017-05-28 CRAN (R 3.4.0)
#>  plyr           1.8.4      2016-06-08 CRAN (R 3.4.0)
#>  productplots * 0.1.1      2016-07-02 CRAN (R 3.4.0)
#>  psych          1.7.3.21   2017-03-22 CRAN (R 3.4.0)
#>  purrr        * 0.2.2.2    2017-05-11 cran (@0.2.2.2)
#>  R6             2.2.0      2016-10-05 CRAN (R 3.4.0)
#>  Rcpp           0.12.11    2017-05-22 cran (@0.12.11)
#>  RCurl          1.95-4.8   2016-03-01 CRAN (R 3.4.0)
#>  readr        * 1.1.0      2017-03-22 CRAN (R 3.4.0)
#>  readxl         1.0.0      2017-04-18 CRAN (R 3.4.0)
#>  reshape2       1.4.2      2016-10-22 CRAN (R 3.4.0)
#>  rlang          0.1.1      2017-05-18 cran (@0.1.1)
#>  rmarkdown      1.5        2017-04-26 CRAN (R 3.4.0)
#>  rprojroot      1.2        2017-01-16 CRAN (R 3.4.0)
#>  rvest          0.3.2      2016-06-17 CRAN (R 3.4.0)
#>  scales         0.4.1      2016-11-09 CRAN (R 3.4.0)
#>  stringi        1.1.5      2017-04-07 CRAN (R 3.4.0)
#>  stringr        1.2.0      2017-02-18 CRAN (R 3.4.0)
#>  tibble       * 1.3.3      2017-05-28 cran (@1.3.3)
#>  tidyr        * 0.6.3      2017-05-15 cran (@0.6.3)
#>  tidyverse    * 1.1.1      2017-01-27 CRAN (R 3.4.0)
#>  viridisLite    0.2.0      2017-03-24 CRAN (R 3.4.0)
#>  withr          1.0.2      2016-06-20 CRAN (R 3.4.0)
#>  XML            3.98-1.7   2017-05-03 CRAN (R 3.4.0)
#>  xml2           1.1.1      2017-01-24 CRAN (R 3.4.0)
#>  yaml           2.1.14     2016-11-12 CRAN (R 3.4.0)
```

EA-Williams commented 7 years ago

Thanks @andrewheiss , just what I was looking for.

Is there a way to reverse the order of the categories on the y axis, i.e. male then female? Changing the order in the data file didn't work and scale_y_productlist(trans = "reverse") is not supported.

EA-Williams commented 7 years ago

Answer to my question:

I renamed the categories of my y-variable to be numerical, with the category I wanted at the top as the highest number. In the example above, if I wanted 'male' to be above 'female', I would replace 'male' with '2' and 'female' with '1' (or create a dummy variable with these codes in).

andrewheiss commented 7 years ago

The best way to handle category ordering is to use an ordered factor. As with ggplot's other geoms, if a categorical variable is not explicitly ordered, the order is determined alphabetically. ggmosaic also seems to plot category orders in reverse, so you have to reverse the factor levels too.

library(tidyverse)
library(ggmosaic)

happy <- happy %>%
  mutate(sex = factor(sex, levels=c("female", "male"), ordered=TRUE))

# Verify ordering
str(happy$sex)
#>  Ord.factor w/ 2 levels "female"<"male": 1 2 1 1 1 2 2 2 1 1 ...

# Use scale_y_productlist() with default settings
ggplot(data=happy) +
  geom_mosaic(aes(x=product(sex, marital), fill=sex)) +
  scale_y_productlist()

download
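The level reversal mentioned above does not require typing the levels by hand. As a minimal sketch in base R, assuming a hypothetical toy vector in place of happy$sex, rev(levels(...)) flips whatever order the factor currently has:

```r
# Hypothetical toy vector standing in for happy$sex.
sex <- c("male", "female", "female", "male")

# Default factor: levels are sorted alphabetically.
sex_default <- factor(sex)

# Reverse the level order without listing the levels manually.
sex_reversed <- factor(sex, levels = rev(levels(sex_default)))

levels(sex_default)
levels(sex_reversed)
```

Whether the reversed order puts a given category at the top or the bottom of the mosaic depends on how the installed ggmosaic version draws its levels, so it is worth checking the plot after reordering.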

EA-Williams commented 7 years ago

I think we arrived at the same conclusion at the same time!

Do you know if there's a way to emulate expand = c(0, 0), to remove the padding between the mosaic and the labels?

I currently have this:

image

andrewheiss commented 7 years ago

I've been tinkering with this and I couldn't find a way. geom_mosaic doesn't seem to do anything with the expand argument.

EA-Williams commented 7 years ago

Thanks for your tinkering. That's a shame! I might have to use the old mosaic() function after all. Should I add this as a new issue?

econandrew commented 6 years ago

I added this issue and included a workaround using coord_cartesian in #16.
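For readers unfamiliar with the function, here is a minimal sketch of how coord_cartesian() can drop axis padding in plain ggplot2. It is not taken from #16 and has not been checked against geom_mosaic; the tiles data frame is hypothetical and only mimics mosaic rectangles on the unit square.

```r
library(ggplot2)

# Hypothetical tiles on the unit square, standing in for mosaic rectangles.
tiles <- data.frame(
  xmin = c(0, 0, 0.6, 0.6),
  xmax = c(0.6, 0.6, 1, 1),
  ymin = c(0, 0.5, 0, 0.3),
  ymax = c(0.5, 1, 0.3, 1),
  sex  = c("female", "male", "female", "male")
)

p <- ggplot(tiles) +
  geom_rect(aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax, fill = sex),
            colour = "white") +
  # expand = FALSE removes the default padding around the data range.
  coord_cartesian(expand = FALSE)

p
```

Because the padding is removed at the coordinate level rather than through the scale, this approach does not depend on the scale honouring expand = c(0, 0).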

heike commented 2 weeks ago

See table for compatibility of ggplot2 and ggmosaic warnings in README