Mosaic plots are very useful in statistics for illustrating the results of chi-squared tests: the residual values show how strongly two categories are associated.
Here is an example with the base R mosaicplot function:
library(tidyverse)
library(ggmosaic)
mosaicplot(happy$marital ~ happy$sex, shade = TRUE, main = "Title")
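For reference, the residuals that the shading is based on can be computed directly. This is a minimal sketch, assuming that `shade = TRUE` uses Pearson residuals from the independence model (the default `type = "pearson"` in `mosaicplot`), which for a two-way table should match the residuals returned by `chisq.test`. The names `tab`, `test`, `pearson` and `manual` are only illustrative.

```r
library(ggmosaic)  # loaded only for the `happy` data set used above

# Cross-tabulate the two factors; table() drops NA values by default.
tab <- table(marital = happy$marital, sex = happy$sex)

# Chi-squared test of independence between marital status and sex.
test <- chisq.test(tab)

# Pearson residuals, one per cell: (observed - expected) / sqrt(expected).
pearson <- test$residuals

# Same quantity computed by hand, to make the definition explicit.
manual <- (test$observed - test$expected) / sqrt(test$expected)
stopifnot(isTRUE(all.equal(as.vector(pearson), as.vector(manual))))

round(pearson, 2)
```

Cells with a large positive residual are over-represented compared with independence, and cells with a large negative residual are under-represented.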
I tried to do the same with ggmosaic, but I cannot find a way to do it.
The main issue is that I cannot map the "fill" aesthetic to a variable other than the x or y variable without getting strange output. Below is an example with the categorical variable "var_fill".
library(tidyverse)
library(ggmosaic)
happy2 <- happy %>%
  group_by(sex, marital) %>%
  summarise(count = n()) %>%
  ungroup() %>%
  mutate(var_fill = LETTERS[sample.int(3, 12, replace = TRUE)])
> head(happy2)
# A tibble: 6 x 4
     sex       marital count var_fill
  <fctr>        <fctr> <int>    <chr>
1   male       married 13343        B
2   male never married  5204        A
3   male      divorced  2354        C
4   male       widowed   904        A
5   male     separated   627        B
6   male          <NA>     7        C
ggplot(data = happy2) +
  geom_mosaic(aes(weight = count, x = product(sex, marital), fill = sex)) +
  scale_y_productlist()
Using another "fill" variable gives a strange output: each box contains many colors instead of a single fill.
It would be great if someone could help me find out how to modify the "fill" legend so that I can use another variable related to the size of the residuals. Even better would be a way to plot, for each box (var_x * var_y), the value of the residual or of the chi-squared test (a numeric variable that could be modified with scale_fill_continuous).
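To make the request concrete, here is a rough sketch of the kind of plot I mean, built with plain ggplot2 rather than geom_mosaic. It is a workaround under my own assumptions, not a ggmosaic feature: the rectangle coordinates are computed by hand, and the names `cells`, `col_tot`, `residual` and `p` are only illustrative.

```r
library(ggplot2)
library(ggmosaic)  # loaded only for the `happy` data set

# Contingency table and Pearson residuals (table() drops NA values).
tab <- table(marital = happy$marital, sex = happy$sex)
res <- chisq.test(tab)$residuals

# Long format: one row per (marital, sex) cell, in the same order as the table.
cells <- as.data.frame(tab, responseName = "count")
cells$residual <- as.vector(res)

# Column widths: share of each marital status among all respondents.
col_tot <- rowSums(tab)
xmax_by <- cumsum(col_tot) / sum(col_tot)
xmin_by <- xmax_by - col_tot / sum(col_tot)
key <- as.character(cells$marital)
cells$xmin <- unname(xmin_by[key])
cells$xmax <- unname(xmax_by[key])

# Box heights: share of each sex within a marital status, stacked from 0 to 1.
cells$prop <- cells$count / unname(col_tot[key])
cells$ymax <- ave(cells$prop, cells$marital, FUN = cumsum)
cells$ymin <- cells$ymax - cells$prop

# One rectangle per cell, filled by its residual on a diverging scale.
p <- ggplot(cells) +
  geom_rect(aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax,
                fill = residual), colour = "white") +
  scale_fill_gradient2(low = "red", mid = "grey90", high = "blue",
                       midpoint = 0, name = "Pearson\nresidual") +
  labs(x = "marital (width = share of respondents)",
       y = "sex (share within marital status)")
print(p)
```

This draws one box per combination and uses a continuous fill, but it loses the axis labelling that ggmosaic provides, which is why I would prefer a solution inside geom_mosaic.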
Hello,
Thank you for this ggplot2 extension!
Thank you for your help!