tidyverse / ggplot2

An implementation of the Grammar of Graphics in R
https://ggplot2.tidyverse.org

Strange behavior of "group" aesthetic in geom_dotplot #3839

Closed gael-millot closed 4 years ago

gael-millot commented 4 years ago

Is this behavior expected?

# conversion of the vs and am columns of mtcars data frame into factors
mtcars$vs <- factor(mtcars$vs, labels = c("A", "B"))
mtcars$am <- factor(mtcars$am, labels = c("G", "H"))

# test of geom_dotplot using vs, mpg and am columns of mtcars
ggplot2::ggplot()+ggplot2::geom_dotplot(
    data = mtcars, 
    mapping = ggplot2::aes(x = vs, y = mpg, group = am), 
    position = ggplot2::position_dodge(), 
    binaxis = "y", 
    stackdir = "center",
    show.legend = TRUE
)

[Image 1]

I ask because other aesthetics give the expected result, for instance when group is replaced by fill:

ggplot2::ggplot()+ggplot2::geom_dotplot(
    data = mtcars, 
    mapping = ggplot2::aes(x = vs, y = mpg, fill = am),
    position = ggplot2::position_dodge(), 
    binaxis = "y", 
    stackdir = "center",
    show.legend = TRUE
)

[Image 2]

With the group aesthetic the legend does not appear, as expected, but the dot positions look wrong. It would be great if I could use group and still get the correct dot positioning.

Thanks for your help!

Best.

gael-millot commented 4 years ago

Of note, group works fine when am is replaced by vs, so that a single categorical variable is plotted instead of two:

ggplot2::ggplot()+ggplot2::geom_dotplot(
    data = mtcars, 
    mapping = ggplot2::aes(x = vs, y = mpg, group = vs), 
    position = ggplot2::position_dodge(), 
    binaxis = "y", 
    stackdir = "center", 
    show.legend = TRUE
)

[Image 3]

It is as if, with two categorical variables, group takes the "average" position of A and B and then separates correctly between G and H.

yutannihilation commented 4 years ago

Is this behavior expected?

Yes, this is expected behaviour. My previous comment might help explain how the grouping works.

https://github.com/tidyverse/ggplot2/issues/3486#issuecomment-522210017
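For readers without the linked comment at hand: ggplot2's aes_group_order documentation states that, when group is not mapped, it defaults to the interaction of all discrete variables in the plot, and an explicit group mapping replaces that default. The sketch below is one way to check this on the data from this issue. The names df and x_per_group are made up for the illustration, and geom_point is used only because its layer data is easy to inspect.

```r
# Work on a copy so the built-in mtcars is left untouched
df <- mtcars
df$vs <- factor(df$vs, labels = c("A", "B"))
df$am <- factor(df$am, labels = c("G", "H"))

# Hypothetical helper: the largest number of distinct x positions
# covered by any single group that ggplot2 builds for a mapping
x_per_group <- function(mapping) {
  p <- ggplot2::ggplot(df, mapping) + ggplot2::geom_point()
  ld <- ggplot2::layer_data(p)
  max(tapply(as.numeric(ld$x), ld$group, function(v) length(unique(v))))
}

# No explicit group: groups follow the discrete x variable
x_per_group(ggplot2::aes(x = vs, y = mpg))                                 # 1
# Explicit group = am: each group now spans both A and B
x_per_group(ggplot2::aes(x = vs, y = mpg, group = am))                     # 2
# fill = am: default grouping is the interaction of vs and am
x_per_group(ggplot2::aes(x = vs, y = mpg, fill = am))                      # 1
# Explicit group built from both variables
x_per_group(ggplot2::aes(x = vs, y = mpg, group = interaction(vs, am)))    # 1
```

With group = am each group covers both A and B, which is consistent with the stacks in the first plot being drawn between the two x positions. With fill = am the default grouping already separates the four vs/am combinations.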

gael-millot commented 4 years ago

Many thanks for the answer! It indeed indicates that the best approach is to create a new categorical column:

# nothing changed in the code, compared to the 1st example, except the group aesthetic
ggplot2::ggplot()+ggplot2::geom_dotplot(
    data = mtcars, 
    mapping = ggplot2::aes(x = vs, y = mpg, group = paste(vs, am, sep = ".")), 
    position = ggplot2::position_dodge(), 
    binaxis = "y", 
    stackdir = "center",
    show.legend = TRUE
)

[Image 4]
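A variant of the same workaround, sketched with base R's interaction() instead of paste(), and stored as an explicit column so it can be inspected before plotting. The names df and vs_am are chosen here for illustration and do not come from the thread.

```r
# Work on a copy so the built-in mtcars is left untouched
df <- mtcars
df$vs <- factor(df$vs, labels = c("A", "B"))
df$am <- factor(df$am, labels = c("G", "H"))

# One level per vs/am combination; sep = "." matches the paste() call above
df$vs_am <- interaction(df$vs, df$am, sep = ".")
levels(df$vs_am)  # "A.G" "B.G" "A.H" "B.H"

# Same plot as above, with the combined column as the group
p <- ggplot2::ggplot() + ggplot2::geom_dotplot(
    data = df,
    mapping = ggplot2::aes(x = vs, y = mpg, group = vs_am),
    position = ggplot2::position_dodge(),
    binaxis = "y",
    stackdir = "center"
)
```

Unlike paste(), interaction() returns a factor, so the order of the groups follows the factor levels rather than the alphabetical order of the pasted strings.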

It is also worth noting that group behaves differently depending on the geom used:

# test of geom_point using vs, mpg and am columns of mtcars
ggplot2::ggplot()+ggplot2::geom_point(
    data = mtcars, 
    mapping = ggplot2::aes(x = vs, y = mpg, group = am)
)

[Image 1]

# test of geom_dotplot in the same situation
ggplot2::ggplot()+ggplot2::geom_dotplot(
    data = mtcars, 
    mapping = ggplot2::aes(x = vs, y = mpg, group = am),
    binaxis = "y"
)

[Image 2]
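One way to compare the two geoms without relying on the rendered images is to look at the x positions each layer computes, using ggplot2::layer_data(). This is only a sketch: df, p_point, p_dot, point_x and dot_x are names chosen for the illustration, and binwidth = 1 is an arbitrary value added only to avoid the default bin width message.

```r
# Work on a copy so the built-in mtcars is left untouched
df <- mtcars
df$vs <- factor(df$vs, labels = c("A", "B"))
df$am <- factor(df$am, labels = c("G", "H"))

# Same mapping for both layers: two x categories, group = am
mapping <- ggplot2::aes(x = vs, y = mpg, group = am)

p_point <- ggplot2::ggplot(df, mapping) + ggplot2::geom_point()
p_dot <- ggplot2::ggplot(df, mapping) +
    ggplot2::geom_dotplot(binaxis = "y", binwidth = 1)

# Distinct x positions computed by each layer
point_x <- sort(unique(as.numeric(ggplot2::layer_data(p_point)$x)))
dot_x <- sort(unique(as.numeric(ggplot2::layer_data(p_dot)$x)))

print(point_x)  # the two discrete positions: 1 (A) and 2 (B)
print(dot_x)
```

The dotplot output is deliberately left unannotated. If its x values fall between the two discrete positions, that matches the "average position" impression described earlier in the thread: geom_point draws each row at its own x, whereas the dotplot stacks each group as a whole.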

Thus, it might be worth adding a section on the particularities of group to the ggplot2 documentation, for instance here: https://ggplot2.tidyverse.org/reference/aes_group_order.html

Thanks!

yutannihilation commented 4 years ago

Ah, it might be a bit clearer if the aes_group_order doc showed an example where both group and other discrete variables are mapped. Feel free to contribute :)