tidyverse / ggplot2

An implementation of the Grammar of Graphics in R
https://ggplot2.tidyverse.org
Other
6.47k stars 2.02k forks source link

Set `width` as aesthetic in `geom_col()`/`geom_bar()`. #5807

Closed teunbrand closed 4 months ago

teunbrand commented 5 months ago

This PR aims to fix #3142.

Motivation is given here: https://github.com/tidyverse/ggplot2/issues/3142#issuecomment-2019903367. To recap: yes variable width bars are bad, but we allowed them implicitly anyway and are jumping through hoops to clunkily discourage it. This PR removes the hack that recognises width as a parameter, and implements it as an aesthetic instead.

katossky commented 3 months ago

Not sure whether I should comment here or on the issue :

# devtools::install_github("https://github.com/tidyverse/ggplot2")
ggplot2::diamonds |>
    dplyr::summarise(n=dplyr::n(), avgPrice=mean(price), .by=clarity) |>
    ggplot(aes(x=clarity, width=n, y=avgPrice)) +
    geom_col(fill=NA, color='black')

produces :

image

I checked that I have the correct version of geom_bar and indeed :

> geom_bar
function (mapping = NULL, data = NULL, stat = "count", position = "stack", 
    ..., just = 0.5, na.rm = FALSE, orientation = NA, show.legend = NA, 
    inherit.aes = TRUE) ...

... does not have the width parameter anymore.

teunbrand commented 3 months ago

this feature seems to solve the width issue but not the height one

Can I ask what height issue exactly?

more importantly, it does not work as I expect

Running that code with the CRAN version or even older versions such as 3.4.4 yields the same plot, so I'm reasonably sure that this PR didn't affect that outcome.

katossky commented 3 months ago

height issue : geom_col does work with numeric x and discrete y ; it should maybe accept height in that case ?

katossky commented 3 months ago

For the other issue, I did not say that you break anything ! Just that the result does not fit my expectations.

If :

ggplot2::diamonds |>
    dplyr::summarise(n=dplyr::n(), avgPrice=mean(price), .by=clarity) |>
    ggplot(aes(x=clarity, y=avgPrice)) +
    geom_col(fill=NA, color='black')

... produces :

image

... then I do not expect the columns to get crumble together when I add a width aesthetic.

teunbrand commented 3 months ago

Just that the result does not fit my expectations

The n you compute ranges in the thousands, whereas the discrete x-axis places only takes 8. The labels crammed in the middle makes sense to me as the scale has to accommodate the thousands.

geom_col does work with numeric x and discrete y ; it should maybe accept height in that case ?

It does, and there is no need for height as width already controls the size of horizontal bars in the same way it does for vertical bars.

library(ggplot2)
ggplot(mtcars, aes(mpg, rownames(mtcars))) +
  geom_col(width = 0.5)

Created on 2024-05-23 with reprex v2.1.0

katossky commented 3 months ago

So it was my understanding all along, sorry for that.

width indeed works with numeric x and discrete y (thank you) but I wrongly expected height there in alignment with geom_tile.

I was not expecting either width (a priori something quantitative) to share the units with x (a priori discrete) ! But I guess it may makes sense for consistency ?

ggplot2::diamonds |>
    dplyr::summarise(n=dplyr::n(), avgPrice=mean(price), .by=clarity) |>
    ggplot(aes(x=clarity, y=avgPrice, width = n / max(n))) +
    geom_col(fill=NA, color='black')

... indeed yields :

image

Last thing, in the preivous graph, I expected equal column spacing, irrespective of the width, as the only meaningful case I have in mind for variable-width column charts is this :

image

... but maybe I was wrong here again and that is not what you or anyone had in mind.

Thank you for your patience and clarification. Should I forward these comments somewhere where they are more useful ? Maybe am I not the only confused one ?