tidyverse / ggplot2

An implementation of the Grammar of Graphics in R
https://ggplot2.tidyverse.org

Documentation: `stat_bin(binwidth)` contains mistake #5960

Open teunbrand opened 1 week ago

teunbrand commented 1 week ago

The following is a piece of the documentation for the `stat_bin(binwidth)` argument:

> Can be specified as a numeric value or as a function that calculates width from unscaled x. Here, "unscaled x" refers to the original x values in the data, before application of any scale transformation.

A quick test reveals that a `binwidth` function receives the transformed values as input, not the "unscaled x" that the documentation describes. Note that `min(diamonds$price) == 326`, whereas the sampled inputs printed in the test are all close to 3, so evidently the values have already been log10-transformed.

library(ggplot2)

set.seed(42)
# Print a random sample of at most 6 of the values the function receives
show <- function(x) {
  print(x[sample(length(x), min(length(x), 6))])
}

ggplot(diamonds, aes(price)) +
  geom_histogram(binwidth = function(x) {show(x); 0.1}) +
  scale_x_log10()

#> [1] 2.928908 3.651084 3.243038 3.262214 2.786751 3.631038

Created on 2024-06-27 with reprex v2.1.0
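
As a rough sketch of a more direct check, the full input can be captured instead of sampled, and its range compared with the raw and log10-transformed prices. The helper `capture_width()` and the variable `seen` are hypothetical names used only for this illustration, and the sketch assumes that `ggplot_build()` triggers the stat computation.

```r
library(ggplot2)

# Hypothetical helper: stores whatever the binwidth function is given.
seen <- NULL
capture_width <- function(x) {
  seen <<- x  # record the input for later inspection
  0.1         # same fixed width as in the reprex above
}

p <- ggplot(diamonds, aes(price)) +
  geom_histogram(binwidth = capture_width) +
  scale_x_log10()

# Building the plot runs the stat, which calls capture_width().
invisible(ggplot_build(p))

range(seen)                   # what the binwidth function received
range(diamonds$price)         # raw data
range(log10(diamonds$price))  # data after the log10 transformation
```

If the function receives transformed values, the first range should agree with the third rather than the second.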

I think the confusion stems from #2828. I propose this alternative:

> Can be specified as a numeric value or as a function that takes x after scale transformation as input and returns a single numeric value.
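
To illustrate what the proposed wording implies in practice, here is a minimal sketch of a `binwidth` function based on the Freedman-Diaconis rule. The name `fd_width()` is hypothetical. Under the behaviour shown in the test above, its input is the transformed x, so the width it returns is expressed in log10 units rather than in price units.

```r
library(ggplot2)

# Hypothetical helper: Freedman-Diaconis rule, returning a single number.
# x is assumed to be the values after scale transformation.
fd_width <- function(x) 2 * IQR(x) / length(x)^(1 / 3)

p <- ggplot(diamonds, aes(price)) +
  geom_histogram(binwidth = fd_width) +
  scale_x_log10()
p

# The width that would result from log10-transformed prices
fd_width(log10(diamonds$price))
```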