tidyverse / ggplot2

An implementation of the Grammar of Graphics in R
https://ggplot2.tidyverse.org

Adhere more strictly to `bins` argument in `stat_bin()` #5891

Closed by teunbrand 1 month ago

teunbrand commented 1 month ago

This PR aims to fix #5882, fix #5036 and fix #5890.

Briefly, the default `boundary` was sometimes imputed in a way that produced a different number of bins than the `bins` argument requested. This PR fixes that by setting the default boundary half a bin width below the minimum of the data.

This value was chosen for utilitarian reasons rather than on good theoretical grounds. It only affects the branch of the binning code that handles the case where neither `binwidth` nor `breaks` is specified. As #5890 concerned adjacent code, I decided to fix that here as well.
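To illustrate the idea, here is a minimal base-R sketch. It is not the code in this PR: `sketch_breaks()` is a hypothetical helper, and the width formula `range / (bins - 1)` and the handling of `bins = 1` are assumptions made for the example. It only shows why a boundary half a bin width below the minimum yields exactly `bins` bins that cover the data.

```r
# Hypothetical helper, not ggplot2's internal implementation.
# Computes break positions for a requested number of bins, placing the
# default boundary half a bin width below the minimum of the data.
sketch_breaks <- function(x, bins) {
  rng <- range(x, na.rm = TRUE, finite = TRUE)
  if (bins == 1) {
    # Assumption: a single bin simply spans the whole data range
    return(rng)
  }
  # Assumption: the width is chosen so the first and last bins are
  # centred on the minimum and maximum of the data
  width <- diff(rng) / (bins - 1)
  boundary <- rng[1] - width / 2
  # bins + 1 equally spaced breaks delimit exactly `bins` bins
  boundary + width * seq(0, bins)
}

x <- c(1, 2, 4, 7, 9)
breaks <- sketch_breaks(x, bins = 3)
breaks
#> [1] -1  3  7 11

# Number of bins, and the count of observations falling in each
n_bins <- length(breaks) - 1
counts <- table(cut(x, breaks, include.lowest = TRUE))
n_bins
#> [1] 3
```

Because the first break sits half a width below the minimum and the last one half a width above the maximum, every observation falls inside one of the requested bins.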

The relevant reprexes:

devtools::load_all("~/packages/ggplot2")
#> ℹ Loading ggplot2

# From #5882 (had 2 bins)
p <- ggplot(palmerpenguins::penguins, aes(x = body_mass_g)) +
  geom_histogram(bins = 3)
nrow(layer_data(p))
#> Warning: Removed 2 rows containing non-finite outside the scale range
#> (`stat_bin()`).
#> [1] 3

# From #5036 (had 10 bins)
p <- ggplot(ggplot2movies::movies, aes(rating)) + 
  geom_histogram(bins = 10)
nrow(layer_data(p))
#> [1] 10

# From #5036 (had 9 bins)
p <- ggplot(ggplot2movies::movies, aes(rating)) + 
  geom_histogram(bins = 10, boundary = 0)
nrow(layer_data(p))
#> [1] 10

# From #5890
ggplot(mpg, aes(displ)) +
  geom_histogram(bins = 1, center = 0)

Created on 2024-05-13 with reprex v2.1.0