wilkelab / ggridges

Ridgeline plots in ggplot2
https://wilkelab.org/ggridges
GNU General Public License v2.0

weights and range calculation #59

Closed mkoohafkan closed 7 months ago

mkoohafkan commented 4 years ago

This gets the job done, but I wonder if it would be more effective to turn geom_density_ridges into a geom_ridgeline() + ggplot2::stat_density() combo.

clauswilke commented 4 years ago

A couple of comments:

  1. Please don't change the bandwidth calculation when weight = NULL, and please don't call stats::density() just to calculate the bandwidth. In fact, I think stats::density() doesn't consider weights when calculating the bandwidth (see below), so you can leave that part as is.

  2. Please use tidyverse styling and indenting.

  3. I don't see the need to create a data$weight column if the weight aesthetic isn't set. You can just write something like the following before the stats::density() call:

    if (is.null(data$weight)) {
      weights <- NULL
    } else {
      weights <- data$weight / sum(data$weight)
    }

  4. All new features will have to be documented. Also consider adding a unit test for unweighted and weighted density calculations (neither currently exists).

  5. Currently a few unit tests are failing. I believe this is unrelated to the current PR and will have to be addressed separately.
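
As a rough sketch of how items 3 and 4 could fit together, the snippet below wraps the suggested normalization in a small helper and runs two sanity checks of the kind a unit test might make. The helper name `weighted_density()` and its arguments are hypothetical and are not part of ggridges. Passing the bandwidth as a number computed from the unweighted data is also an assumption made here for illustration.

```r
# Hypothetical helper for illustration only; not the ggridges implementation.
# The bandwidth comes from the unweighted data, and weights are normalized
# only when they are supplied.
weighted_density <- function(x, weight = NULL, bw = stats::bw.nrd0(x), ...) {
  if (is.null(weight)) {
    weights <- NULL
  } else {
    weights <- weight / sum(weight)
  }
  stats::density(x, bw = bw, weights = weights, ...)
}

set.seed(42)
x <- rnorm(100)

unweighted <- weighted_density(x)
constant <- weighted_density(x, weight = rep(3, length(x)))
skewed <- weighted_density(x, weight = seq_along(x))

# Constant weights normalize to 1/n, so the estimate matches the unweighted one.
stopifnot(isTRUE(all.equal(unweighted$y, constant$y)))

# Non-constant weights change the estimated density values.
stopifnot(!isTRUE(all.equal(unweighted$y, skewed$y)))
```

In a package test suite, the two `stopifnot()` calls would presumably become expectations for the unweighted and weighted cases.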

Demonstration that density() doesn't use the weights for bandwidth calculations. This can be verified by looking at the source code.

    x <- rnorm(100)
    weights <- 1:100

    density(x)$bw
    #> [1] 0.3789782
    density(x, weights = weights / sum(weights))$bw
    #> [1] 0.3789782

Created on 2020-06-21 by the reprex package (v0.3.0)
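
A seeded variant of the same check, offered as a sketch rather than as part of the original reprex: the default bandwidth rule in `density()` is `"nrd0"`, which corresponds to `stats::bw.nrd0(x)`, a function that takes only the data. The variable names and the seed below are arbitrary choices. `suppressWarnings()` is used on the assumption that some R versions warn when a bandwidth is selected automatically while weights are supplied.

```r
set.seed(123)
x <- rnorm(100)
w <- seq_along(x) / sum(seq_along(x))  # normalized, non-constant weights

# Bandwidth reported by density() without and with weights.
bw_default <- density(x)$bw
bw_weighted <- suppressWarnings(density(x, weights = w)$bw)

# The default rule ("nrd0") applied directly; it has no weights argument.
bw_rule <- bw.nrd0(x)

# All three agree, so the weights play no role in the bandwidth.
stopifnot(identical(bw_default, bw_rule))
stopifnot(identical(bw_weighted, bw_rule))
```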

clauswilke commented 7 months ago

This is now resolved; see #90.