AndriSignorell / DescTools

Tools for Descriptive Statistics and Exploratory Data Analysis
http://andrisignorell.github.io/DescTools/

Quantile's weights: Confusion #123

Open patrick-weiss opened 11 months ago

patrick-weiss commented 11 months ago

Working with the Quantile() function resulted in some confusion (which might be based on a conceptual misunderstanding): if the weights sum to 1, the function (type = 7, i.e., the default) returns the maximum no matter which probs are given. Looking at lines 85:86 of the function, n <- sum(weights) and ord <- 1 + (n - 1) * probs, this seems intended: with n = 1, ord equals 1 for every value of probs.
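To make the arithmetic concrete, here is a minimal sketch that evaluates only the two quoted lines. It is not the full Quantile() source; the weight vectors and the variable names (w_norm, w_big, and so on) are made up for illustration.

```r
# Sketch of the two quoted lines only, NOT the full Quantile() implementation.
probs <- seq(0, 1, length.out = 6)

# Hypothetical weights that sum to exactly 1
w_norm <- c(0.125, 0.125, 0.25, 0.5)
n_norm <- sum(w_norm)
ord_norm <- 1 + (n_norm - 1) * probs
print(ord_norm)  # (n - 1) is 0, so every entry is 1 regardless of probs

# The same weights scaled up by a constant factor
w_big <- w_norm * 8
n_big <- sum(w_big)
ord_big <- 1 + (n_big - 1) * probs
print(ord_big)   # now varies with probs, from 1 up to n_big
```

With normalised weights, probs drops out of ord entirely, which would be consistent with the function returning the same value for every requested quantile.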

The confusion might be related to the definition of weights (I assumed that weights sum to one by default). However, either a warning/error or a change in the documentation would be appreciated, given that the function effectively turns into max() in this case. Alternatively, multiplying the weights by some sufficiently large number returns the expected result. This multiplication could be forced whenever sum(weights) == 1.
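A rough sketch of what such a guard could look like is given below. The helper check_weights() and its scale and tol arguments are hypothetical and not part of DescTools; the factor 1e6 is simply the one used in the demonstration. It compares with all.equal() rather than == 1, because normalised floating-point weights may not sum to exactly 1.

```r
# Hypothetical helper, NOT part of DescTools: warn about and rescale
# weights that sum to (approximately) 1.
check_weights <- function(weights, scale = 1e6, tol = 1e-8) {
  stopifnot(is.numeric(weights), all(weights >= 0))
  # all.equal() compares with a tolerance; `sum(weights) == 1` can be
  # FALSE for normalised weights because of floating-point rounding.
  if (isTRUE(all.equal(sum(weights), 1, tolerance = tol))) {
    warning("weights sum to 1; rescaling by a factor of ", scale)
    return(weights * scale)
  }
  weights
}

# Usage: normalised weights are rescaled, others pass through unchanged
w <- c(0.1, 0.2, 0.3, 0.4)
w_rescaled <- suppressWarnings(check_weights(w))
w_unchanged <- check_weights(c(1, 2, 3))
```

Whether rescaling, a warning, or only a documentation note is the right remedy depends on how weights are meant to be interpreted, which is the open question here.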

Caveat: This might just be a misunderstanding on my side. If so, I would be grateful for some clarification. Thanks for your efforts.

Demonstration: R 4.3.0; DescTools 0.99.49; Windows 11

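library(DescTools)
set.seed(2023)

# Values 1..100 with random integer weights between 1 and 10.
# Named `dat` rather than `sample` to avoid masking base::sample().
dat <- data.frame(x = 1:100,
                  weights = sample(1:10,
                                   size = 100,
                                   replace = TRUE))
# Normalise the weights so that they sum to 1
dat$weights <- dat$weights / sum(dat$weights)

# Weights summing to 1: every quantile comes back as the maximum
Quantile(dat$x,
         dat$weights,
         probs = seq(0, 1, length.out = 6))

# Same weights multiplied by 10^6: the quantiles look as expected
Quantile(dat$x,
         dat$weights * 10^6,
         probs = seq(0, 1, length.out = 6))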