xKDR / Survey.jl

Analysis of complex surveys
https://xkdr.github.io/Survey.jl/
GNU General Public License v3.0

Weighted `quantile` implementation #130

Open smishr opened 1 year ago

smishr commented 1 year ago

Currently, Statistics has all the Hyndman & Fan rules, and StatsBase has a rudimentary rule for weighted quantiles (which I have been unable to classify). From what I could find, the formulae under the "Weighted" column on the right-hand side of the table are not implemented in Julia yet. They are straightforward to implement. Later, we could look at encapsulating parts of the code into StatsBase `quantile`.

[image: table of quantile rules with a "Weighted" column]

Except for simple random samples when qrule="hf7", our quantile function is incorrect. Weighting logic needs to be applied.

smishr commented 1 year ago

The qrule function in R that applies the given qrule, here "hf7" (the default):

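function (x, w, p) 
{
  # Drop observations with zero weight
  if (any(zero <- w == 0)) {
    w <- w[!zero]
    x <- x[!zero]
  }
  n <- length(x)
  # Sort the values and accumulate the weights in the same order
  ii <- order(x)
  x <- x[ii]
  cumw <- cumsum(w[ii])
  # Positions: cumulative weight before each value, scaled by cumw[n - 1]
  pk <- c(0, cumw[-n])/cumw[n - 1]
  # Linear interpolation at p; rule = 2 uses the end values outside the range
  approx(pk, x, p, method = "linear", rule = 2)$y
}
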
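
A rough Julia translation of the R function above might look like the sketch below. The name `weighted_quantile_hf7` is hypothetical (it is not an existing Survey.jl or StatsBase function), and the sketch assumes strictly positive weights after zero weights are dropped, a scalar `p` in [0, 1], and at least two remaining observations.

```julia
# Hypothetical helper: a line-by-line sketch of the R function above.
# Assumes positive weights (after dropping zeros) and a scalar p in [0, 1].
function weighted_quantile_hf7(x::AbstractVector{<:Real},
                               w::AbstractVector{<:Real},
                               p::Real)
    length(x) == length(w) || throw(ArgumentError("x and w must have the same length"))
    0 <= p <= 1 || throw(ArgumentError("p must lie in [0, 1]"))

    # Drop observations with zero weight
    keep = w .!= 0
    xs, ws = x[keep], w[keep]
    n = length(xs)
    n >= 2 || throw(ArgumentError("need at least two observations with nonzero weight"))

    # Sort the values and accumulate the weights in the same order
    ii = sortperm(xs)
    xs = float.(xs[ii])
    cumw = cumsum(ws[ii])

    # Same positions as `c(0, cumw[-n]) / cumw[n - 1]` in the R code
    pk = vcat(0.0, cumw[1:n-1]) ./ cumw[n-1]

    # Linear interpolation, using the end values outside the range
    # (the behaviour of `approx(..., rule = 2)`)
    p <= pk[1] && return xs[1]
    p >= pk[n] && return xs[n]
    k = searchsortedlast(pk, p)          # pk[k] <= p < pk[k + 1]
    t = (p - pk[k]) / (pk[k+1] - pk[k])
    return xs[k] + t * (xs[k+1] - xs[k])
end
```

With equal weights the positions reduce to `(k - 1) / (n - 1)`, so the result should agree with the default `Statistics.quantile`, which gives a quick sanity check for any real implementation.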
smishr commented 1 year ago

Related to #87

smishr commented 1 year ago

See the R survey implementation and the logic of all the qrules. It should be relatively easy to translate the logic into Julia.

smishr commented 1 year ago

See the `quantile` and `quantile!` functions at https://docs.julialang.org/en/v1/stdlib/Statistics/ for the unweighted quantile definitions, which are parameterised by `alpha` and `beta`. These can be converted into weighted versions by multiplying by the cdf (the cumulative weights) and the sum of the cdf.
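
For reference, the unweighted `alpha`/`beta` formula from the Statistics docs can be written out by hand as below. This is only a sketch: `hf_quantile` is a hypothetical name, and the body follows the documented formula (`j = floor(n*p + m)`, `m = alpha + p*(1 - alpha - beta)`) rather than any Survey.jl code. It makes visible where `n * p` enters, which is the part a weighted rule would need to change.

```julia
using Statistics

# Hypothetical helper: the unweighted Hyndman & Fan family (definitions 4-9),
# written from the formula in the Statistics.quantile documentation.
function hf_quantile(x::AbstractVector{<:Real}, p::Real; alpha::Real = 1.0, beta::Real = alpha)
    n = length(x)
    n >= 2 || throw(ArgumentError("need at least two observations"))
    0 <= p <= 1 || throw(ArgumentError("p must lie in [0, 1]"))
    v = sort(float.(x))
    m = alpha + p * (1 - alpha - beta)
    h = n * p + m                        # fractional position in the sorted sample
    j = clamp(floor(Int, h), 1, n - 1)   # lower neighbour
    γ = clamp(h - j, 0.0, 1.0)           # interpolation weight
    return (1 - γ) * v[j] + γ * v[j+1]
end

x = [3.0, 1.0, 4.0, 1.5, 9.0, 2.6]
q7_builtin = quantile(x, 0.3)                          # default: alpha = beta = 1 (definition 7)
q7_manual  = hf_quantile(x, 0.3)                       # same rule, written out
q6_builtin = quantile(x, 0.3; alpha = 0.0, beta = 0.0) # definition 6
q6_manual  = hf_quantile(x, 0.3; alpha = 0.0, beta = 0.0)
```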

smishr commented 1 year ago

@nadiaenh this is also a good issue to work on to build understanding and skills for the project. The qrule.R file in the R survey package is relatively easy to understand and implement in Julia.