xKDR / Survey.jl

Analysis of complex surveys
https://xkdr.github.io/Survey.jl/
GNU General Public License v3.0

Abstract types for summary statistics function for code reuse #275

Open smishr opened 1 year ago

smishr commented 1 year ago

Right now we have four summary statistics functions (mean, total, quantile, ratio).

Adding a feature such as confidence intervals (#274, #184) to the summary statistics currently requires writing a separate method for each of the four functions.

If they shared a common abstract type, less code would need to be written.

E.g., could confint(x, design, ::SummaryStat) and confint(Vector{x}, design, ::SummaryStat) work for all four functions?

@ayushpatnaikgit @codetalker7

asinghvi17 commented 1 year ago

Just FYI you can also dispatch on functions, so if you don't mind repeating the mean calculation, you could simply dispatch confint as confint(x, design, ::typeof(mean)) etc.
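
To illustrate dispatching on functions, here is a minimal sketch on plain vectors rather than survey designs. The names stderr_sketch and confint_sketch are hypothetical, and the simple-random-sample standard errors and z = 1.96 are assumptions for illustration, not what Survey.jl computes:

```julia
using Statistics

# Hypothetical helper: the standard-error formula is chosen by dispatching
# on the estimator function itself (assumes an i.i.d. sample, no design weights).
stderr_sketch(x::AbstractVector, ::typeof(mean)) = std(x) / sqrt(length(x))
stderr_sketch(x::AbstractVector, ::typeof(sum)) = std(x) * sqrt(length(x))

# One confint-style function then works for every estimator that has a
# stderr_sketch method. It calls f once to get the point estimate.
function confint_sketch(x::AbstractVector, f::Function; z = 1.96)
    est = f(x)
    se = stderr_sketch(x, f)
    return (est - z * se, est + z * se)
end

x = [1.0, 2.0, 3.0, 4.0]
ci_mean = confint_sketch(x, mean)
ci_total = confint_sketch(x, sum)
```

The trade-off is that the estimator is re-evaluated inside confint_sketch, which is the repeated calculation mentioned above.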

ayushpatnaikgit commented 1 year ago

Based on #277

We need types for estimates.

Currently, mean(x::Symbol, design::ReplicateDesign) returns a DataFrame. Similarly, total(x::Symbol, design::ReplicateDesign) also returns a DataFrame. If we want a function, such as CI, that returns the confidence interval, our framework forces us to define CI separately for each function.

If we have

abstract type AbstractEstimate end

# The type parameter records which statistic the estimate belongs to.
struct Estimate{statistic} <: AbstractEstimate
    statistic_type::statistic
    value::Number
    SE::Number
end

Base.@kwdef struct Mean
    name = "mean"
end

Base.@kwdef struct Total
    name = "total"
end

Base.@kwdef struct Quantile
    name = "Quantile"
    p = 0.5
end

This allows us to define

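function CI(x::Estimate)
    ...
end

using DataFrames

function Base.show(io::IO, x::Estimate)
    # Use the statistic's name as the column header for the point estimate.
    df = DataFrame(Symbol(x.statistic_type.name) => [x.value], :SE => [x.SE])
    print(io, df)
end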

And there can be functions specific to the estimators, like

function some_function(x::Estimate{Quantile})
    return x.statistic_type.p
end
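
Put together, the idea above can be sketched as a self-contained example. The normal-approximation interval with z = 1.96, the Float64 field types, and the helper name probability are assumptions for illustration, not part of the proposal:

```julia
abstract type AbstractEstimate end

Base.@kwdef struct Mean
    name::String = "mean"
end

Base.@kwdef struct Quantile
    name::String = "quantile"
    p::Float64 = 0.5
end

# S records which statistic produced the estimate.
struct Estimate{S} <: AbstractEstimate
    statistic_type::S
    value::Float64
    SE::Float64
end

# A single CI method covers every statistic
# (normal approximation; z = 1.96 is an assumption).
CI(x::Estimate; z = 1.96) = (x.value - z * x.SE, x.value + z * x.SE)

# A method that exists only for quantile estimates (hypothetical name).
probability(x::Estimate{Quantile}) = x.statistic_type.p

m = Estimate(Mean(), 10.0, 0.5)
q = Estimate(Quantile(p = 0.25), 7.0, 1.0)

ci_m = CI(m)
ci_q = CI(q)
p_q = probability(q)
```

Calling probability(m) would raise a MethodError, since no method is defined for Estimate{Mean}.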

ayushpatnaikgit commented 1 year ago

I will implement this, and later we can decide on something better.

ayushpatnaikgit commented 1 year ago

Implementing the following:

# AbstractStatistic must be defined before the estimate types that refer to it.
abstract type AbstractStatistic end

# Base.@kwdef is needed for field defaults; a plain struct does not allow them.
Base.@kwdef struct Mean <: AbstractStatistic
    name = "Mean"
end

Base.@kwdef struct Total <: AbstractStatistic
    name = "Total"
end

Base.@kwdef struct Quantile <: AbstractStatistic
    name = "Quantile"
    p = 0.5
end

Base.@kwdef struct Coefficient <: AbstractStatistic
    name = "Coefficient"
end

abstract type AbstractEstimate end

struct Estimate{statistic} <: AbstractEstimate
    statistic_type::AbstractStatistic
    estimate::Number
end

struct EstimateStdErr{statistic} <: AbstractEstimate
    statistic_type::AbstractStatistic
    estimate::Estimate
    stderr::Number
end

struct EstimateStdErrCI{statistic} <: AbstractEstimate
    statistic_type::AbstractStatistic
    estimate_stderr::EstimateStdErr
    CI::Tuple
end

struct Estimates{statistic} <: AbstractEstimate
    statistic_type::AbstractStatistic
    estimates::Vector{Estimate}
end

struct EstimatesStdErrs{statistic} <: AbstractEstimate
    statistic_type::AbstractStatistic
    estimates_stderrs::Vector{EstimateStdErr}
end

struct EstimatesStdErrsCIs{statistic} <: AbstractEstimate
    statistic_type::AbstractStatistic
    estimates_stderrs_cis::Vector{EstimateStdErrCI}
end
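
One possible variation, sketched here for discussion, ties the type parameter to the statistic_type field so that the default constructors can infer it. The helpers with_stderr and with_ci, the Float64 field types, and z = 1.96 are assumptions, not part of the proposal above:

```julia
abstract type AbstractStatistic end
abstract type AbstractEstimate end

Base.@kwdef struct Mean <: AbstractStatistic
    name::String = "Mean"
end

# The parameter S is now used by a field, so Estimate(Mean(), 4.2) infers it.
struct Estimate{S<:AbstractStatistic} <: AbstractEstimate
    statistic_type::S
    estimate::Float64
end

struct EstimateStdErr{S<:AbstractStatistic} <: AbstractEstimate
    statistic_type::S
    estimate::Estimate{S}
    stderr::Float64
end

struct EstimateStdErrCI{S<:AbstractStatistic} <: AbstractEstimate
    statistic_type::S
    estimate_stderr::EstimateStdErr{S}
    CI::Tuple{Float64,Float64}
end

# Hypothetical helpers that build each layer from the previous one.
with_stderr(e::Estimate, se::Real) = EstimateStdErr(e.statistic_type, e, Float64(se))

function with_ci(e::EstimateStdErr; z = 1.96)
    point = e.estimate.estimate
    half = z * e.stderr  # normal approximation, an assumption
    return EstimateStdErrCI(e.statistic_type, e, (point - half, point + half))
end

est = Estimate(Mean(), 4.2)
full = with_ci(with_stderr(est, 0.1))
```

Because each layer wraps the previous one, a method written for AbstractEstimate can still reach the point estimate through the nested fields.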

@smishr @nadiaenh please give suggestions.