divilian / specstar

Combines SPECscape and SPECnet into one project

Produce Gini plot with error bars #40

Closed divilian closed 5 years ago

divilian commented 5 years ago

When issue #39 is completed, have param_sweep.jl produce a plot of swept-variable vs. final Gini coefficient, with error bars. (See conversation on #37.)

venkatachalapathy commented 5 years ago

Page 225 of the DescTools manual describes a Gini calculation with confidence intervals:

https://cran.r-project.org/web/packages/DescTools/DescTools.pdf

venkatachalapathy commented 5 years ago

As discussed, we need confidence intervals for Gini. This functionality does not exist in ineq, the package currently used for Gini calculations, so we need to replace it with the DescTools Gini function. Both sim.jl and param_sweep.jl currently use ineq, not DescTools.

divilian commented 5 years ago

Is it true, then, statistically speaking, that in order to get valid confidence intervals for a Gini coefficient one needs to use Gini-specific techniques? (Silly me, I figured getting a confidence band for Gini would be much like getting a confidence band for any other quantity. But maybe it's the strange nature of how Ginis are distributed that makes non-Gini-specific techniques illegitimate?)

venkatachalapathy commented 5 years ago

You are right! Getting a confidence band for Gini is the same as getting a confidence band for any other quantity; the technique is just bootstrapping. Some of the statisticians I respect use the bootstrap as a general-purpose way to move away from Gaussianity assumptions about the error terms. The more I read about it, the more I agree with that point of view.

Most older statistics (pre-1980) freely assumed normally distributed random variables and worked only with those. When something such as a confidence band needed to be calculated, practitioners just made community-approved assumptions and moved on. Those normal-theory methods continue to be used even though other techniques that give systematically more accurate results exist.
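To make the bootstrapping idea concrete, here is a minimal sketch of a percentile bootstrap confidence interval for a Gini coefficient. It is written in plain Python rather than the project's Julia/R, and it is not the DescTools or Bootstrap.jl implementation: the function names, the sample data, and the choice of the simple percentile interval are all assumptions made for illustration.

```python
import random

def gini(values):
    """Gini coefficient of non-negative values (0 = perfect equality)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Sorted-rank formula: G = 2*sum(i*x_i)/(n*sum(x)) - (n+1)/n, with i = 1..n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

def bootstrap_ci(data, statistic, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap: returns (estimate, lower, upper)."""
    rng = random.Random(seed)
    n = len(data)
    # Resample the data with replacement and recompute the statistic each time.
    stats = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    alpha = (1.0 - level) / 2.0
    lower = stats[int(alpha * (n_boot - 1))]
    upper = stats[int((1.0 - alpha) * (n_boot - 1))]
    return statistic(data), lower, upper

# Hypothetical agent wealths at the end of one simulation run.
wealths = [1, 2, 2, 3, 5, 8, 13, 21, 34, 55]
estimate, lower, upper = bootstrap_ci(wealths, gini)
print(f"Gini = {estimate:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

Nothing in `bootstrap_ci` is specific to Gini: any function of the sample can be passed as `statistic`, which is the sense in which the bootstrap is general purpose.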

divilian commented 5 years ago

Okay, so back to your original point then: why do we need to use DescTools instead of ineq? Can't we just use ordinary bootstrapping to get the confidence intervals, a la a package like https://github.com/juliangehring/Bootstrap.jl ?

jeffg828 commented 5 years ago

I think the standard errors are different for Gini coefficients.

On Sat, Jun 15, 2019, 12:10 Stephen Davies wrote:

> Okay, so back to your original point then: why do we need to use DescTools instead of ineq? Can't we just use ordinary bootstrapping to get the confidence intervals, a la a package like https://github.com/juliangehring/Bootstrap.jl ?


venkatachalapathy commented 5 years ago

> Okay, so back to your original point then: why do we need to use DescTools instead of ineq? Can't we just use ordinary bootstrapping to get the confidence intervals, a la a package like https://github.com/juliangehring/Bootstrap.jl ?

If you look at p. 225 of the DescTools manual, its Gini function has a bootstrap option built in. Wouldn't that be easier?

divilian commented 5 years ago

Helpful note from @venkatachalapathy:

(See p. 225 of the DescTools manual and http://gadflyjl.org/v0.6/lib/geoms/geom_errorbar.html.) According to the manual, if a CI is requested as an option in the Gini function, the output is a 3-element array, [gini lwr.ci upr.ci]. We could then use Gadfly's Geom.errorbar to produce the necessary result.
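As a rough illustration of the data shaping this implies, the sketch below turns a list of [gini lwr.ci upr.ci] triples into the separate x, y, ymin and ymax columns that an error-bar geometry generally expects. It is plain Python, not the project's Julia code; the function name and every number in `rows` are made up for the example.

```python
# Each row is (swept_value, gini, lwr_ci, upr_ci). The numbers are invented
# purely to show the layout; they are not results from the simulation.
rows = [
    (0.1, 0.42, 0.38, 0.45),
    (0.2, 0.50, 0.44, 0.53),
    (0.3, 0.58, 0.55, 0.64),
]

def errorbar_columns(rows):
    """Split (x, gini, lower, upper) rows into columns for an error-bar plot."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    ymins = [r[2] for r in rows]
    ymaxs = [r[3] for r in rows]
    return xs, ys, ymins, ymaxs

xs, ys, ymins, ymaxs = errorbar_columns(rows)

# Bootstrap intervals are usually not symmetric around the estimate, so the
# bars are described by explicit lower/upper bounds rather than a single +/-.
for x, y, lo, hi in zip(xs, ys, ymins, ymaxs):
    print(f"x={x}: gini={y:.2f}, bar from {lo:.2f} to {hi:.2f}")
```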

divilian commented 5 years ago

Added this to single-sim plot in 3adba47. Still need to add it to param sweep plot, although it's not totally clear to me how to do the statistics correctly.

divilian commented 5 years ago

Fixed in dcf3f45. Used Julia Bootstrap package to compute CIs for Gini param sweep.
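For readers wondering what the statistics for a sweep might look like, here is one possible approach, sketched in plain Python. It is not necessarily what the commit above does: it assumes several replicate runs per swept value, treats each run's final Gini as one observation, and bootstraps the mean across runs. The function name and the data are hypothetical.

```python
import random
from statistics import mean

def bootstrap_mean_ci(samples, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap CI for the mean: returns (mean, lower, upper)."""
    rng = random.Random(seed)
    n = len(samples)
    # Resample the replicate runs with replacement and average each resample.
    means = sorted(mean(rng.choices(samples, k=n)) for _ in range(n_boot))
    alpha = (1.0 - level) / 2.0
    lower = means[int(alpha * (n_boot - 1))]
    upper = means[int((1.0 - alpha) * (n_boot - 1))]
    return mean(samples), lower, upper

# Hypothetical sweep results: swept value -> final Gini of each replicate run.
final_ginis = {
    0.1: [0.41, 0.44, 0.39, 0.43, 0.40],
    0.2: [0.47, 0.52, 0.49, 0.50, 0.46],
    0.3: [0.58, 0.55, 0.61, 0.57, 0.60],
}

sweep_summary = {}
for value, ginis in sorted(final_ginis.items()):
    sweep_summary[value] = bootstrap_mean_ci(ginis)
    center, lower, upper = sweep_summary[value]
    print(f"{value}: mean Gini {center:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```

Each entry of `sweep_summary` gives one point and one error bar for the swept-variable vs. final Gini plot. Note that this interval reflects run-to-run variation at each swept value, which is a different quantity from the within-run interval obtained by resampling agents.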