TuringLang / ParetoSmooth.jl

An implementation of PSIS algorithms in Julia.
http://turinglang.org/ParetoSmooth.jl/
MIT License
19 stars 12 forks source link

`loo_compare` reorders passed array of loo_cv results #57

Closed jonasmac16 closed 2 years ago

jonasmac16 commented 2 years ago

Hi and thank you so much for this package. It allows me to do model comparison directly in Julia rather than calling Stan's loo via RCall.

However, I noticed to that if pass an Array of psis_loo results to loo_compare. The loo_compare functions rearranges the Array. So if you were to repeat the loo_compare again the result will be different as the Array has been reordered according to the model ranking.

Is this intended?

MWE:

julia> a = rand(20, 200, 4);
julia> b = rand(20, 200, 4) .+ 4.0;
julia> c = rand(20, 200, 4);

julia> d=[ParetoSmooth.psis_for j in [a,b,c]]

3-element Vector{ParetoSmooth.PsisLoo{Float64, Array{Float64, 3}, Vector{Float64}}}:
 Results of PSIS-LOO-CV with 800 Monte Carlo samples and 20 data points. Total Monte Carlo SE of 0.046.
┌───────────┬───────┬──────────┬───────┬─────────┐
│           │ total │ se_total │  mean │ se_mean │
├───────────┼───────┼──────────┼───────┼─────────┤
│   cv_elpd │  9.11 │     0.03 │  0.46 │    0.00 │
│ naive_lpd │ 10.77 │     0.03 │  0.54 │    0.00 │
│     p_eff │  1.65 │     0.01 │  0.08 │    0.00 │
└───────────┴───────┴──────────┴───────┴─────────┘

 Results of PSIS-LOO-CV with 800 Monte Carlo samples and 20 data points. Total Monte Carlo SE of 0.045.
┌───────────┬───────┬──────────┬───────┬─────────┐
│           │ total │ se_total │  mean │ se_mean │
├───────────┼───────┼──────────┼───────┼─────────┤
│   cv_elpd │ 89.18 │     0.05 │  4.46 │    0.00 │
│ naive_lpd │ 90.81 │     0.05 │  4.54 │    0.00 │
│     p_eff │  1.64 │     0.01 │  0.08 │    0.00 │
└───────────┴───────┴──────────┴───────┴─────────┘

 Results of PSIS-LOO-CV with 800 Monte Carlo samples and 20 data points. Total Monte Carlo SE of 0.046.
┌───────────┬───────┬──────────┬───────┬─────────┐
│           │ total │ se_total │  mean │ se_mean │
├───────────┼───────┼──────────┼───────┼─────────┤
│   cv_elpd │  9.14 │     0.04 │  0.46 │    0.00 │
│ naive_lpd │ 10.82 │     0.04 │  0.54 │    0.00 │
│     p_eff │  1.68 │     0.01 │  0.08 │    0.00 │
└───────────┴───────┴──────────┴───────┴─────────┘

julia> e = ParetoSmooth.loo_compare(d)
┌─────────┬─────────┬────────┬────────┐
│         │ cv_elpd │ cv_avg │ weight │
├─────────┼─────────┼────────┼────────┤
│ model_2 │    0.00 │   0.00 │   1.00 │
│ model_3 │  -80.04 │  -4.00 │   0.00 │
│ model_1 │  -80.06 │  -4.00 │   0.00 │
└─────────┴─────────┴────────┴────────┘

julia> f = ParetoSmooth.loo_compare(d)
┌─────────┬─────────┬────────┬────────┐
│         │ cv_elpd │ cv_avg │ weight │
├─────────┼─────────┼────────┼────────┤
│ model_1 │    0.00 │   0.00 │   1.00 │
│ model_2 │  -80.04 │  -4.00 │   0.00 │
│ model_3 │  -80.06 │  -4.00 │   0.00 │
└─────────┴─────────┴────────┴────────┘
ParadaCarleton commented 2 years ago

Hi and thank you so much for this package. It allows me to do model comparison directly in Julia rather than calling Stan's loo via RCall.

You're welcome! Always happy to hear that people find my work useful. :)

However, I noticed to that if pass an Array of psis_loo results to loo_compare. The loo_compare functions rearranges the Array. So if you were to repeat the loo_compare again the result will be different as the Array has been reordered according to the model ranking. Is this intended?

Nope! Thanks a bunch for reporting the bug! I've made a PR that should fix this.