TuringLang / docs

Documentation and tutorials for the Turing language
https://turinglang.org/docs/tutorials/docs-00-getting-started/
MIT License
225 stars 97 forks

Better describe how to get data out of `chain` elements in tutorials #483

Open Doublonmousse opened 4 months ago

Doublonmousse commented 4 months ago

The tutorial at https://turinglang.org/v0.30/tutorials/10-bayesian-differential-equations/ gives very little detail on how to obtain the values shown in the summary statistics, so a reader can end up stuck with a pretty-printed table and no obvious way to access the numbers in it:


Summary Statistics
  parameters      mean       std      mcse   ess_bulk   ess_tail      rhat   e ⋯
      Symbol   Float64   Float64   Float64    Float64    Float64   Float64     ⋯

           σ    0.8440    0.0640    0.0086    51.8543    50.1666    1.0926     ⋯
           α    1.5362    0.1946    0.0292    44.8038   115.2958    1.3519     ⋯
           β    1.0710    0.1563    0.0229    46.9437   158.4708    1.3206     ⋯
           γ    3.0231    0.2973    0.0438    48.2945   132.1955    1.3178     ⋯
           δ    0.9911    0.2736    0.0420    45.0081   125.5487    1.3476     ⋯
                                                                1 column omitted

Quantiles
  parameters      2.5%     25.0%     50.0%     75.0%     97.5%
      Symbol   Float64   Float64   Float64   Float64   Float64

           σ    0.7221    0.8021    0.8416    0.8840    0.9851
           α    1.2433    1.3775    1.5346    1.6588    1.9596
           β    0.8278    0.9431    1.0582    1.1714    1.4067
           γ    2.4578    2.8127    3.0105    3.2702    3.5290
           δ    0.5137    0.7841    0.9607    1.2019    1.5055

I think the tutorial should add, after this line of code,

# Sample 3 independent chains.
chain2 = sample(model2, NUTS(0.45), MCMCSerial(), 5000, 3; progress=false)

that one can obtain these tables by calling

describe(chain2)

and/or

summarystats(chain2)

that you can compute the mean of a parameter shown in the table with

mean(chain2[:α])

and that you can run

rhat(chain2)

to obtain the rhat metric.
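
To make the suggestion concrete, here is a minimal sketch of pulling numbers out of a chain rather than just printing it. It assumes only MCMCChains and Statistics are loaded, and it uses a hypothetical chain of random draws (`chn`) as a stand-in for `chain2`, so the values themselves are meaningless. Reading columns through the `nt` field of the summary object is an assumption about how the table is stored and may differ between MCMCChains versions.

```julia
using MCMCChains
using Statistics

# Hypothetical stand-in for `chain2`: 500 iterations x 2 parameters x 3 chains
# of random draws. In the tutorial the chain comes from `sample(...)` instead.
chn = Chains(randn(500, 2, 3), [:α, :β])

# Raw draws for one parameter as a plain matrix (iterations x chains).
α_draws = Array(chn[:α])

# Posterior mean and a 95% interval, pooling the draws of all chains.
α_mean = mean(α_draws)
α_interval = quantile(vec(α_draws), [0.025, 0.975])

# The summary table as an object instead of a printout.
stats = summarystats(chn)

# Look up the row for α, then read individual columns of the table.
# Assumption: the columns are stored as a NamedTuple in the `nt` field.
row = findfirst(==(:α), stats.nt.parameters)
α_mean_from_table = stats.nt.mean[row]
α_rhat_from_table = stats.nt.rhat[row]
```

With the tutorial's chain, replacing `chn` by `chain2` should give the σ, α, β, γ and δ entries of the tables above.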

This also points to a larger issue: the page https://turinglang.org/MCMCChains.jl/stable/diagnostics/#MCMCDiagnosticTools.rhat-Tuple{Chains} is not searchable from https://turinglang.org, so a user typing rhat into the search bar finds no information whatsoever.

The same is true for describe and summarystats, which are documented on another page of the MCMCChains documentation (https://turinglang.org/MCMCChains.jl/stable/stats/#StatsBase.summarystats) but are not searchable from the turinglang.org website.

Although the MCMCChains documentation is part of the website (go to Library API > Diagnostics > MCMCChains), it is not included in search results from turinglang.org, so one would be forgiven for thinking that these functions are undocumented.

So ideally what I would propose is