arviz-devs / arviz

Exploratory analysis of Bayesian models with Python
https://python.arviz.org
Apache License 2.0

Upgrade Forest and Ridge plots documentation #2153

Open asael697 opened 1 year ago

asael697 commented 1 year ago

Possible solution:

Generate forest and ridge plots to compare distributions from a model or list of models. 

The `forestplot` option generates credible intervals, where the central points are the estimated posterior means,
the thick lines are the central quartiles, and the thin lines represent the $100\times$(`hdi_prob`)% HDIs. 
The `ridgeplot` option generates KDE posterior densities truncated at 0.94 probability.

Additionally, the function displays effective sample sizes (ess) and Rhats to visualize the convergence diagnostic.

Possible examples:

Convergence diagnostic forest plot

.. plot::
   :context: close-figs

   >>> axes = az.plot_forest(non_centered_data,
   >>>                       combined=False,
   >>>                       ess=True,
   >>>                       figsize=(9, 7))
   >>> axes[0].set_title('Estimated theta for 8 schools model')

Convergence diagnostic Ridge plot

.. plot::
   :context: close-figs

   >>> axes = az.plot_forest(non_centered_data,
   >>>                       kind='ridgeplot',
   >>>                       combined=True,
   >>>                       ess=True,
   >>>                       r_hat=True,
   >>>                       figsize=(9, 7))
   >>> axes[0].set_title('Estimated theta for 8 schools model')

OriolAbril commented 1 year ago

I would make some changes to the description, but keep the general idea and improvements

Generate forest or ridge plots to compare distributions from a model or list of models. 

The `kind="forestplot"` option generates credible intervals, where the central points are the estimated posterior means,
the thick lines are the central quartiles, and the thin lines represent the $100\times$(`hdi_prob`)% highest density intervals. 

The `kind="ridgeplot"` option generates density plots (kernel density estimate or histograms) in the same graph.
Ridge plots can be configured to have different overlap, truncation bounds and quantile markers.

Additionally, the function can display effective sample sizes (ess) and Rhats to visualize convergence diagnostics alongside the distributions.

Ridge plots may or may not be truncated, depending on `ridgeplot_truncate`, and they can also be histograms instead of KDEs. I have also expanded the acronyms, but I am not sure whether to use the expanded versions or the abbr role.
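
For reference, the ridge plot options mentioned above can be sketched as follows. This is only an illustration, assuming the `non_centered_eight` example dataset bundled with ArviZ and the matplotlib backend; the specific values chosen for overlap and quantiles are arbitrary.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless

import arviz as az

# Assumption: the bundled eight schools example data stands in for a real model.
non_centered_data = az.load_arviz_data("non_centered_eight")

axes = az.plot_forest(
    non_centered_data,
    kind="ridgeplot",
    var_names=["theta"],
    combined=True,
    ridgeplot_truncate=False,               # draw the full density, not only the HDI region
    ridgeplot_overlap=3,                    # let neighbouring densities overlap more
    ridgeplot_quantiles=[0.25, 0.5, 0.75],  # mark the quartiles on each density
    figsize=(9, 7),
)
```

With `ridgeplot_truncate=True` (the default), each density would instead be cut at the bounds of the `hdi_prob` interval.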

Extra notes: I would also add arviz.summary to the see also section and update the examples. The title doesn't match the plot in many cases, and I think it is better to have no title than an incorrect one. I would also simplify the first example to show the default behaviour and focus on arguments that are unique to plot_forest. rope works the same in plot_posterior, so it can be moved to https://python.arviz.org/en/stable/user_guide/plots_arguments_guide.html with a link added in its description. The two additions look good, but they might be a bit too similar, as there already are examples of how to use kind. Maybe we could have a single example using both ess and rhat (with either interval or ridge).
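
A single combined example along those lines might look like the sketch below. It is not the final docstring example, just one possible shape for it, assuming the bundled `non_centered_eight` dataset and the default `kind="forestplot"`.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless

import arviz as az

# Assumption: the bundled eight schools example data stands in for a real model.
non_centered_data = az.load_arviz_data("non_centered_eight")

# One call showing both diagnostics next to the intervals.
axes = az.plot_forest(
    non_centered_data,
    var_names=["theta"],
    combined=True,
    ess=True,     # adds a panel with effective sample sizes
    r_hat=True,   # adds a panel with R-hat values
    figsize=(9, 7),
)
```

The returned array holds one axes object for the intervals plus one for each requested diagnostic, so no title is set here to avoid mislabelling any of the panels.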

asael697 commented 1 year ago

I agree with your comments and updates. For the acronyms, I would prefer the abbr role rather than the expanded versions. It seems more organic.

asael697 commented 1 year ago

The first parts of this issue are solved in PR #2208 and PR #2210. After these two are merged, we will need to add the example and resolve the acronyms.

tomicapretto commented 1 year ago

#2208 and #2210 are already merged. Should this be closed?

OriolAbril commented 1 year ago

The description (highlighted part) still needs some love, the rest has already been fixed I think.