arviz-devs / arviz

Exploratory analysis of Bayesian models with Python
https://python.arviz.org
Apache License 2.0

Add property to remember the list of free RVs #1748

Open michaelosthege opened 3 years ago

michaelosthege commented 3 years ago

Tell us about it

By passing var_names = [rv.name for rv in pmodel.free_RVs] I can limit plot_trace, for example, to the free RVs and exclude deterministic variables, but this requires the model object. After loading an InferenceData from a file, this information is no longer available.

Thoughts on implementation

An attribute such as idata.posterior.free_RVs could be attached and used by some ArviZ plots as the default var_names.
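To make the proposal concrete, here is a minimal sketch of how a plotting function might pick its default var_names from such an attribute. The helper name resolve_var_names and the "free_RVs" key are hypothetical, and a plain dict stands in for the dataset attributes; this is not existing ArviZ API.

```python
def resolve_var_names(attrs, available, var_names=None):
    """Choose which variables to plot (hypothetical helper, not ArviZ API).

    attrs      -- dict-like attributes of the posterior group
    available  -- names of all variables present in the posterior
    var_names  -- explicit user selection, which always takes precedence
    """
    if var_names is not None:
        return list(var_names)
    stored = attrs.get("free_RVs")  # assumed attribute name
    if stored is None:
        # No free-RV information was saved: fall back to every variable.
        return list(available)
    # Keep only names that actually exist in the posterior.
    return [name for name in stored if name in available]


# Stand-ins for idata.posterior.attrs and the posterior variable names.
posterior_attrs = {"free_RVs": ["mu", "sigma"]}
posterior_vars = ["mu", "sigma", "mu_plus_sigma"]

print(resolve_var_names(posterior_attrs, posterior_vars))
```

An explicit var_names argument still overrides the stored list, so existing calls would behave as before.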

OriolAbril commented 3 years ago

xarray datasets allow storing any attributes in the dataset.attrs dict, where we store the versions, time... I think ideally this would go there.

There is one caveat, however: lists may not be allowed as attrs by netCDF, so we should probably extend to_netcdf/from_netcdf to optionally convert lists and dicts to strings and parse them back, using JSON for example. I think that would also be useful for some custom Stan attributes, such as the init point or mass matrix, which IIRC are currently stored as string attrs.
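A rough sketch of that conversion, using only the standard library. The function names encode_attrs and decode_attrs and the "json:" prefix are assumptions made for illustration, not part of ArviZ or xarray; a real implementation would hook into to_netcdf/from_netcdf.

```python
import json

# Hypothetical marker so that decoding only touches values we encoded ourselves.
_JSON_PREFIX = "json:"


def encode_attrs(attrs):
    """Return a copy of attrs with lists, tuples and dicts stored as JSON strings."""
    encoded = {}
    for key, value in attrs.items():
        if isinstance(value, (list, tuple, dict)):
            encoded[key] = _JSON_PREFIX + json.dumps(value)
        else:
            encoded[key] = value
    return encoded


def decode_attrs(attrs):
    """Inverse of encode_attrs: parse prefixed strings back into Python objects."""
    decoded = {}
    for key, value in attrs.items():
        if isinstance(value, str) and value.startswith(_JSON_PREFIX):
            decoded[key] = json.loads(value[len(_JSON_PREFIX):])
        else:
            decoded[key] = value
    return decoded


attrs = {"free_RVs": ["mu", "sigma"], "inference_library": "pymc3"}
stored = encode_attrs(attrs)      # what would be written to the file
restored = decode_attrs(stored)   # what would be read back
print(restored == attrs)
```

Note that JSON has no tuple type, so a tuple would come back as a list after the round trip.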

michaelosthege commented 3 years ago

Oh, should I have opened the issue in PyMC3, now that the converter lives there? But the JSON idea also sounds like something that could generalize: just giving the converters a way to add custom fields.

ahartikainen commented 3 years ago

What is a free_RV? Could we use similar logic for sparse data structures on the Stan side (e.g. matrices with half containing NaN -> Cholesky)?

michaelosthege commented 3 years ago

In PyMC3 terminology the free RVs are the input variables to the model. The sampler is concerned with these variables, and while deterministics are also part of the MCMC trace, they are often not interesting to look at in a traceplot.

OriolAbril commented 2 years ago

Related to https://github.com/arviz-devs/arviz/issues/420