Open michaelosthege opened 2 years ago
Working on this
My opinion is that this kind of visualization should not be part of ArviZ. Not because I think it's impossible for all cases, but because I think ArviZ doesn't know enough about the model structure to generate this kind of plot generically. I'm open to other opinions of course.
I'm not sure which model structures you'd like to adapt it to?
In my experience `plot_gp_dist` is very widely applicable. All you need are posterior draws `(n_samples, length)` and a vector for the x-axis `(length,)`.
I'm using it all the time and never had to adapt it to particular model structures. After all it's just a helper function to plot one variable, and I wouldn't expect it to swallow `idata`.
The only change I ever made to `plot_gp_dist` was adding dashed lines for HDIs, like in the figures in this notebook.
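For readers who want to try the dashed HDI lines, here is a minimal sketch of how per-position HDI bounds could be computed from a draws matrix of shape `(n_samples, length)`. The helper name `hdi_per_column` and the default of 94% are assumptions made for illustration; this is not the code from the notebook or from PyMC.

```python
import numpy as np


def hdi_per_column(samples, hdi_prob=0.94):
    """Narrowest interval holding roughly `hdi_prob` of the draws, per x position.

    samples: array of shape (n_samples, length).
    Returns (lower, upper), each of shape (length,).
    """
    samples = np.asarray(samples, dtype=float)
    n_samples = samples.shape[0]
    # Distance, in sorted positions, between the two ends of a candidate interval.
    n_included = int(np.floor(hdi_prob * n_samples))
    if not 0 < n_included < n_samples:
        raise ValueError("hdi_prob leaves no room for an interval")
    sorted_draws = np.sort(samples, axis=0)
    # Width of every candidate interval, for every column at once.
    widths = sorted_draws[n_included:] - sorted_draws[: n_samples - n_included]
    start = np.argmin(widths, axis=0)  # index of the narrowest interval per column
    cols = np.arange(samples.shape[1])
    return sorted_draws[start, cols], sorted_draws[start + n_included, cols]


rng = np.random.default_rng(0)
demo_samples = rng.standard_normal((4000, 5))  # made-up draws
lower, upper = hdi_per_column(demo_samples)
```

The two returned vectors could then be drawn on top of the band with something like `ax.plot(x, lower, "--")` and `ax.plot(x, upper, "--")`.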
A different approach, by the way, would be a histogram instead of a band based on percentiles. It could more accurately reflect multimodal time series.
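To illustrate the histogram idea, a rough sketch: bin the draws at each x position on a shared y grid, which gives a 2D array of counts that keeps both modes visible. The helper `column_histograms` and the bimodal test data are made up for this example.

```python
import numpy as np


def column_histograms(samples, n_bins=50):
    """Histogram the draws at each x position on a shared y grid.

    samples: array of shape (n_samples, length).
    Returns (counts, y_edges); counts has shape (n_bins, length).
    """
    samples = np.asarray(samples, dtype=float)
    y_edges = np.linspace(samples.min(), samples.max(), n_bins + 1)
    counts = np.stack(
        [np.histogram(column, bins=y_edges)[0] for column in samples.T],
        axis=1,
    )
    return counts, y_edges


rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
# Made-up bimodal draws: each draw sits near -2 or near +2 along the whole series.
modes = rng.choice([-2.0, 2.0], size=(1000, 1))
samples = modes + 0.3 * rng.standard_normal((1000, x.size))

counts, y_edges = column_histograms(samples, n_bins=40)
y_centers = 0.5 * (y_edges[:-1] + y_edges[1:])
```

A percentile band over these draws would cover the empty region around zero, whereas the counts show two separate ridges. The `counts` array could be displayed with an image-style plot such as Matplotlib's `pcolormesh` over `x` and `y_centers`.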
I think the function is good as it is for the specific case where you have a single group and, as a user, you're willing to pass the values of the predictor (in most cases this would require users to construct a grid). But, what if, for example, you had multiple groups? Would this function work automatically, or, would it require users to loop, slicing things appropriately before calling the function?
I don't have anything against functions of this kind existing. But given the very particular context they apply to, I would leave them in PyMC.
> But, what if, for example, you had multiple groups? Would this function work automatically, or, would it require users to loop, slicing things appropriately before calling the function?

I would see it as the user's responsibility.
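As a sketch of what that responsibility might look like in practice, assuming the draws for several groups are stacked as `(n_samples, n_groups, length)`: the user loops over groups and slices each one down to the `(n_samples, length)` shape before computing or plotting a band. The group names, offsets, and percentile choice are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 50)
group_names = ["a", "b", "c"]  # hypothetical groups

# Made-up posterior draws with shape (n_samples, n_groups, length).
offsets = np.array([0.0, 2.0, 4.0])[None, :, None]
draws = offsets + 0.5 * x[None, None, :] + rng.standard_normal((500, 3, x.size))

bands = {}
for g, name in enumerate(group_names):
    group_draws = draws[:, g, :]  # slice down to (n_samples, length)
    lower, upper = np.percentile(group_draws, [5, 95], axis=0)
    bands[name] = (lower, upper)
    # With PyMC and Matplotlib available, each slice could instead be passed on:
    # pymc.gp.util.plot_gp_dist(ax, group_draws, x)
```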
The motivation behind opening this issue was 98% that it's the only plotting function in PyMC (apart from `model_to_graphviz`) and IMO should have been migrated to ArviZ years ago.
Tell us about it
`pymc.gp.util.plot_gp_dist` is a little-known but very generic plotting function that is useful for GPs, time series, regressions and much more. Not only does it plot a smooth band based on the percentiles, but by default it also plots a few posterior draws as individual lines. Here's an example with a linear model:
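A minimal sketch of the ingredients of such a plot, using made-up draws for a linear model in place of a real posterior. The intercept and slope values, the number of bands, and the number of overlaid draws are all assumptions for illustration; the sketch only shows the idea of nested percentile bands plus a few individual draws and is not the PyMC implementation.

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.linspace(0, 10, 100)  # x-axis vector, shape (length,)

# Stand-in for posterior draws of an intercept and a slope (made-up values).
intercept = rng.normal(1.0, 0.3, size=(2000, 1))
slope = rng.normal(0.5, 0.05, size=(2000, 1))
samples = intercept + slope * x  # shape (n_samples, length)

# Nested percentile bands: each (lower, upper) pair can be filled in a
# graded color to build up the smooth band.
upper_percentiles = np.linspace(51, 99, 10)
bands = [
    (np.percentile(samples, 100 - p, axis=0), np.percentile(samples, p, axis=0))
    for p in upper_percentiles
]

# A few individual draws to overlay as thin lines.
draw_idx = rng.choice(samples.shape[0], size=30, replace=False)
example_draws = samples[draw_idx]  # shape (30, length)
```

With PyMC and Matplotlib installed, the same `samples` and `x` arrays can be handed to the function directly, along the lines of `pymc.gp.util.plot_gp_dist(ax, samples, x)`.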
Thoughts on implementation
Basically copying from PyMC (so we can kick it out of the PyMC codebase).
While we're at it, here's a small wishlist of things to improve:
`arviz.plot_hdi`) for multiple credible interval levels (ETI or HDI). (A.k.a. drawing lines at custom percentiles.)