Open lucianopaz opened 3 years ago
I think the ideal situation here would be to implement https://arxiv.org/abs/2103.10522 to compare several (or all) of the posterior predictive distributions to the observed one. The hdi of kde lines looks nice, but I don't think there is any guarantee that kde lines of the same distribution will lie completely inside the shaded region with 95% probability.
We can definitely add the option for the kde hdi shaded area but we have to be careful in how we document it.
I agree that the paper's method is a great posterior predictive check in itself. My proposal was much more mundane.
Regarding the problem about the kde lines not being in the region of 95% probability, I think that it should happen only 5% of the time. What I thought could be an issue is that the hdi region could be artificially large due to multimodality, for example consider this graph:
The hdi bounds will cover an interval where the posterior predictive samples will never be seen:
Regarding the problem about the kde lines not being in the region of 95% probability, I think that it should happen only 5% of the time.
I would bet against this actually, we are calculating the hdi on a per gridpoint basis, so while we can say tht 5% of the lines will be outside the hdi region at a given point, I don't think we can say anything about whole lines being outside the hdi region at any point of the grid. I think that for us to say that the shaded area is the 95% hdi region, it should be interpreted that kdes will be completely inside the shaded area with 95% probability, I don't know how the gridpoint hdi probability can be translated to probabilities on whole kde lines but I am quite sure those should be different. I'll try to run some example on that.
It looks like the coverage on whole kde lines is indeed far from the "nominal" hdi when generating areas with this method.
import arviz as az
import numpy as np
import matplotlib.pyplot as plt
rng = np.random.default_rng()
u = rng.uniform(0, 1, size=(1, 1000, 200))
grid, _ = az.kde(u[0,0,:], custom_lims=(0, 1)) # setting custom lims to make sure all kdes use the same grid
kdes = np.array([az.kde(u[0, i, :], custom_lims=(0, 1))[1] for i in range(1000)])
# check things look ok
plt.plot(grid, kdes.T, color="C0")
plt.show()
hdi = az.hdi(kdes[None, :, :])
np.mean([np.all(kdes[i, :] > hdi[:, 0]) & np.all(kdes[i, :] < hdi[:, 1]) for i in range(1000)])
# Output
# 0.534
It looks like the coverage on whole kde lines is indeed far from the "nominal" hdi when generating areas with this method.
import arviz as az import numpy as np import matplotlib.pyplot as plt rng = np.random.default_rng() u = rng.uniform(0, 1, size=(1, 1000, 200)) grid, _ = az.kde(u[0,0,:], custom_lims=(0, 1)) # setting custom lims to make sure all kdes use the same grid kdes = np.array([az.kde(u[0, i, :], custom_lims=(0, 1))[1] for i in range(1000)]) # check things look ok plt.plot(grid, kdes.T, color="C0") plt.show() hdi = az.hdi(kdes[None, :, :]) np.mean([np.all(kdes[i, :] > hdi[:, 0]) & np.all(kdes[i, :] < hdi[:, 1]) for i in range(1000)]) # Output # 0.534
I think this code is testing whether the whole graph falls inside the HDI, which isn't an especially useful diagnostic (if the KDE is mostly inside the HDI, it's probably good and the bits of it sticking out are probably random chance). My intuitive understanding of what an HDI for a PPC looks like is that if I have a 94% HDI, 94% of my KDE should fall inside the HDI on average if my model is solid. This seems much easier to interpret than the mess of lines that gets returned at the moment.
Yes, I am testing that the whole kde line is inside the hdi shaded area, which is what I believe should be considered and the only clear and interpretable diagnostic. Again, in my opinion the ideal solution to this is using https://arxiv.org/abs/2103.10522 though, not spaghetti plots nor the hdi proposed here (nor variations on that to try and fix the hdi region to account for whole lines).
I do believe it could be useful to add this, but we have to be careful on that because it is no clear at all what exactly does the hdi shaded area represent nor how to interpret lines going outside that region.
My intuitive understanding of what an HDI for a PPC looks like is that if I have a 94% HDI, 94% of my KDE should fall inside the HDI on average if my model is solid.
I don't think that's true either and I am also quite sure about this. kdes are continuous lines, so the probability of the 2nd point being outside the region given the 1st one was outside is different that if the 1st one was inside, they are not independent values. Yet, we are calculating the hdi as if they were. Moreover, even if the hdi had this interpretation you mention, I don't think having users estimate by themselves (and visually) if 95% or 90% of the kde line is outside the hdi region is a good idea.
As a side note on spaghetti plots, interpreting with an animation could be useful. Imagine the following situation. You start the animation with an spaghetti plot with 100 kde lines from the posterior predictive, then 10 more lines are added to the plot one by one, 9 come from the posterior predictive too and one is the one corresponding to the observations. You have to try and guess which is the kde of the observed data. Once the 10 lines have been added, the plot is updated to highlight the observed kde. If you guessed which one it is (and if in general anyone can guess) then your model is not reproducing the generative process correctly, if generally it's not possible to know which is which your model is probably ok.
Tell us about it
plot_ppc
is very nice, you can control thekind
of plot and also the number of posterior predictive lines to draw withnum_pp_samples
. I would like to ask if it would be possible to add an option to plot credible intervals instead of single draws from the posterior predictive. Consider the following example:The resulting plot is something like this
All of the individual lines from the posterior predictive samples are quite hard to read, and it's hard to make sense of how likely it is to find a sample in a given interval.
I would like to plot something like this:
where the filled in area is the posterior predictive's HDI at a certain level.