I also get the exact same behaviour if I extract the posterior distribution of the observed site directly:

obs_marginal = posterior.marginal(sites=["train_obs"])
# Concatenate the per-site sample tensors into a single tensor of posterior samples.
obs_marginal = torch.cat(list(obs_marginal.support(flatten=True).values()), dim=-1)
obs_marginal = obs_marginal.numpy()
The posterior distribution of an observed variable comes out essentially equal to the training values of that variable plus observation noise, rather than what the model actually predicts.
@FedericoV It seems to me that the line train_pred = posterior_forecaster.run(X_train, X_test, y_train) should be train_pred = posterior_forecaster.run(X_train, X_test, y_train=None). By setting y_train=None, we tell the model that train_obs is not an observed node.
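Concretely, a minimal before/after sketch with the variable names from your snippet:

# Before: "train_obs" is pinned to y_train, so the "prediction" is just the
# training data plus observation noise.
train_pred = posterior_forecaster.run(X_train, X_test, y_train)

# After: with y_train=None the site is latent, so TracePredictive resamples
# it from the posterior predictive distribution.
train_pred = posterior_forecaster.run(X_train, X_test, y_train=None)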
Aha, that worked perfectly, thank you. Now, I need to investigate why NUTS seems to work much worse than ADVI for this problem...
Issue Description
Extracting the parameters from an MCMC trace and then manually running them through the model gives a different answer than extracting the distribution of the observed variables directly.
Environment
Code Snippet
I have a model with a multivariate likelihood that looks like this:
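(The original snippet was not preserved in this thread; the following is a minimal sketch of a model with this shape, written against the pre-1.0 Pyro API used elsewhere in the thread. The function name forecaster, the priors, and the test_obs site are illustrative assumptions, not the actual code.)

import torch
import pyro
import pyro.distributions as dist

def forecaster(X_train, X_test, y_train=None):
    # Hypothetical priors -- the original parameterisation is not shown.
    weight = pyro.sample("weight",
                         dist.Normal(torch.zeros(X_train.shape[1]), 1.).to_event(1))
    noise = pyro.sample("noise", dist.HalfCauchy(torch.tensor(1.)))
    # Multivariate likelihood over the training outputs. When y_train is None,
    # "train_obs" is a latent site and gets resampled; otherwise it is pinned
    # to the observed data.
    cov = noise.pow(2) * torch.eye(X_train.shape[0])
    pyro.sample("train_obs", dist.MultivariateNormal(X_train @ weight, cov),
                obs=y_train)
    # Predictive site for the test inputs.
    test_cov = noise.pow(2) * torch.eye(X_test.shape[0])
    return pyro.sample("test_obs",
                       dist.MultivariateNormal(X_test @ weight, test_cov))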
If I try to sample from the posterior of the model with MCMC like this:
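(Again a sketch rather than the original code; the sampler settings are placeholders, and posterior_forecaster / train_pred match the names used above.)

from pyro.infer.mcmc import MCMC, NUTS
from pyro.infer.abstract_infer import TracePredictive

nuts_kernel = NUTS(forecaster)
# In the pre-1.0 API, .run(...) returns the fitted posterior object itself.
posterior = MCMC(nuts_kernel, num_samples=500, warmup_steps=300).run(
    X_train, X_test, y_train)

# Replay the posterior traces through the model to get predictions for the
# training inputs.
posterior_forecaster = TracePredictive(forecaster, posterior, num_samples=500)
train_pred = posterior_forecaster.run(X_train, X_test, y_train)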
and plot the results, I get a completely different answer than if I do this instead:
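(Sketch, assuming the illustrative model above: the "manual" route pulls the latent parameters out of the MCMC trace and computes the predictive mean by hand.)

# Posterior samples of the latent parameters, one row per MCMC sample.
params = posterior.marginal(sites=["weight", "noise"]).support(flatten=True)

# Push each parameter sample through the model's mean function manually.
# Shape: (num_samples, n_train) -- one predicted curve per posterior sample.
manual_pred = params["weight"] @ X_train.t()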
The former shows a near-perfect fit; the latter shows a very poor fit. I simplified the example somewhat, but hopefully the behaviour is clear.