you can fit a parametric model using SVI and then use the resulting density as a new prior. however, this is generally expected to work pretty poorly, at least if you're looking for high-fidelity posterior approximations à la long runs of MCMC. in other words, you're going through a parametric bottleneck, so it's not clear that running MCMC downstream is really worth it; it certainly can't "rescue" you from any misfit in the parametric approximation. it might be more sensible to just stick with SVI throughout.
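A minimal sketch of that first step in NumPyro, assuming a toy model with a single latent `theta` and made-up data; `SVI`, `AutoDiagonalNormal`, and `get_posterior` are real NumPyro APIs, but the model and data here are purely illustrative:

```python
import jax.random as random
import numpyro
import numpyro.distributions as dist
from numpyro.infer import SVI, Trace_ELBO
from numpyro.infer.autoguide import AutoDiagonalNormal

# toy parametric model: a single latent location parameter
def model(data):
    theta = numpyro.sample("theta", dist.Normal(0.0, 1.0))
    numpyro.sample("obs", dist.Normal(theta, 1.0), obs=data)

# stand-in for the real dataset
data_big = random.normal(random.PRNGKey(42), (1000,)) + 2.0

guide = AutoDiagonalNormal(model)
svi = SVI(model, guide, numpyro.optim.Adam(1e-2), Trace_ELBO())
svi_result = svi.run(random.PRNGKey(0), 2000, data_big)

# the fitted variational density over the (flattened, unconstrained)
# latents -- the "resulting density" that would serve as the new prior
q = guide.get_posterior(svi_result.params)
```

Note that `get_posterior` lives in unconstrained space, so turning it back into a prior over constrained latents takes some extra care.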
We really do want high-quality posteriors, since this is exploratory research.
Are you referring to using the `svi_state` object when calling the `update` function manually?
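For context, a rough sketch of that manual loop, reusing the `svi` object and `data_big` from the sketch above; `init`, `update`, and `get_params` are the actual methods on NumPyro's `SVI`. Note that carrying `svi_state` across datasets only warm-starts the optimizer and variational parameters; it does not by itself turn the old posterior into a new prior:

```python
# initialize optimizer state + variational parameters on the first dataset
svi_state = svi.init(random.PRNGKey(1), data_big)
for _ in range(2000):
    svi_state, loss = svi.update(svi_state, data_big)

# later: keep optimizing the same state on a new batch, without
# re-initializing the variational parameters from scratch
data_small = random.normal(random.PRNGKey(43), (50,)) + 2.0
for _ in range(200):
    svi_state, loss = svi.update(svi_state, data_small)

params = svi.get_params(svi_state)  # constrained parameter values
```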
i'm not referring to any code; i'm referring to algorithms. my point is just that MCMC gets its nice asymptotic guarantees from its non-parametric nature. if there's a parametric bottleneck in there, you lose those asymptotic guarantees unless you do something (like importance sampling) to correct for mismatch in the parametric approximation. of course you can use a flexible estimator like a normalizing flow or whatnot and hope for the best, but it's not clear a priori that all that effort will outperform an approach based purely on variational inference from the get-go.
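A generic sketch of the importance-sampling correction mentioned above, assuming user-supplied log densities `log_p` (target) and `log_q` (parametric approximation); this is illustrative JAX, not a NumPyro API:

```python
import jax
import jax.numpy as jnp
from jax.scipy.special import logsumexp

def snis_estimate(f, samples, log_p, log_q):
    """Self-normalized importance sampling: estimate E_p[f(x)] using
    samples drawn from the approximation q."""
    log_w = jax.vmap(log_p)(samples) - jax.vmap(log_q)(samples)
    w = jnp.exp(log_w - logsumexp(log_w))  # normalized weights
    return jnp.sum(w * jax.vmap(f)(samples))
```

Degenerate (near one-hot) weights are the telltale sign that the parametric fit is too far from the target for this correction to help.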
I think maybe we are using different definitions of "parametric"? I mean that the number of parameters in the model is fixed irrespective of how much data I train on, whereas with a GP the number of parameters grows with the number of rows of data.
@EdwardRaff can you please ask a concrete question on our forum? github issues are intended for bug reports, feature requests, etc.
in particular, i'm unclear whether you're asking for generic algorithmic advice or about particular implementation details. if the latter, you would need to be more specific about which algorithm you want to implement
I have some (larger) initial amount of data `X_big` that I want to train my model with. Afterward, I have many rounds of a smaller amount of new data `X_small_i`. I'd like to sequentially use the posterior from each round as the prior for the next round, updating the model more quickly each time (re-training from scratch isn't viable). Is there a way to do this in NumPyro today? I don't need to use a GP, which would be problematic because of the gram matrix, so the model would be parametric.