Closed: wcarvalho closed this issue 6 years ago
Hi Wilka,
Thanks for reaching out! Sorry the repo isn't totally clear. I'm in the process of cleaning up the code and improving the documentation with some examples. I should have those up by the end of the week.
I ran a set of experiments with additional inference iterations and step samples, but in practice, using 1 for each of these tends to give decent performance. The overall training procedure is the same for all models and all inference techniques (see /util/train_val.py).

For each batch, we run through all of the steps in the sequence. At each step, we perform inference using the inference model. For amortized variational filtering, this inference model is an iterative inference model that takes in the current approximate posterior gradient (and potentially, the data). For the baseline methods, this is a direct inference model that may use an RNN or some other scheme. The inference model can be updated while running on the sequence or after. The generative model is updated after running on the sequence.
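A toy sketch of that per-step loop (my own illustration, not code from the repo): the latent variable model is one-dimensional Gaussian with fixed unit variances, so the free energy and its gradient are analytic and step_samples isn't needed, and the learned iterative inference network is replaced by plain gradient descent on the free energy. The function names and the learning rate are hypothetical.

```python
import numpy as np

def free_energy(x, mu):
    # negative ELBO for posterior N(mu, 1), prior N(0, 1), likelihood N(z, 1),
    # up to additive constants (all variances fixed at 1)
    return 0.5 * (x - mu) ** 2 + 0.5 * mu ** 2

def grad_mu(x, mu):
    # d(free_energy)/d(mu); the quantity an iterative inference model would consume
    return (mu - x) + mu

def avf_inference(sequence, inference_iterations=1, lr=0.4):
    """One pass over a sequence: at each step, refine the approximate
    posterior mean with gradient-based iterative inference, starting
    from the prior mean. Returns the per-step posterior means."""
    mus = []
    mu = 0.0  # initialize at the prior mean
    for x_t in sequence:
        for _ in range(inference_iterations):
            # in AVF proper, a learned network maps this gradient to the update
            mu = mu - lr * grad_mu(x_t, mu)
        mus.append(mu)
    return mus

# even with inference_iterations=1, repeated steps approach the
# analytic optimum mu* = x / 2 when the observation is constant
mus = avf_inference([1.0, 1.0, 1.0])
```

Even a single inference iteration per step works here because each step's posterior is initialized from the previous one, which is the intuition behind the filtering setup.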
The only convolutional model that I used in the experiments was SVG, but as explained in the paper, the original model was trained with a modified objective, so the results for the model trained with the variational objective are not the same as in the original paper. This is just to say that, other than SVG, I don't currently have a working, out-of-the-box convolutional dynamical latent variable model with AVF.
After I push some updated documentation, let me know if you have any questions or need any help setting things up.
Joe
Hey Joe,
I'm looking through your code, trying to understand how to run the AVF algorithm with a convolutional model, but I'm a bit confused about how to set that up. It seems that, by default, you run a VRNN?
I notice that in train_config, inference_iterations=1 and step_samples=1. Do you use 1 for these in practice? Reading your paper, I'm having a hard time figuring out whether you run through all of the data up to the current time step for each inference step. My suspicion is that the answer is yes in the general variational filtering case but no in the "amortized variational filtering" case, where you only use the current time step?
By the way, really neat paper.
Cheers