hksBRT opened this issue 5 years ago
Can you release the source code for the car racing task? I am interested in the decoder parts of the model to predict future observations.
Could you break down the final loss function used in the code, comparing it with Eq. 7 in the paper? I can clearly see the Gaussian log-probability for the forward net, the KL divergence term, and the auxiliary reconstruction term. The z-forcing paper also had the Gaussian log-probability for the backward net in its loss formulation, but I don't see it in Eq. 7. Also, the action decoder loss seems to be missing.
Hi hksBRT, sorry for the late response. The backward pass does not have an action-conditioned decoder; it is an unconditioned decoder, which is the second-to-last term in Eq. 7. In the code this is reflected, for example, here: fwd_nll is the action decoder, bwd_nll is the backward action decoder (which has weight 0), aux_nll is the auxiliary cost for predicting the backward hidden state, kld is the KL term, bwd_states_nll is the backward state decoder, and aux_fwd_l2 is the forward state decoder. https://github.com/facebookresearch/modeling_long_term_future/blob/mujoco/rl_zforcing_cheetah.py#L520 . Sorry for the confusion, I hope this clears it up.
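The six terms listed above can be sketched as a single weighted objective. This is a minimal illustration, not the repo's actual code: the function name and the weight arguments (bwd_weight, aux_weight, kl_weight) are hypothetical, with bwd_weight defaulting to 0 to match the comment that the backward action decoder carries weight 0.

```python
def total_loss(fwd_nll, bwd_nll, aux_nll, kld, bwd_states_nll, aux_fwd_l2,
               bwd_weight=0.0, aux_weight=0.0005, kl_weight=1.0):
    """Hypothetical sketch of combining the loss terms described above.

    fwd_nll        -- forward (action) decoder NLL
    bwd_nll        -- backward action decoder NLL (weight 0, i.e. unused)
    aux_nll        -- auxiliary cost for predicting the backward hidden state
    kld            -- KL divergence term
    bwd_states_nll -- backward state decoder NLL
    aux_fwd_l2     -- forward state decoder L2 reconstruction
    """
    return (fwd_nll
            + bwd_weight * bwd_nll
            + aux_weight * aux_nll
            + kl_weight * kld
            + bwd_states_nll
            + aux_fwd_l2)
```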
Sorry for the delay. We are working on releasing this code soon. We will give you an update in a few days. Thanks.
@hksBRT Carracing code is now available in the carracing branch.
Thanks. I was interested in the aux cost part, even after referring back to the z-forcing paper. Here, you seem not to anneal the aux weight: aux_sta = 0.0005, aux_end = 0.0005, aux_step = 1e-6, but the deciding condition is max(aux_wt, aux_end), which always evaluates to aux_end.
Hi, we do not vary the auxiliary weight in most of our experiments, although that is a possibility to try. The auxiliary cost is much more meaningful once you already have a good generative model. So you can imagine wanting the model to focus more on producing good 1-step-ahead generations at the beginning of training; later, when the generative model already does a good enough job, you can tune the aux loss to have a higher weight, which then allows the latent variables to learn meaningful representations.
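The schedule described above (keep the aux weight small early, raise it later) could be sketched as a linear warm-up. This is a hypothetical illustration of the idea, not the repo's schedule; the function name and the defaults (aux_start=0.0, aux_end=0.0005, aux_step=1e-6) are made up for the example.

```python
def aux_weight(step, aux_start=0.0, aux_end=0.0005, aux_step=1e-6):
    # Linearly grow the auxiliary weight from aux_start toward aux_end,
    # so early training focuses on the generative model and the auxiliary
    # cost only contributes strongly once training has progressed.
    return min(aux_start + step * aux_step, aux_end)
```

Note that with aux_start == aux_end, any such schedule degenerates to a constant, which is consistent with the settings quoted in the question above.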