facebookresearch / modeling_long_term_future

Code for ICLR 2019 paper Learning Dynamics Model by Incorporating the Long Term Future
https://arxiv.org/abs/1903.01599

Final loss terms used in code. #1

Open hksBRT opened 5 years ago

hksBRT commented 5 years ago

Could you break down the final loss function used in the code, comparing it with Eq. 7 in the paper? I can clearly see the Gaussian log-prob for the forward net, the KL divergence term, and the auxiliary reconstruction term. The z-forcing paper also had the Gaussian log-prob for the backward net in its loss formulation, but I don't understand how it appears in Eq. 7. Also, the action decoder loss is missing.

hksBRT commented 5 years ago

Can you release the source code for the car racing task? I am interested in the decoder parts of the model for predicting future observations.

nke001 commented 5 years ago

> Could you break down the final loss function used in the code, comparing it with Eq. 7 in the paper? I can clearly see the Gaussian log-prob for the forward net, the KL divergence term, and the auxiliary reconstruction term. The z-forcing paper also had the Gaussian log-prob for the backward net in its loss formulation, but I don't understand how it appears in Eq. 7. Also, the action decoder loss is missing.

Hi hksBRT, sorry for the late response. The backward pass does not have an action-conditioned decoder; it is an unconditioned decoder, which is the second-to-last term in Eq. 7. In the code this is reflected, for example, here: fwd_nll is the (forward) action decoder loss, bwd_nll is the backward action decoder loss (which has a weight of 0), aux_nll is the auxiliary cost for predicting the backward hidden state, kld is the KL term, bwd_states_nll is the backward state decoder loss, and aux_fwd_l2 is the forward state decoder loss. https://github.com/facebookresearch/modeling_long_term_future/blob/mujoco/rl_zforcing_cheetah.py#L520. Sorry for the confusion, I hope this clears it up.
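For readers mapping these names onto a single objective, here is a minimal sketch of how the terms listed above could be combined; the weight names and default values are illustrative assumptions, not the exact ones in rl_zforcing_cheetah.py:

```python
def combine_losses(fwd_nll, bwd_nll, aux_nll, kld, bwd_states_nll, aux_fwd_l2,
                   bwd_weight=0.0, aux_weight=0.0005, kld_weight=1.0):
    """Sum the per-term losses named above into one training objective.

    Weight names and defaults are hypothetical; see rl_zforcing_cheetah.py
    (around L520) for the exact combination the repo uses.
    """
    return (fwd_nll                     # forward (action) decoder NLL
            + bwd_weight * bwd_nll      # backward action decoder (weight 0)
            + aux_weight * aux_nll      # auxiliary: predict backward hidden state
            + kld_weight * kld          # KL term
            + bwd_states_nll            # backward state decoder NLL
            + aux_fwd_l2)               # forward state decoder (L2)
```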

nke001 commented 5 years ago

> Can you release the source code for the car racing task? I am interested in the decoder parts of the model for predicting future observations.

Sorry for the delay. We are working on releasing this code soon and will give you an update in a few days. Thanks.

apsdehal commented 5 years ago

@hksBRT The car racing code is now available in the carracing branch.

hksBRT commented 5 years ago

Thanks. I was interested in the aux cost part, even after referring back to the z-forcing paper. Here, you seem not to anneal the aux weight: aux_sta = 0.0005, aux_end = 0.0005, aux_step = 1e-6, but the deciding condition is max(aux_wt, aux_end), which is always aux_end.

nke001 commented 5 years ago

> Thanks. I was interested in the aux cost part, even after referring back to the z-forcing paper. Here, you seem not to anneal the aux weight: aux_sta = 0.0005, aux_end = 0.0005, aux_step = 1e-6, but the deciding condition is max(aux_wt, aux_end), which is always aux_end.

Hi, we do not vary the auxiliary weight in most of our experiments, although it is a possibility to try. The auxiliary cost is much more meaningful once you already have a good generative model. So you can imagine wanting the model to focus on producing good one-step-ahead generations at the beginning of training; later, when the generative model already does a good enough job, you can tune aux_loss to a higher weight, which then allows the latent variables to learn meaningful representations.
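A small sketch of the two schedules being discussed; both functions are illustrative reconstructions, not the released code:

```python
def constant_aux_weight(step, aux_sta=0.0005, aux_end=0.0005, aux_step=1e-6):
    # The behaviour hksBRT describes: the weight decays from aux_sta but is
    # floored at aux_end by max(); with aux_sta == aux_end it never changes.
    return max(aux_sta - step * aux_step, aux_end)


def annealed_aux_weight(step, aux_sta=0.0, aux_end=0.0005, aux_step=1e-6):
    # One way to realize the ramp-up described above: start near zero so
    # training focuses on one-step generation, then grow the auxiliary
    # weight linearly until it reaches aux_end. Hypothetical values.
    return min(aux_sta + step * aux_step, aux_end)
```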