Hello!
- The future latent state prediction can either use the ground truth action or the predicted action from the policy (when `self.active_inference = True`). In practice, both approaches lead to roughly the same performance (see the first sketch after this list).
- and 3. We found that joint training of world modelling (understanding of the world) and policy learning (predicting which action to take given the current state) was beneficial. In particular, the inferred state generalised better to unseen towns/weathers.
- What we call "reset state" means we always recompute the state using the full history of image context [o_1, ..., o_t] (so it is computationally more expensive; see the second sketch below).
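
To make the first bullet (and the joint-training point) concrete, here is a minimal, illustrative sketch rather than the actual MILE code: a toy recurrent transition model whose prior over the next latent state can be driven either by the ground truth action or by the action predicted from the same state, switched by an `active_inference` flag. All class, attribute, and method names below are hypothetical.

```python
import torch
import torch.nn as nn


class TransitionSketch(nn.Module):
    """Toy stand-in for a recurrent transition model (names are illustrative)."""

    def __init__(self, state_dim: int, action_dim: int):
        super().__init__()
        self.cell = nn.GRUCell(action_dim, state_dim)           # deterministic recurrence
        self.prior_head = nn.Linear(state_dim, 2 * state_dim)   # mean / log-std of the next latent state
        self.policy_head = nn.Linear(state_dim, action_dim)     # action predicted from the same state
        self.active_inference = True  # True: roll out with predicted actions; False: use ground truth

    def rollout(self, state: torch.Tensor, gt_actions: torch.Tensor):
        """state: (B, state_dim); gt_actions: (T, B, action_dim)."""
        priors, pred_actions = [], []
        for t in range(gt_actions.shape[0]):
            pred_action = self.policy_head(state)
            pred_actions.append(pred_action)                    # supervised against gt_actions (policy learning)
            action = pred_action if self.active_inference else gt_actions[t]
            state = self.cell(action, state)
            mean, log_std = self.prior_head(state).chunk(2, dim=-1)
            priors.append((mean, log_std))                      # matched to a posterior (world modelling)
        return priors, pred_actions
```

Because the policy head and the prior head read the same recurrent state, the world-modelling and policy-learning losses can be summed and optimised together, which is what "joint training" refers to above.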
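
And a sketch of the "reset state" evaluation from the last bullet, again with made-up helper names (`initial_state`, `update_state`): instead of carrying the recurrent state forward, the state at time t is recomputed from the whole observation prefix [o_1, ..., o_t], which is why it costs more compute.

```python
def reset_state_rollout(model, observations):
    """Recompute the state from scratch at every step; observations = [o_1, ..., o_T].

    `model.initial_state()` and `model.update_state(state, obs)` are hypothetical
    helpers standing in for the image encoder plus recurrent update.
    """
    states = []
    for t in range(1, len(observations) + 1):
        state = model.initial_state()
        for obs in observations[:t]:           # re-encode the full prefix o_1, ..., o_t
            state = model.update_state(state, obs)
        states.append(state)                   # O(t) work per step instead of O(1)
    return states
```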
Thanks for your response!
No problem!
Thank you for this excellent work. However, something confuses me: in `models/transition.py`, it seems like MILE still uses the ground truth action to compute the distribution of the latent state. Did I miss some critical parts?