Iterative Training of V?

Closed ppkn closed 1 year ago

ppkn commented 6 years ago

I like the discussion of an iterative training procedure with M and C, but is it possible to train the VAE during this iterative process as well? Actively exploring the environment is likely to improve the visual part of the world model as the agent discovers new areas.

I'm guessing some thought was put into this that just didn't make it into the final draft of the paper.

hardmaru commented 6 years ago

Hi @dpipkin

Thanks for the message. Not all environments have pixel observations, so in general V may not be needed. When we wrote the section on iterative training, M was meant to be more general and to include V as well. The notion that M includes V originated from the Learning to Think paper, but when I was writing this article and designing the experiments, I thought it was clearer to separate V explicitly. Apologies for the confusion!
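
To make the idea in this thread concrete, here is a rough sketch of an iterative loop in which V is retrained alongside M and C on each round of fresh rollouts. It is only an illustration of the control flow, not code from the paper or the repository: every function name, the dictionary-based "models", and the fixed iteration count (standing in for a "task not yet solved" check) are hypothetical placeholders.

```python
import random


def collect_rollouts(controller, n_rollouts, steps=5, seed=0):
    """Placeholder for running the current controller in the real environment."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_rollouts):
        episode = []
        for _ in range(steps):
            obs = [rng.random() for _ in range(4)]  # stand-in for a pixel frame
            action = controller["policy"](obs)
            episode.append((obs, action))
        data.append(episode)
    return data


def train_v(v, data):
    """Placeholder: refit the VAE (V) on newly observed frames."""
    v["updates"] += 1
    v["frames_seen"] += sum(len(episode) for episode in data)
    return v


def train_m(m, v, data):
    """Placeholder: refit the memory model (M) on latents encoded by the current V."""
    m["updates"] += 1
    return m


def train_c(c, v, m):
    """Placeholder: improve the controller (C) using the current V and M."""
    c["updates"] += 1
    return c


def iterative_training(n_iterations, n_rollouts=3):
    v = {"updates": 0, "frames_seen": 0}
    m = {"updates": 0}
    c = {"updates": 0, "policy": lambda obs: 0.0}  # trivial initial policy
    for i in range(n_iterations):
        data = collect_rollouts(c, n_rollouts, seed=i)  # explore with current C
        v = train_v(v, data)     # V sees frames from newly explored areas
        m = train_m(m, v, data)  # M is trained on top of the updated V
        c = train_c(c, v, m)     # C is optimized with the updated world model
    return v, m, c


v, m, c = iterative_training(n_iterations=3)
print(v["updates"], m["updates"], c["updates"], v["frames_seen"])
```

The ordering inside the loop is one design choice among several: V is updated before M so that M is always trained on latents from the most recent encoder.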