ctallec / world-models

Reimplementation of World Models (Ha and Schmidhuber, 2018) in PyTorch
MIT License

The definition of the GMM linear layer may be wrong? Or have I missed something? #33

Open Reinelieben opened 4 years ago

Reinelieben commented 4 years ago

Hi ctallec,

In the file mdrnn.py, I noticed that the gmm_linear layer has too few output units. Why is the output size defined as (2 * latents + 1) * gaussians + 2? Shouldn't it be 3 * latents * gaussians + 2 (I have also seen this definition in other implementations of the MDN-RNN)? In your definition, you seem to share the pis across all Gaussian components, which is not valid under my understanding of a GMM. My understanding is that each element of the latent vector has its own GMM; that is, with 3 Gaussian components, for each z_i we have 3 mus, 3 sigmas and 3 pis. Or have I misunderstood GMMs?

Best,
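For reference, the two conventions differ only in the size of the final linear layer. A quick sketch of the arithmetic, assuming the names `latents` and `gaussians` follow mdrnn.py, with made-up values, and assuming the `+ 2` covers the reward and terminal predictions:

```python
# Illustrative sizes; `latents` and `gaussians` follow the names used in
# mdrnn.py, but the concrete values here are made up.
latents, gaussians = 32, 5

# This repository's gmm_linear: per-dimension mus and sigmas, a single
# set of mixture weights (pis) shared by all latent dimensions, plus
# 2 extra outputs (presumably reward and terminal):
shared_pi_size = (2 * latents + 1) * gaussians + 2

# Per-dimension mixtures: every z_i gets its own mus, sigmas AND pis:
per_dim_size = 3 * latents * gaussians + 2

print(shared_pi_size, per_dim_size)
```

Either value would then be passed as the output size of the linear layer, e.g. `nn.Linear(hiddens, shared_pi_size)` in PyTorch.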

Reinelieben commented 4 years ago

You could check this implementation for more information: https://github.com/zmonoid/WorldModels/blob/master/model.py (line 62).

AlexGonRo commented 4 years ago

I am a bit rusty with this library but, if I remember correctly, you are right.

This library uses an output in which each mixture component has its own μ and σ for every element of z_{t+1}, but the mixture weights π are shared across elements. I don't know about other implementations, but the original World Models uses the one you propose.

I don't think this approach is wrong, it is just a more restrictive one. If you use it with the proposed environments (CarRacing and ViZDoom: Take Cover), you won't be able to see the difference.

Reinelieben commented 4 years ago

I am a bit rusty with this library but, if I remember correctly, you are right.

This library uses an output such that each Gaussian mixture has a defined μ and σ for each element of the array z_{t+1}. I don't know about other implementations, but the original World Models uses the one you propose.

I don't think this approach is wrong, it is just a more restrictive one. If you use it with the proposed environments (carRacing and ViZDoom: Take Cover) you won't be able to see the difference.

Hi Alex,

Thank you so much for the very quick response! I found the issue while training the RNN: the GMM loss always stayed relatively high, around 0.95, but after I changed the definition and the loss function accordingly, the loss dropped to -0.005, which means the likelihood is near 1. Do you have any experience with what the loss value should be? And could you give me some suggestions for sampling the latent variable from the distribution according to the "temperature"?

Best regards

AlexGonRo commented 4 years ago

I found the issue while training the RNN: the GMM loss always stayed relatively high, around 0.95, but after I changed the definition and the loss function accordingly, the loss dropped to -0.005, which means the likelihood is near 1. Do you have any experience with what the loss value should be?

I have the feeling I'm missing too much information to give a reliable answer to your question (which environment you are using, the size of the latent space, the algorithm for the vision module, etc.). I'll try to give you some general advice:

And could you give me some suggestions for sampling the latent variable from the distribution according to the "temperature"?

Well, this is a completely different question. From my experience, there are a couple of things you need to know:

  1. Using τ<1 never helped. The controller just learnt to trick the simulation more consistently.

  2. Training offline (in a dream) was very, VERY tricky. Models trained with the same configuration and architecture could yield very different results, and this affected the temperature value too. From my experience, values in the range 1.1 < τ < 1.2 worked best.