danijar / dreamerv3

Mastering Diverse Domains through World Models
https://danijar.com/dreamerv3
MIT License
1.28k stars, 219 forks

Clarification on paper/impl mismatch (encoder)? #68

Closed by Jogima-cyber 1 year ago

Jogima-cyber commented 1 year ago

Hi, sorry to disturb you again. I've been working hard on analyzing your implementation and building a mini-DreamerV3 to get a good grasp of how DreamerV3 works. I have a question regarding the input to the encoder. The paper states the following:

[Image: screenshot from the paper, 2023-06-16 at 17:25:27]

So the encoder should take as input the recurrent hidden state (called "deter" in the implementation) along with the current observation. But in the implementation, if I'm not mistaken, the encoder only takes the current observation as input. Can you confirm this, or have I misunderstood something in the implementation?

danijar commented 1 year ago

That's just a naming mismatch. The encoder in the code only takes in observations, but it then passes its features to the RSSM, where they are combined with the RNN state before the logits for z_t are computed from the result.
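
As a rough illustration of that split, here is a minimal NumPy sketch. It is not the repository's code: the layer sizes, the single-layer weights `W_enc` and `W_post`, and the function names `encoder` and `posterior_logits` are all hypothetical and chosen only to show where the observation features and the recurrent state ("deter") meet.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for this illustration.
OBS_DIM, EMBED_DIM, DETER_DIM = 8, 16, 32
GROUPS, CLASSES = 4, 6  # z_t modeled as GROUPS categoricals with CLASSES classes each

# Hypothetical single-layer weights; real networks would be deeper.
W_enc = rng.normal(size=(OBS_DIM, EMBED_DIM)) * 0.1
W_post = rng.normal(size=(DETER_DIM + EMBED_DIM, GROUPS * CLASSES)) * 0.1


def encoder(obs):
    """Code-level 'encoder': sees only the observation."""
    return np.tanh(obs @ W_enc)


def posterior_logits(deter, embed):
    """Inside the RSSM: combine the recurrent state with the encoder
    features, then compute the logits for z_t."""
    x = np.concatenate([deter, embed], axis=-1)
    return (x @ W_post).reshape(*x.shape[:-1], GROUPS, CLASSES)


obs = rng.normal(size=(OBS_DIM,))
deter = rng.normal(size=(DETER_DIM,))  # recurrent state, "deter"
logits = posterior_logits(deter, encoder(obs))
print(logits.shape)  # (4, 6)
```

Taken together, `encoder` followed by `posterior_logits` depends on both the recurrent state and the observation, which is what the paper calls the encoder.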

Jogima-cyber commented 1 year ago

Oh, my bad, sorry. I should have figured that out on my own; I just got confused. Thanks for the answer.

danijar commented 1 year ago

No worries at all, thanks for reaching out.