danijar / dreamerv3

Mastering Diverse Domains through World Models
https://danijar.com/dreamerv3
MIT License

[Question] Question about the implementation of the Critic in "Mastering Diverse Domains through World Models" #88

Closed by ExuberantWitness 1 year ago

ExuberantWitness commented 1 year ago

Dear Project Maintainer,

I've been studying your implementation of the paper "Mastering Diverse Domains through World Models". While reading the code, I noticed that the critic does not seem to appear as a distinct component. In the official implementation, however, the critic is a crucial component.

This has left me somewhat puzzled, as it might imply that there are some disparities between your implementation and the official one. Could you please clarify if this was an intentional design decision, or if I might have missed the relevant part of the implementation?

If it's the former, would you kindly elaborate on the rationale behind this decision? If it's the latter, could you point me to where I could find the implementation of the critic?

I greatly appreciate your assistance and look forward to your response.

schneimo commented 1 year ago

Hi, I am not the author of the code; that is @danijar.

Let me ask a question in return: what do you mean by "official implementation"? If I am correct, there is no official implementation from DeepMind. This implementation is most likely the closest thing to an official one, since it comes from @danijar (the main author of the paper).

Unless I am missing something, this should be the implementation of the critic as described in the paper (until the end of the file).

danijar commented 1 year ago

Hi, the critic is called VFunction in the code.