danijar / dreamerv3

Mastering Diverse Domains through World Models
https://danijar.com/dreamerv3
MIT License

[Question] How is symlog used in training? #83

Closed: ahan98 closed this issue 5 months ago

ahan98 commented 1 year ago

Hi, I had some conceptual questions about the role of symlog, specifically for training the decoder.

In the code, it looks like SymlogDist.log_prob() computes the (negative) MSE between the prediction and the symlog-transformed target, as described in equation 1 of the paper.

So, unlike the distributions included in torch.distributions, log_prob in this case is really a custom score, rather than the literal log likelihood, right? Then, in equation 5, does that mean the first term of the prediction loss is this symlog MSE, rather than an actual log likelihood?
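
For concreteness, here's roughly how I read that computation (a NumPy sketch; the class and method names are my own paraphrase, not the actual code in the repo):

```python
import numpy as np

def symlog(x):
    # symlog(x) = sign(x) * ln(|x| + 1), squashes large magnitudes symmetrically
    return np.sign(x) * np.log(np.abs(x) + 1.0)

def symexp(x):
    # inverse transform: sign(x) * (exp(|x|) - 1)
    return np.sign(x) * (np.exp(np.abs(x)) - 1.0)

class SymlogMSE:
    """Sketch of a SymlogDist-like wrapper: log_prob is a score, not a true density."""

    def __init__(self, pred):
        self.pred = pred  # raw decoder output, interpreted in symlog space

    def mode(self):
        # map the prediction back to the original observation scale
        return symexp(self.pred)

    def log_prob(self, target):
        # negative squared error against the symlog-transformed target;
        # maximizing this "log_prob" minimizes the symlog MSE (Eq. 1)
        return -np.sum((self.pred - symlog(target)) ** 2, axis=-1)
```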

I'm also confused about what the decoder output means. For example, in DreamerV2, "The image predictor outputs the mean of a diagonal Gaussian likelihood with unit variance" (page 4, in the paragraph on "Distributions").

Unlike in DreamerV2, it seems like the decoder in DreamerV3 is no longer doing density estimation (especially since symlog is only applied to non-image vector observations), but is instead learning to regress symlog(x_t) given h_t, z_t. Am I understanding this correctly?

Thanks

danijar commented 5 months ago

Hi, yes, the image decoder just uses MSE and the vector decoder uses MSE in symlog space. The MSE gradient is equivalent to that of a unit-variance diagonal Gaussian log-prob.
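
For what it's worth, here is a minimal sketch of that equivalence in plain NumPy (the function names are illustrative, not the repo's actual code):

```python
import numpy as np

def symlog(x):
    # symlog squashes large magnitudes: sign(x) * ln(|x| + 1)
    return np.sign(x) * np.log(np.abs(x) + 1.0)

def image_loss(pred, image):
    # image decoder: plain MSE on the raw pixels
    return 0.5 * np.sum((pred - image) ** 2)

def vector_loss(pred, vector):
    # vector decoder: MSE against the symlog-transformed target
    return 0.5 * np.sum((pred - symlog(vector)) ** 2)

def neg_gauss_logprob(pred, target):
    # negative log-density of a unit-variance diagonal Gaussian centered at pred
    d = target.size
    return 0.5 * np.sum((pred - target) ** 2) + 0.5 * d * np.log(2 * np.pi)

# The constant log(2*pi) term has zero gradient w.r.t. pred, so the MSE loss
# and the negative Gaussian log-likelihood produce identical gradients.
pred, target, eps = np.array([1.5, -0.2]), np.array([1.0, 0.3]), 1e-6
num_grad = np.array([
    (neg_gauss_logprob(pred + eps * e, target)
     - neg_gauss_logprob(pred - eps * e, target)) / (2 * eps)
    for e in np.eye(2)
])
assert np.allclose(num_grad, pred - target, atol=1e-4)  # analytic MSE gradient
```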