danijar / dreamerv3

Mastering Diverse Domains through World Models
https://danijar.com/dreamerv3
MIT License
1.28k stars, 219 forks

Weight Initialization #92

Closed (snykral closed this issue 5 months ago)

snykral commented 12 months ago

Hi, I'm trying to reimplement this architecture in PyTorch.

The uniform initialization looks like Kaiming uniform to me, but I'm not sure. I also don't know which initialization you use for each layer type (e.g., Linear, CNN, GRU).
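For comparing the two schemes mentioned here, a minimal sketch of the sampling bounds may help. It uses the textbook definitions of Kaiming (He) uniform and of generic variance-scaling uniform initialization. It is not taken from the DreamerV3 code, and the function names are hypothetical.

```python
import math


def conv_fans(in_channels, out_channels, kernel_h, kernel_w):
    """Fan-in and fan-out of a 2D convolution kernel (standard convention)."""
    receptive_field = kernel_h * kernel_w
    return in_channels * receptive_field, out_channels * receptive_field


def kaiming_uniform_bound(fan_in, gain=math.sqrt(2.0)):
    """Bound b of U(-b, b) for Kaiming uniform: b = gain * sqrt(3 / fan_in).

    gain = sqrt(2) is the usual choice for ReLU-like activations (assumption).
    """
    return gain * math.sqrt(3.0 / fan_in)


def variance_scaling_uniform_bound(fan_in, fan_out, scale=1.0, mode="fan_avg"):
    """Bound b of U(-b, b) for variance scaling: b = sqrt(3 * scale / fan).

    U(-b, b) has variance b^2 / 3, so the weight variance is scale / fan.
    """
    fans = {
        "fan_in": float(fan_in),
        "fan_out": float(fan_out),
        "fan_avg": (fan_in + fan_out) / 2.0,
    }
    return math.sqrt(3.0 * scale / fans[mode])


if __name__ == "__main__":
    fan_in, fan_out = 512, 256
    print("Kaiming uniform bound:", kaiming_uniform_bound(fan_in))
    print("Glorot-style (fan_avg) bound:",
          variance_scaling_uniform_bound(fan_in, fan_out))
```

Under these definitions, Kaiming uniform with gain sqrt(2) is the same as variance scaling with `scale=2.0` and `mode="fan_in"`, while `scale=1.0` with `mode="fan_avg"` gives the Glorot (Xavier) uniform bound. Two uniform initializers can therefore look alike and still differ in weight scale.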

Could you tell me what you recommend? Depending on the initialization I use, the agent either recovers from bad starts or fails to.

Furthermore, the paper says you use SiLU, but the code also contains the Mish activation function. Did you try it and then abandon it?
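For reference, the two activations differ only in the gating term. The sketch below uses the standard definitions, SiLU(x) = x * sigmoid(x) and Mish(x) = x * tanh(softplus(x)), in plain Python. It says nothing about which one the repository uses by default, and the helper names are hypothetical.

```python
import math


def _sigmoid(x):
    """Numerically stable logistic sigmoid."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def _softplus(x):
    """Numerically stable softplus: log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def silu(x):
    """SiLU (also called swish): x * sigmoid(x)."""
    return x * _sigmoid(x)


def mish(x):
    """Mish: x * tanh(softplus(x))."""
    return x * math.tanh(_softplus(x))


if __name__ == "__main__":
    # Print both activations side by side on a few sample inputs.
    for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
        print(f"x={x:+.1f}  silu={silu(x):+.4f}  mish={mish(x):+.4f}")
```

Both functions are smooth, pass through zero, and approach the identity for large positive inputs, so swapping one for the other changes the shape mainly around the origin.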

danijar commented 5 months ago

Hi, I recommend using the defaults! :)