danijar / dreamerv3

Mastering Diverse Domains through World Models
https://danijar.com/dreamerv3
MIT License

GRU has non-standard def? #66

Closed: Jogima-cyber closed this issue 1 year ago

Jogima-cyber commented 1 year ago

Hi, I was wondering whether there is a reason the GRU here is defined with different formulas than usual. If I'm not mistaken, the standard formulas:

[Screenshot of the standard GRU formulas, 2023-06-13 at 14:33:07]

and the ones used in this implementation: https://github.com/danijar/dreamerv3/blob/423291a9875bb9af43b6db7150aaa972ba889266/dreamerv3/nets.py#L131-L140 are different.

danijar commented 1 year ago

Hi, I think the version I'm using was designed by NVIDIA for cuDNN. The motivation is that all the relevant quantities (the gate and candidate pre-activations) can be computed in a single matmul.
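To make the single-matmul point concrete, here is a minimal NumPy sketch, not the repository's code. The function names, weight layout, and gate ordering are assumptions made for illustration, and the exact gate arrangement should be checked against the linked `nets.py` lines. In the textbook-style step, the reset gate scales `h` before the candidate's recurrent matmul, so that matmul has to wait for the gate. In the fused variant, one matmul over the concatenated `[h, x]` yields all three pre-activations and the reset gate is applied afterwards, elementwise.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def gru_textbook(x, h, w_x, w_h, b):
    """Textbook-style GRU step (illustrative weight layout).

    The reset gate scales h *before* the candidate's recurrent matmul,
    so that matmul cannot be issued until the reset gate is known.
    """
    w_xr, w_xz, w_xn = np.split(w_x, 3, axis=-1)
    w_hr, w_hz, w_hn = np.split(w_h, 3, axis=-1)
    b_r, b_z, b_n = np.split(b, 3, axis=-1)
    r = sigmoid(x @ w_xr + h @ w_hr + b_r)        # reset gate
    z = sigmoid(x @ w_xz + h @ w_hz + b_z)        # update gate
    n = np.tanh(x @ w_xn + (r * h) @ w_hn + b_n)  # candidate, depends on r
    return (1.0 - z) * n + z * h


def gru_single_matmul(x, h, w, b):
    """Fused variant: one matmul produces all three pre-activations.

    The reset gate is applied elementwise after the matmul. This is a
    sketch of the idea only; the gate ordering here is an assumption.
    """
    xh = np.concatenate([h, x], axis=-1)
    pre = xh @ w + b                              # the single matmul
    reset, cand, update = np.split(pre, 3, axis=-1)
    reset = sigmoid(reset)
    cand = np.tanh(reset * cand)                  # reset applied post-matmul
    update = sigmoid(update)
    return update * cand + (1.0 - update) * h


# Small usage example with made-up sizes.
rng = np.random.default_rng(0)
batch, n_in, n_hid = 2, 3, 4
x = rng.normal(size=(batch, n_in))
h = np.tanh(rng.normal(size=(batch, n_hid)))
w = rng.normal(size=(n_hid + n_in, 3 * n_hid)) * 0.1
b = np.zeros(3 * n_hid)
h_next = gru_single_matmul(x, h, w, b)
print(h_next.shape)  # (2, 4)
```

The two functions are not numerically equivalent for the same weights. They are different parameterizations of a gated update, which matches the observation in the question that the formulas differ.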

Jogima-cyber commented 1 year ago

Thanks for the answer!