GRU has non-standard def?

danijar / dreamerv3

Mastering Diverse Domains through World Models

https://danijar.com/dreamerv3

MIT License

Closed by Jogima-cyber 1 year ago

Jogima-cyber commented 1 year ago

Hi, I was wondering if there was a reason for using different formulas for the definition of the GRU? If I'm not mistaken, the standard formulas:

danijar commented 1 year ago

Hi, I think the version I'm using was designed by Nvidia for cuDNN. The motivation is that you compute all relevant quantities in a single matmul.
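To illustrate the "single matmul" point, here is a minimal pure-Python sketch of a fused GRU step. It is not copied from the repository: the function name `fused_gru_cell`, the weight layout, and the gate ordering (reset, candidate, update) are assumptions made for illustration. The idea it shows is that the input and previous state are concatenated and multiplied by one weight matrix to produce all three gate pre-activations at once. The reset gate is then applied to the candidate pre-activation after the multiply, whereas the textbook GRU applies it to the hidden state before a second multiply.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def fused_gru_cell(x, h, W, b):
    """One GRU step using a single matrix-vector product (illustrative sketch).

    x: input vector of length n_in.
    h: previous hidden state of length n_h.
    W: weights with n_in + n_h rows and 3 * n_h columns (assumed layout).
    b: bias vector of length 3 * n_h.
    """
    n_h = len(h)
    xh = list(x) + list(h)  # concatenate input and previous state

    # The only matmul: all three gate pre-activations are computed at once.
    parts = [
        sum(xh[i] * W[i][j] for i in range(len(xh))) + b[j]
        for j in range(3 * n_h)
    ]
    # Assumed column order: reset, candidate, update.
    reset_pre = parts[:n_h]
    cand_pre = parts[n_h:2 * n_h]
    update_pre = parts[2 * n_h:]

    new_h = []
    for k in range(n_h):
        reset = sigmoid(reset_pre[k])
        # Reset gate scales the candidate pre-activation after the matmul.
        cand = math.tanh(reset * cand_pre[k])
        update = sigmoid(update_pre[k])
        new_h.append(update * cand + (1.0 - update) * h[k])
    return new_h


# With all-zero weights both gates equal 0.5 and the candidate is 0,
# so the new state is half of the previous state.
n_in, n_h = 3, 2
W = [[0.0] * (3 * n_h) for _ in range(n_in + n_h)]
b = [0.0] * (3 * n_h)
print(fused_gru_cell([1.0, 2.0, 3.0], [1.0, -2.0], W, b))  # [0.5, -1.0]
```

In the textbook formulation the candidate needs a second, dependent multiply of the recurrent weights with the reset-scaled state. Moving the reset gate to after the multiply removes that dependency, which is what lets everything fit in one product.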

Jogima-cyber commented 1 year ago

Thanks for the answer!