labmlai / annotated_deep_learning_paper_implementations

🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
https://nn.labml.ai
MIT License
53.62k stars, 5.54k forks

gae formula bug #255

Status: Closed (closed by kangnil 2 months ago)

kangnil commented 3 months ago

https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/999f2036a5a7c54403352211b5d1cc0df42b83f6/labml_nn/rl/ppo/gae.py#L36

This should be `gamma^2 r_{t+2}` instead of `gamma^2 r_{t+1}`.
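
For reference, a sketch of the k-step advantage estimates in the usual GAE notation, where the reward index advances together with the power of gamma. The symbols $r_t$, $V(s_t)$ and $\hat{A}_t^{(k)}$ follow the standard presentation and are assumed here; the exact wording of the linked docstring is not reproduced.

$$
\begin{aligned}
\hat{A}_t^{(1)} &= r_t + \gamma V(s_{t+1}) - V(s_t) \\
\hat{A}_t^{(2)} &= r_t + \gamma r_{t+1} + \gamma^2 V(s_{t+2}) - V(s_t) \\
\hat{A}_t^{(3)} &= r_t + \gamma r_{t+1} + \gamma^2 r_{t+2} + \gamma^3 V(s_{t+3}) - V(s_t) \\
\hat{A}_t^{(\infty)} &= r_t + \gamma r_{t+1} + \gamma^2 r_{t+2} + \dots - V(s_t)
\end{aligned}
$$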

https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/999f2036a5a7c54403352211b5d1cc0df42b83f6/labml_nn/rl/ppo/gae.py#L45

This should be `w_k = (1 - lambda) * lambda^(k-1)`.
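
A quick numerical check of that weighting, as a minimal sketch rather than the repository's `gae.py`: the function names below are hypothetical, and the handling of a finite trajectory (putting the leftover weight `lambda^(n-1)` on the longest available k-step estimate) is an assumption made so the weights sum to 1. Under those assumptions, the weighted sum of k-step advantages with `w_k = (1 - lambda) * lambda^(k-1)` should match the usual backward recursion over TD residuals.

```python
import numpy as np


def gae_recursive(rewards, values, gamma, lam):
    """GAE via the backward recursion A_t = delta_t + gamma * lam * A_{t+1}.

    `values` has len(rewards) + 1 entries; the last one is the bootstrap value.
    """
    T = len(rewards)
    adv = np.zeros(T)
    last = 0.0
    for t in reversed(range(T)):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        last = delta + gamma * lam * last
        adv[t] = last
    return adv


def k_step_advantage(rewards, values, gamma, t, k):
    """r_t + gamma r_{t+1} + ... + gamma^(k-1) r_{t+k-1} + gamma^k V(s_{t+k}) - V(s_t)."""
    discounted = sum(gamma ** l * rewards[t + l] for l in range(k))
    return discounted + gamma ** k * values[t + k] - values[t]


def gae_weighted(rewards, values, gamma, lam):
    """GAE as a weighted sum of k-step advantages with w_k = (1 - lam) * lam^(k-1).

    Assumption: on a finite trajectory the remaining weight lam^(n-1) is
    assigned to the longest available estimate (k = n), so weights sum to 1.
    """
    T = len(rewards)
    adv = np.zeros(T)
    for t in range(T):
        n = T - t  # longest k-step estimate available from step t
        total = 0.0
        for k in range(1, n):
            w_k = (1 - lam) * lam ** (k - 1)
            total += w_k * k_step_advantage(rewards, values, gamma, t, k)
        total += lam ** (n - 1) * k_step_advantage(rewards, values, gamma, t, n)
        adv[t] = total
    return adv


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rewards = rng.normal(size=8)
    values = rng.normal(size=9)  # one extra entry for the bootstrap value
    a = gae_recursive(rewards, values, gamma=0.99, lam=0.95)
    b = gae_weighted(rewards, values, gamma=0.99, lam=0.95)
    print(np.allclose(a, b))
```

Using `lambda^k` instead of `lambda^(k-1)` would scale every weight by an extra factor of lambda, so the weights would no longer sum to 1 and the two functions would disagree.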

Thanks for everything! This tutorial is gold!

vpj commented 2 months ago

Thanks, fixed it here: https://github.com/labmlai/annotated_deep_learning_paper_implementations/commit/20494ae94cd3e8a73d0007e25343e43e0ee66a14