SforAiDl / genrl

A PyTorch reinforcement learning library for generalizable and reproducible algorithm implementations with an aim to improve accessibility in RL
https://genrl.readthedocs.io
MIT License

Separate out compute_returns_and_advantage from RolloutBuffer #320

Closed Sharad24 closed 3 years ago

Sharad24 commented 4 years ago

`compute_returns_and_advantage` should not be a method of `RolloutBuffer`. It should be implemented as a separate function or class so that it is easier to extend.

There's the usual computation of returns over a trajectory, but we also have, for example, Generalized Advantage Estimation (GAE). GAE could be a separate function, or a class inheriting from the first case.

This, in general, allows the classes to be more composable.
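
A rough sketch of what the split could look like, assuming the buffer only stores `rewards`, `values` and `dones`, and the estimators are free functions. The names `compute_discounted_returns` and `compute_gae` are hypothetical and not part of genrl's API, and plain Python lists stand in for tensors to keep the example short:

```python
from typing import List, Sequence, Tuple


def compute_discounted_returns(
    rewards: Sequence[float],
    dones: Sequence[bool],
    last_value: float = 0.0,
    gamma: float = 0.99,
) -> List[float]:
    """Plain discounted return at every step, computed backwards in time."""
    returns = [0.0] * len(rewards)
    running = last_value  # bootstrap value for the state after the last step
    for t in reversed(range(len(rewards))):
        if dones[t]:
            running = 0.0  # episode ended at step t, so nothing follows it
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def compute_gae(
    rewards: Sequence[float],
    values: Sequence[float],
    dones: Sequence[bool],
    last_value: float = 0.0,
    gamma: float = 0.99,
    gae_lambda: float = 0.95,
) -> Tuple[List[float], List[float]]:
    """Generalized Advantage Estimation; returns (returns, advantages)."""
    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD error, masked at episode boundaries.
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        running = delta + gamma * gae_lambda * not_done * running
        advantages[t] = running
        next_value = values[t]
    returns = [a + v for a, v in zip(advantages, values)]
    return returns, advantages
```

With something like this, an agent could pick the estimator it wants and pass it the stored rollout data, instead of `RolloutBuffer` hard-coding one of them.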

sampreet-arthi commented 3 years ago

Looks like this has been resolved. Reopen if there's something left.