SforAiDl / genrl

A PyTorch reinforcement learning library for generalizable and reproducible algorithm implementations with an aim to improve accessibility in RL
https://genrl.readthedocs.io
MIT License

Off Policy Trainer #367

Closed. Sharad24 closed this issue 3 years ago

Sharad24 commented 3 years ago

The Off Policy Trainer does not depend on epochs. It depends only on max_timesteps, which even in the worst case should not be used without epochs.
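
One possible shape for the fix, as a minimal sketch only: the loop is driven by epochs, and max_timesteps acts as an optional upper bound on the total number of environment steps. The names `train_off_policy`, `step_fn`, and `steps_per_epoch` are hypothetical and are not taken from the genrl codebase.

```python
from typing import Callable, Optional


def train_off_policy(
    step_fn: Callable[[int], None],
    epochs: int,
    steps_per_epoch: int,
    max_timesteps: Optional[int] = None,
) -> int:
    """Run an epoch-driven loop, optionally capped by max_timesteps.

    step_fn stands in for one environment step plus any agent update
    (a hypothetical callback, not a genrl API). Returns the number of
    timesteps actually executed.
    """
    timestep = 0
    for _ in range(epochs):
        for _ in range(steps_per_epoch):
            # max_timesteps is only a cap; epochs still define the schedule.
            if max_timesteps is not None and timestep >= max_timesteps:
                return timestep
            step_fn(timestep)
            timestep += 1
    return timestep
```

With this structure, a test can keep its runtime small by passing a small `epochs` value instead of relying on a hard-coded max_timesteps.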

When working on this, also remove the max_timesteps=100 argument that was added to every off-policy agent test in #368 as a temporary measure to reduce test times.