Sharad24 closed this issue 3 years ago
The off-policy trainer does not depend on `epochs`. It depends only on `max_timesteps`, which even in the worst case should not be used without `epochs`.
When working on this, also remove the `max_timesteps=100` that was added, for now, to every off-policy agent test to reduce test times in #368.
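A minimal sketch of the requested behaviour, not the project's actual trainer: the loop is driven by `epochs`, and `max_timesteps` acts only as an optional cap. The names `train`, `step_fn`, and `steps_per_epoch` are hypothetical and chosen for illustration only.

```python
from typing import Callable, Optional


def train(
    step_fn: Callable[[int], None],
    epochs: int,
    steps_per_epoch: int,
    max_timesteps: Optional[int] = None,
) -> int:
    """Hypothetical off-policy loop: epochs drive training, max_timesteps only caps it.

    Returns the total number of timesteps actually executed.
    """
    timesteps = 0
    for _ in range(epochs):
        for _ in range(steps_per_epoch):
            # Stop early only if a cap was given and has been reached.
            if max_timesteps is not None and timesteps >= max_timesteps:
                return timesteps
            step_fn(timesteps)  # one environment step plus update (assumed)
            timesteps += 1
    return timesteps
```

With a loop shaped like this, a test could keep run time low by passing a small `epochs` value rather than relying on `max_timesteps=100`.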