JuliaReinforcementLearning / ReinforcementLearningTrajectories.jl

A generalized experience replay buffer for reinforcement learning
MIT License

add episodessampler #51

Closed HenriDeh closed 1 year ago

HenriDeh commented 1 year ago

This PR adds a new sampler that samples every episode in the Trajectory, including a truncated one (in a limited-size buffer, the first episode is typically the only one that is truncated).

There are two reasons for sampling all episodes:

  1. Algorithms that sample whole episodes are typically on-policy algorithms (e.g. PPO, TRPO), which means that they use all of the buffered transitions and then discard them. As such, there is no reason to sample only a subset of the buffer.
  2. There is no simple way to sample episodes without replacement from the buffer. As long as no algorithm explicitly requires sampling a subset of whole episodes, I don't see a compelling reason to add this functionality.
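
To illustrate the idea of returning every episode, including a truncated first one, here is a minimal sketch. It is written in Python as a language-agnostic illustration and is not the package's Julia implementation or API: the function name `split_episodes` and the `terminal` flag list are hypothetical, and the sketch assumes the buffer contents are already in chronological order. It also treats trailing steps of an unfinished episode as one more episode, which is an assumption of this sketch.

```python
from typing import List, Sequence


def split_episodes(terminal: Sequence[bool]) -> List[range]:
    """Return one index range per episode, oldest first.

    `terminal[i]` is True when step i ends an episode (hypothetical layout).
    The first range may be a truncated episode whose early steps were
    overwritten in a limited-size buffer; it is returned like any other.
    """
    episodes = []
    start = 0
    for i, done in enumerate(terminal):
        if done:
            # Close the current episode at step i (inclusive).
            episodes.append(range(start, i + 1))
            start = i + 1
    if start < len(terminal):
        # Assumption: trailing steps of an unfinished episode are kept too.
        episodes.append(range(start, len(terminal)))
    return episodes


# The first episode here has only two steps left, as if its beginning
# had been overwritten by newer transitions.
terminal = [False, True, False, False, True, False, True]
for episode in split_episodes(terminal):
    print(list(episode))
```

Because every range is returned exactly once, an on-policy learner can consume the whole buffer in a single pass and then discard it, which matches the first reason above.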