Refine the `iterate` interface with `Trajectory`

JuliaReinforcementLearning / ReinforcementLearningTrajectories.jl

A generalized experience replay buffer for reinforcement learning



Refine the `iterate` interface with `Trajectory` #29

Closed findmyway closed 2 years ago

findmyway commented 2 years ago

For sync controllers, the `iterate` implementation drops the iteration state. In some cases, however, we need to track that state. A typical case is VPG or PPO, where we want to iterate through mini-batches without replacement.
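To illustrate what tracking the state could look like, here is a minimal sketch using Julia's standard `Base.iterate` protocol. The type `MiniBatchIndices` and its fields are hypothetical and are not part of ReinforcementLearningTrajectories.jl. The sketch only assumes that a batch can be identified by a vector of sample indices. The state is the position of the next unused index in a fixed permutation, so each index is visited exactly once per pass.

```julia
using Random

# Hypothetical helper for illustration; not the package's API.
struct MiniBatchIndices
    perm::Vector{Int}   # shuffled sample indices, fixed for one pass
    batch_size::Int
end

# Shuffle 1:n once, so that sampling is without replacement.
MiniBatchIndices(rng::AbstractRNG, n::Integer, batch_size::Integer) =
    MiniBatchIndices(randperm(rng, n), batch_size)

# The iteration state is the position of the next unused index in `perm`.
# Returning it (instead of dropping it) is what lets the next call resume.
function Base.iterate(it::MiniBatchIndices, start::Int=1)
    start > length(it.perm) && return nothing
    stop = min(start + it.batch_size - 1, length(it.perm))
    return it.perm[start:stop], stop + 1
end

Base.length(it::MiniBatchIndices) = cld(length(it.perm), it.batch_size)
Base.eltype(::Type{MiniBatchIndices}) = Vector{Int}

# One pass over 10 samples in mini-batches of at most 4 indices.
batches = collect(MiniBatchIndices(MersenneTwister(0), 10, 4))
```

In this sketch the last mini-batch may be smaller than `batch_size`. Whether a real implementation should keep or discard such a partial batch is a design choice the issue does not settle.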