JuliaReinforcementLearning / ReinforcementLearningTrajectories.jl

A generalized experience replay buffer for reinforcement learning
MIT License

Fix NStepBatchSampler sampling out of bounds indices #74

Open ludvigk opened 1 month ago

ludvigk commented 1 month ago

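When stacksize > 1, NStepBatchSampler can sample indices smaller than stacksize. Stacking the previous frames for such an index reaches back before the first stored element, which causes out-of-bounds errors.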
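The failure mode described above can be illustrated with a minimal sketch. It assumes that a frame stack ending at index `i` reads the `stacksize` consecutive elements `i - stacksize + 1` through `i`. The helpers `stack_window` and `valid_stack_inds` are hypothetical names introduced for illustration only; they are not part of ReinforcementLearningTrajectories.jl and do not reproduce the actual sampler code.

```julia
# Hypothetical helper (not library API): the index window that a frame
# stack ending at `i` would read, assuming `stacksize` consecutive frames.
stack_window(i::Int, stacksize::Int) = (i - stacksize + 1):i

# Hypothetical helper (not library API): keep only the indices whose
# window starts at 1 or later, so the stack never reads before the buffer.
valid_stack_inds(len::Int, stacksize::Int) =
    [i for i in 1:len if first(stack_window(i, stacksize)) >= 1]

buffer = collect(1:10)   # stand-in for ten stored states
stacksize = 4

# Sampling index 2 with stacksize 4 asks for elements -1:2.
window = stack_window(2, stacksize)
in_bounds = checkbounds(Bool, buffer, window)   # false: the window starts before index 1

# Restricting sampling to indices >= stacksize avoids the problem.
valid = valid_stack_inds(length(buffer), stacksize)
```

Under this assumption, the smallest index that is safe to sample equals `stacksize`, which matches the condition stated in the description.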

codecov[bot] commented 1 month ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 74.23%. Comparing base (29a6a3e) to head (4faf656). Report is 1 commit behind head on main.

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##             main      #74   +/-   ##
=======================================
  Coverage   74.23%   74.23%
=======================================
  Files          18       18
  Lines         850      850
=======================================
  Hits          631      631
  Misses        219      219
```


jeremiahpslewis commented 1 month ago

@ludvigk Thanks! Can you please add a test that confirms the fix works?

ludvigk commented 3 weeks ago

When updating the test, it seemed to me that the episode buffer has a size of capacity + 2. After pushing 12 or more states to a buffer with capacity 10, valid_range returns an array with 12 elements instead of the 11 I expected. I'm not 100% sure where the extra +1 comes from, though.

Clearly I broke something else, so my "fix" was not correct.