Open CornfileChase opened 1 month ago
> I suppose that some state-action pairs from very short episodes are not helpful to the training process

Why not? Failures are important for learning what not to do.

> I want to filter the experience from these episodes and not push it into the replay_buffer

If you want to do that, you would need to fork SB3 or subclass the SB3 algorithms.
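
For illustration, here is a minimal sketch of the filtering idea on its own, independent of SB3. It holds the transitions of the current episode in a pending list and forwards them to the replay buffer only once the episode has ended and turned out to be long enough. The class name `ShortEpisodeFilter`, the `min_episode_length` parameter and the `sink` callable are hypothetical names made up for this example, not part of SB3, and the sketch assumes a single (non-vectorized) environment. In a fork or subclass, the same logic would sit at the point where the algorithm stores transitions.

```python
from typing import Any, Callable, List, Tuple

# (obs, next_obs, action, reward, done); this layout is an assumption for the sketch.
Transition = Tuple[Any, Any, Any, float, bool]


class ShortEpisodeFilter:
    """Hypothetical helper: buffer one episode, forward it only if long enough."""

    def __init__(self, sink: Callable[[Transition], None], min_episode_length: int) -> None:
        self.sink = sink  # called once per kept transition, e.g. a buffer's add method
        self.min_episode_length = min_episode_length
        self._pending: List[Transition] = []
        self.dropped_episodes = 0

    def add(self, obs: Any, next_obs: Any, action: Any, reward: float, done: bool) -> None:
        # Always hold the transition until we know how long the episode was.
        self._pending.append((obs, next_obs, action, reward, done))
        if not done:
            return
        if len(self._pending) >= self.min_episode_length:
            for transition in self._pending:
                self.sink(transition)
        else:
            self.dropped_episodes += 1
        self._pending = []  # start collecting the next episode


# Usage: a plain list stands in for the replay buffer.
replay_buffer: List[Transition] = []
episode_filter = ShortEpisodeFilter(replay_buffer.append, min_episode_length=3)

# A 2-step episode (shorter than the threshold) followed by a 4-step episode.
for episode_length in (2, 4):
    for step in range(episode_length):
        is_last = step == episode_length - 1
        episode_filter.add(step, step + 1, 0, 0.0, is_last)
```

Note that this delays insertion until the end of each episode, so the agent cannot train on a transition before its episode finishes. With several parallel environments, one pending list per environment would be needed.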
❓ Question
In my self-built custom environment, I suspect that state-action pairs from very short episodes are not helpful to the training process, so I want to filter out the experience from these episodes and not push it into the replay_buffer. Can anyone tell me how to implement this? Thanks!
Checklist