🚀 Feature
It would be interesting to integrate Random Network Distillation (RND) policies (https://arxiv.org/abs/1810.12894) for use with PPO.
Motivation
RND implementations on GitHub are scarce, and many of them are very old and impractical to use. Having this feature in Stable-Baselines3 could benefit research in many fields.
Pitch
No response
Alternatives
No response
Additional context
No response
Checklist
[X] I have checked that there is no similar issue in the repo
[ ] If I'm requesting a new feature, I have proposed alternatives