Open AkiraTOSEI opened 3 years ago
TL;DR
Proposes NGU (Never Give Up), which combines two exploration rewards: one that spans multiple episodes and one that applies within a single episode. The former uses RND, while the latter uses embedding vectors and kNN to find novel states. It scored high on Pitfall! and Montezuma's Revenge.
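To make the two-reward idea concrete, here is a minimal sketch, not the authors' implementation. The function names, the constants (`k`, `eps`, `c`, `max_scale`), and the choice to normalize by the mean of the current neighbours (rather than a running average) are all assumptions for illustration. The RND prediction error is passed in as a plain number instead of being computed from networks.

```python
import math


def episodic_reward(embedding, memory, k=10, eps=1e-3, c=1e-3):
    """Within-episode novelty (hypothetical helper).

    Large when `embedding` is far from its k nearest neighbours among the
    embeddings already stored in this episode's `memory`.
    """
    # Squared Euclidean distances to every stored embedding, k smallest kept.
    sq_dists = sorted(
        sum((a - b) ** 2 for a, b in zip(embedding, m)) for m in memory
    )[:k]
    if not sq_dists:
        return 1.0 / c  # empty memory: the kernel sum is zero
    # Assumption: normalize by the mean of the current neighbour distances.
    mean_sq = max(sum(sq_dists) / len(sq_dists), 1e-8)
    # Inverse kernel: close to 1 for near-duplicates, close to 0 for far states.
    kernel_sum = sum(eps / (d / mean_sq + eps) for d in sq_dists)
    return 1.0 / (math.sqrt(kernel_sum) + c)


def combined_reward(r_episodic, rnd_error, rnd_mean, rnd_std, max_scale=5.0):
    """Scale the episodic reward by a clipped, RND-based multi-episode factor."""
    # Standardized RND error; above-average error means a rarely seen state.
    alpha = 1.0 + (rnd_error - rnd_mean) / max(rnd_std, 1e-8)
    # Clip to [1, max_scale] so the modulator never shrinks the episodic reward.
    return r_episodic * min(max(alpha, 1.0), max_scale)


memory = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
familiar = episodic_reward([0.0, 0.0], memory)  # state already in memory
novel = episodic_reward([5.0, 5.0], memory)     # state far from all of them
```

In this sketch a state already visited in the episode gets a small episodic reward, and the multi-episode factor only amplifies it (up to `max_scale`) when the RND error is above average.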
Why it matters:
Paper URL
https://arxiv.org/abs/2002.06038
Submission Dates(yyyy/mm/dd)
Authors and institutions
Methods
Results
Comments