A number of agents (PPO, MuZero) with a Perceiver-based NN architecture that can be trained to achieve goals in NetHack/MiniHack environments.
GNU General Public License v3.0
38 stars, 4 forks
A bunch of changes to make MuZero really work #2
Closed
hr0nix closed 2 years ago