hr0nix / omega

A number of agents (PPO, MuZero) with a Perceiver-based NN architecture that can be trained to achieve goals in NetHack/MiniHack environments.
GNU General Public License v3.0
38 stars, 4 forks

A bunch of changes to make MuZero really work #2

Closed by hr0nix 2 years ago