kaesve/muzero
A clean implementation of MuZero and AlphaZero following the AlphaZero General framework. Train and pit both algorithms against each other, and investigate the reliability of learned MuZero MDP models.
MIT License
Choose environments #1
Closed. kaesve closed this issue 3 years ago.
kaesve commented 3 years ago:
Options:
- board games
  - tic-tac-toe
  - Hex
- continuous control
  - mountain car
  - pendulum swing-up
  - bipedal walker (also used in WANN)