thu-ml / tianshou

An elegant PyTorch deep reinforcement learning library.
https://tianshou.org
MIT License

Action mask in Tictactoe #984

Open PingH129 opened 10 months ago

PingH129 commented 10 months ago
Trinkle23897 commented 10 months ago

You need to provide a valid action mask as part of the observation. Please take a look at the implementation details (especially the signature of env.step(act)) in the TicTacToe env.
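
To make the suggestion concrete, here is a minimal sketch of what "mask as part of the observation" can look like. It is not the actual TicTacToe env from the repository: the class `TinyTicTacToe`, the helper `masked_argmax`, and the dict keys `"obs"` and `"mask"` are assumptions chosen for illustration, and win detection is left out. The only point is that both `reset()` and `step(act)` return a dict carrying the legal-move mask next to the board, so the policy can exclude illegal actions.

```python
from typing import Any, Dict, List, Tuple


class TinyTicTacToe:
    """Hypothetical 3x3 env for illustration; NOT Tianshou's TicTacToe env."""

    def __init__(self) -> None:
        self.board: List[int] = [0] * 9  # 0 = empty, 1 / -1 = the two players
        self.player = 1

    def _observe(self) -> Dict[str, List[Any]]:
        # The mask travels inside the observation: True marks a legal move.
        return {
            "obs": list(self.board),
            "mask": [cell == 0 for cell in self.board],
        }

    def reset(self) -> Dict[str, List[Any]]:
        self.board = [0] * 9
        self.player = 1
        return self._observe()

    def step(self, act: int) -> Tuple[Dict[str, List[Any]], float, bool, dict]:
        if self.board[act] != 0:
            raise ValueError(f"illegal action {act}: cell is already occupied")
        self.board[act] = self.player
        self.player = -self.player
        # Win detection is omitted; the episode ends only when the board is full.
        done = all(cell != 0 for cell in self.board)
        return self._observe(), 0.0, done, {}


def masked_argmax(q_values: List[float], mask: List[bool]) -> int:
    """Pick the best-scoring action among the legal ones only."""
    legal = [i for i, ok in enumerate(mask) if ok]
    return max(legal, key=lambda i: q_values[i])


env = TinyTicTacToe()
obs = env.reset()
obs, rew, done, info = env.step(4)  # first player takes the centre
q = [0.1] * 9
q[4] = 9.9  # the highest score sits on the now-illegal centre cell
act = masked_argmax(q, obs["mask"])  # the centre is skipped because of the mask
```

Without the mask in the observation, the policy has no way to know that cell 4 is taken and would keep selecting it.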