Simplify exploitability computation when players are symmetric

TheoCabannes / open_spiel

OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games.

Apache License 2.0


Open TheoCabannes opened 3 years ago

TheoCabannes commented 3 years ago

This can be done by computing a single best response (BR) when the players are symmetric.
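A minimal sketch of why one BR is enough, assuming a symmetric two-player matrix game and a policy shared by both players. The payoff matrices and the function names (`nash_conv_two_brs`, `nash_conv_one_br`) are hypothetical illustrations, not OpenSpiel APIs. The general version computes one best response per player; the symmetric version computes one and doubles the improvement.

```python
# Sketch only: a two-player matrix game stands in for a general game.
# A[i][j] is player 0's payoff and B[i][j] is player 1's payoff when
# player 0 plays action i and player 1 plays action j.
# The game is symmetric when B is the transpose of A.


def expected_payoff(M, x, y):
    """Expected value of payoff matrix M when player 0 plays x and player 1 plays y."""
    return sum(x[i] * M[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))


def nash_conv_two_brs(A, B, x, y):
    """General recipe: one best response per player, improvements summed."""
    br0 = max(sum(A[i][j] * y[j] for j in range(len(y))) for i in range(len(x)))
    br1 = max(sum(x[i] * B[i][j] for i in range(len(x))) for j in range(len(y)))
    return (br0 - expected_payoff(A, x, y)) + (br1 - expected_payoff(B, x, y))


def nash_conv_one_br(A, x):
    """Symmetric shortcut: both players share policy x, so one BR suffices."""
    br = max(sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(x)))
    return 2 * (br - expected_payoff(A, x, x))


def transpose(M):
    return [list(col) for col in zip(*M)]


if __name__ == "__main__":
    # A symmetric, non-zero-sum example (prisoner's dilemma payoffs, assumed).
    A = [[3, 0], [5, 1]]
    x = [0.5, 0.5]
    print(nash_conv_two_brs(A, transpose(A), x, x))
    print(nash_conv_one_br(A, x))
```

Both functions return the same value whenever `B` is the transpose of `A` and both players use the same policy, so the second best-response computation can be skipped.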

TheoCabannes commented 2 years ago

We run Monte Carlo Tree Search to estimate the expected return. Then we add random noise to the policy of one player to estimate the local gradient.
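A rough sketch of the idea, under loud assumptions: the game is a small matrix game, and plain Monte Carlo sampling of actions replaces tree search, so this is not the notebook's actual method. All names (`sampled_return`, `perturb_policy`, `local_slope`) are hypothetical. It estimates the return by sampling, perturbs one player's policy with Gaussian noise, and reports the change in return per unit of policy change.

```python
import math
import random


def sampled_return(A, x, y, num_samples, rng):
    """Monte Carlo estimate of player 0's expected payoff (stand-in for tree search)."""
    rows = list(range(len(A)))
    cols = list(range(len(A[0])))
    total = 0.0
    for _ in range(num_samples):
        i = rng.choices(rows, weights=x, k=1)[0]
        j = rng.choices(cols, weights=y, k=1)[0]
        total += A[i][j]
    return total / num_samples


def perturb_policy(x, scale, rng):
    """Add Gaussian noise to a policy, clip at zero, and renormalize to sum to 1."""
    noisy = [max(0.0, p + rng.gauss(0.0, scale)) for p in x]
    total = sum(noisy)
    if total == 0.0:
        return list(x)  # degenerate draw: fall back to the original policy
    return [p / total for p in noisy]


def local_slope(A, x, y, scale, num_samples, rng):
    """Finite-difference slope of the sampled return along one random perturbation of x."""
    x_noisy = perturb_policy(x, scale, rng)
    step = math.sqrt(sum((a - b) ** 2 for a, b in zip(x_noisy, x)))
    if step == 0.0:
        return 0.0
    base = sampled_return(A, x, y, num_samples, rng)
    moved = sampled_return(A, x_noisy, y, num_samples, rng)
    return (moved - base) / step


if __name__ == "__main__":
    rng = random.Random(0)
    rps = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]  # rock-paper-scissors, assumed example
    x = [0.5, 0.25, 0.25]
    y = [0.25, 0.5, 0.25]
    print(local_slope(rps, x, y, scale=0.05, num_samples=2000, rng=rng))
```

Because both return estimates are noisy, the slope is only meaningful when the number of samples is large relative to the perturbation size; averaging over several perturbations would reduce the variance further.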

TheoCabannes commented 2 years ago

Done in a Jupyter notebook; not worth pushing upstream.