mpnunez / Connect4-AI

Training an AI Player to play Connect4

Fix illegal moves for Policy Gradient #15

Open mpnunez opened 2 months ago

mpnunez commented 2 months ago

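The probability of the action actually taken is underestimated. The same action can also occur when the policy samples an illegal move and the move is then reassigned, so the true probability of playing it is higher than the policy's output for that action alone. Two possible fixes: either punish illegally chosen moves, or reweight the probabilities so that only legal moves can be sampled.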
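
A minimal sketch of the second option (reweighting over legal moves), in case it helps the discussion. It is not taken from this repo: `masked_policy`, `sample_action`, `logits`, and `legal_mask` are hypothetical names, and it assumes the policy produces one raw score per column and that a boolean mask of non-full columns is available. Illegal columns get probability zero, the rest are renormalized with a softmax, and the log-probability returned is the one under the distribution that was actually sampled from, which is what the policy gradient update needs.

```python
import math
import random


def masked_policy(logits, legal_mask):
    """Softmax restricted to legal moves; illegal moves get probability 0."""
    legal_logits = [l for l, ok in zip(logits, legal_mask) if ok]
    if not legal_logits:
        raise ValueError("no legal moves available")
    # Subtract the largest legal logit for numerical stability.
    m = max(legal_logits)
    exps = [math.exp(l - m) if ok else 0.0 for l, ok in zip(logits, legal_mask)]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(logits, legal_mask, rng=random):
    """Sample a legal action and return it with its log-probability."""
    probs = masked_policy(logits, legal_mask)
    action = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    # Log-prob under the masked distribution, i.e. the one actually sampled.
    return action, math.log(probs[action])


# Example: 7 columns, columns 0 and 3 assumed full (illegal).
example_logits = [0.2, 1.0, -0.5, 2.0, 0.0, 0.3, -1.0]
example_mask = [False, True, True, False, True, True, True]
action, log_prob = sample_action(example_logits, example_mask)
```

With this approach no reassignment step is needed, since an illegal move can never be sampled. The alternative of punishing illegal choices would instead keep the unmasked distribution and add a negative reward when an illegal column is picked.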