Open mpnunez opened 2 months ago
The probability of the action taken is underestimated: the same action can also result when an illegal move is sampled and then reassigned to it, and that extra probability mass is not counted. Either penalize illegally chosen moves, or renormalize the probabilities over the legal moves.
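A minimal sketch of the second option (reweighting), assuming the policy outputs a probability vector over all moves and that a boolean mask of legal moves is available. The names `masked_policy`, `sample_action`, `probs`, and `legal_mask` are hypothetical and not taken from this repository. Illegal moves get zero probability and the rest is renormalized, so the sampled action is always legal and the recorded log-probability matches the distribution it was actually drawn from.

```python
import numpy as np

def masked_policy(probs, legal_mask):
    """Zero out illegal moves and renormalize over the legal ones.

    probs and legal_mask are hypothetical inputs: a probability vector
    over all moves and a boolean mask marking which moves are legal.
    """
    probs = np.asarray(probs, dtype=float)
    legal_mask = np.asarray(legal_mask, dtype=bool)
    n_legal = legal_mask.sum()
    if n_legal == 0:
        raise ValueError("no legal moves available")
    masked = np.where(legal_mask, probs, 0.0)
    total = masked.sum()
    if total <= 0.0:
        # All mass was on illegal moves: fall back to uniform over legal moves.
        return legal_mask.astype(float) / n_legal
    return masked / total

def sample_action(probs, legal_mask, rng):
    """Sample a legal action and return it with its log-probability."""
    p = masked_policy(probs, legal_mask)
    action = int(rng.choice(len(p), p=p))
    return action, float(np.log(p[action]))
```

For example, `sample_action([0.5, 0.3, 0.2], [True, False, True], np.random.default_rng(0))` can only return action 0 or 2, and the log-probability it returns is taken from the renormalized vector, so no reassignment step is needed afterwards.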