HandyRL is a handy and simple framework based on Python and PyTorch for distributed reinforcement learning that is applicable to your own environments.
However, in a game like rock-paper-scissors where the best move depends on the opponent's move, it makes no sense to compare only the probability of one's own move.
There is no clear answer on how to define ρ and c.
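As a point of reference, here is a minimal sketch of one common way to define the two coefficients. It assumes that rho and c are the truncated importance weights used in V-trace-style off-policy correction, which the text above does not state explicitly. The function name `clipped_importance_weights` and the thresholds `rho_bar` and `c_bar` are hypothetical and are not part of HandyRL's API. The sketch uses plain Python rather than PyTorch to stay self-contained.

```python
import math

def clipped_importance_weights(target_logp, behavior_logp, rho_bar=1.0, c_bar=1.0):
    """Return per-step (rho, c) as clipped ratios pi(a|s) / mu(a|s).

    target_logp:   log-probabilities of the taken actions under the learner policy pi
    behavior_logp: log-probabilities of the same actions under the acting policy mu
    rho_bar, c_bar: clipping thresholds (assumed defaults, not HandyRL settings)
    """
    rhos, cs = [], []
    for lp_target, lp_behavior in zip(target_logp, behavior_logp):
        ratio = math.exp(lp_target - lp_behavior)  # importance sampling ratio
        rhos.append(min(rho_bar, ratio))           # clip from above at rho_bar
        cs.append(min(c_bar, ratio))               # clip from above at c_bar
    return rhos, cs

# Two illustrative steps: the learner is more likely than the actor to pick
# the first action (ratio 2.0) and less likely to pick the second (ratio 0.5).
target_logp = [math.log(0.5), math.log(0.2)]
behavior_logp = [math.log(0.25), math.log(0.4)]
rhos, cs = clipped_importance_weights(target_logp, behavior_logp)
```

Note that the ratio here depends only on the acting player's own action probabilities. That is exactly the limitation raised for games like rock-paper-scissors, where the value of a move also depends on what the opponent plays.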