How should I design my reward function? Having read an OpenAI post, I'm thinking of...
What should the value of my epsilon be?
e_k = 0.1 (0.05 afterwards) (source)
e_k = 1/k (source)
e_k = 0.9 * a^k for 0 < a < 1, an exponentially decaying function (source)
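For comparison, the three candidate schedules above can be written as plain functions of the episode index k. This is only a sketch: the function names, the `switch_at` cutoff for the constant schedule (the post only says "afterwards"), and the default a = 0.99 are assumptions for illustration, not values taken from the sources.

```python
import random


def epsilon_constant(k, initial=0.1, later=0.05, switch_at=1000):
    """Constant schedule: e_k = 0.1, then 0.05 from episode `switch_at` on.

    `switch_at` is an assumed cutoff; the post does not say when the drop happens.
    """
    return initial if k < switch_at else later


def epsilon_harmonic(k):
    """e_k = 1/k, with episodes counted from k = 1."""
    if k < 1:
        raise ValueError("k must be >= 1")
    return 1.0 / k


def epsilon_exponential(k, a=0.99):
    """e_k = 0.9 * a^k for 0 < a < 1 (a = 0.99 is an assumed default)."""
    if not 0 < a < 1:
        raise ValueError("a must satisfy 0 < a < 1")
    return 0.9 * a ** k


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])
```

Evaluating each schedule at a few values of k gives a feel for how fast exploration dies out: 1/k reaches 0.01 at k = 100, whereas the speed of 0.9 * a^k depends entirely on the choice of a.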
How should I implement multiple rules? Rules should change depending on whether input is timed and on what happens when the sums of both players are equal. I'm thinking of subclassing Game and overriding the necessary classes/methods, but I'm not sure yet.
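One way the subclassing idea could look is sketched below. The real Game class is not shown in the post, so everything here is hypothetical: the method names `resolve_round`, `resolve_tie`, and `is_input_late`, the `time_limit_s` parameter, and the assumption that the higher sum wins a round.

```python
class Game:
    """Hypothetical base class with the default rules."""

    def resolve_round(self, sum_a, sum_b):
        """Return the round winner, assuming the higher sum wins."""
        if sum_a > sum_b:
            return "a"
        if sum_b > sum_a:
            return "b"
        # Equal sums: defer to a hook that subclasses can override.
        return self.resolve_tie()

    def resolve_tie(self):
        """Default tie rule: nobody wins the round."""
        return "draw"


class TieGoesToFirstPlayer(Game):
    """Variant that only changes what happens on equal sums."""

    def resolve_tie(self):
        return "a"


class TimedGame(Game):
    """Variant that adds a per-move time limit (in seconds)."""

    def __init__(self, time_limit_s=10.0):
        self.time_limit_s = time_limit_s

    def is_input_late(self, elapsed_s):
        """True if the player took longer than the allowed time."""
        return elapsed_s > self.time_limit_s
```

Keeping each rule behind its own small method means a variant overrides only the piece that differs. If variants need to be combined (timed play plus a different tie rule), passing small rule objects into Game might scale better than one subclass per combination.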
Both players have the same knowledge of their own and the opponent's deck, as in Go, chess, and Gomoku. That means the same algorithm can be applied to both sides of a game, doubling the data collected per game. However, I also read that "[Q learning] isn't likely to lead to very good results if you assume that the opponent can also learn."
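The "double the data" idea can be sketched as a single shared Q-table that is updated with the transitions of both players from the same game. This is a minimal illustration using the standard tabular Q-learning update; the state and action labels, the +1/-1 rewards, and the helper names `q_update` and `learn_from_game` are assumptions, and states are assumed to be encoded from the point of view of the player who is moving.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """One tabular Q-learning update on the shared table `q`."""
    # A terminal transition has no next actions, so its future value is 0.
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])


def learn_from_game(q, transitions_a, transitions_b):
    """Feed both players' transitions from one game into the same table."""
    for transition in list(transitions_a) + list(transitions_b):
        q_update(q, *transition)


q = defaultdict(float)
# Hypothetical final round: player A wins (+1), player B loses (-1).
# Each transition is (state, action, reward, next_state, next_actions).
game_a = [("A_has_9", "play_9", 1.0, None, [])]
game_b = [("B_has_3", "play_3", -1.0, None, [])]
learn_from_game(q, game_a, game_b)
```

Because both sides read from and write to the same table, the opponent's behaviour shifts as training goes on, which is exactly the situation the quoted warning is about.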