This little project uses a TicTacToe class to manage the game state, the players, and the moves. The game is initialized with a hidden board of values associated with each cell and an empty playable board; players take turns making moves, and after each move the game checks for a winner or a draw.
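To make this concrete, here is a minimal sketch of what such a class might look like in Python. All names (TicTacToe, make_move, check_winner) are illustrative assumptions rather than the project's actual API, and the hidden board is rendered as a 3x3 magic square, a common trick in which every winning line sums to 15; this is a guess at what the original's hidden values are for.

    from itertools import combinations

    class TicTacToe:
        # Hidden values associated with each cell: a 3x3 magic square, so
        # any three cells held by one player that sum to 15 form a line.
        MAGIC = [2, 7, 6,
                 9, 5, 1,
                 4, 3, 8]

        def __init__(self):
            self.board = [" "] * 9       # empty playable board
            self.current_player = "X"

        def available_moves(self):
            return [i for i, cell in enumerate(self.board) if cell == " "]

        def make_move(self, position):
            """Place the current player's mark and switch turns."""
            if self.board[position] != " ":
                raise ValueError("cell already taken")
            self.board[position] = self.current_player
            self.current_player = "O" if self.current_player == "X" else "X"

        def check_winner(self):
            """Return 'X'/'O' for a win, 'draw' for a full board, else None."""
            for mark in ("X", "O"):
                values = [self.MAGIC[i] for i, m in enumerate(self.board) if m == mark]
                if any(sum(trio) == 15 for trio in combinations(values, 3)):
                    return mark
            return "draw" if " " not in self.board else None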
The code implements a machine learning agent for tic-tac-toe based on model-free Q-learning, combined with some minimax-style heuristics. The agent trains by playing against itself and can then compete against a human player or be evaluated against an opponent that chooses moves at random.

Among its strengths, the agent uses Q-learning, a model-free reinforcement learning technique, to improve its moves over time. The epsilon parameter implements the exploration/exploitation trade-off, letting the agent balance random exploratory moves against the use of acquired knowledge (sketched below). The code manages the flow of the game, keeping track of the agent's states, moves, and training rewards, and it includes a performance-testing loop in which the agent competes against random players both as the first player and as the second.

To strengthen the agent's evaluation, it would be useful to record specific metrics during training. For instance, one could monitor learning over time by watching how scores or the win percentage against random opponents improve. A more detailed analysis could track the variation in Q-values, the frequency of random versus Q-value-based choices, and the specific strategies the agent learns during training.
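The epsilon-greedy selection and the tabular Q-update described above might look roughly like this. The hyperparameter names (alpha, gamma, epsilon) are standard Q-learning vocabulary; the state encoding and method names are assumptions, not the project's exact interface.

    import random
    from collections import defaultdict

    class QAgent:
        """Tabular, model-free Q-learning agent with epsilon-greedy selection."""

        def __init__(self, alpha=0.5, gamma=0.9, epsilon=0.1):
            self.q = defaultdict(float)   # (state, action) -> estimated value
            self.alpha = alpha            # learning rate
            self.gamma = gamma            # discount factor
            self.epsilon = epsilon        # exploration probability

        def choose(self, state, actions):
            # Explore: with probability epsilon, pick a random legal move.
            if random.random() < self.epsilon:
                return random.choice(actions)
            # Exploit: otherwise pick the move with the highest learned Q-value.
            return max(actions, key=lambda a: self.q[(state, a)])

        def update(self, state, action, reward, next_state, next_actions):
            # One-step Q-learning:
            # Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            best_next = max((self.q[(next_state, a)] for a in next_actions),
                            default=0.0)  # terminal states have no moves
            target = reward + self.gamma * best_next
            self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])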
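As a sketch of the evaluation metrics suggested above, the following hypothetical loop measures the agent's win rate against a random opponent; logging it at intervals during self-play training would show how the agent's play improves over time. It assumes the TicTacToe and QAgent sketches shown earlier.

    import random

    def evaluate_vs_random(agent, games=100, agent_first=True):
        """Play greedy games against a random mover; return the agent's win rate."""
        saved_eps, agent.epsilon = agent.epsilon, 0.0  # no exploration while testing
        wins = 0
        for _ in range(games):
            game = TicTacToe()
            agent_mark = "X" if agent_first else "O"
            while game.check_winner() is None:
                moves = game.available_moves()
                if game.current_player == agent_mark:
                    move = agent.choose(tuple(game.board), moves)
                else:
                    move = random.choice(moves)
                game.make_move(move)
            if game.check_winner() == agent_mark:
                wins += 1
        agent.epsilon = saved_eps
        return wins / games

    # e.g. log progress during training:
    #   print(f"after {episode} episodes: {evaluate_vs_random(agent):.0%} wins as X")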