marcocolangelo / Computational-Intelligence

This repository contains material for the Computational Intelligence course at the Polytechnic University of Turin, academic year 2023/2024
MIT License

Lab 10 Review by Giuseppe Nicola Natalizio (s305912) #2

Open GNNatan opened 8 months ago

GNNatan commented 8 months ago

Hello there, Marco! I liked your idea of using Prioritized Experience Replay to improve the learning process. The concept is quite new to me, so I found your work very interesting. There are some things I'd like to point out:

I hope this review was useful to you and good luck on the final project. 😊

marcocolangelo commented 8 months ago

Hi, thank you for your valuable comment.

Your remark was about the number of possible states of the tic-tac-toe board, which is a square grid of 3x3 cells. Each cell can be empty, occupied by a cross or occupied by a circle, so there are 3 possibilities for each cell. The total number of possible states is therefore 3 raised to the power of 9, which is 19,683.
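The 3^9 count is the same as reading the board as a 9-digit number in base 3, which also gives a simple way to index a table with one row per state. A minimal sketch, assuming a hypothetical encoding of 0 for empty, 1 for cross and 2 for circle (the function names are illustrative and not taken from the repository):

```python
# Hypothetical encoding (not from the repository): 0 = empty, 1 = cross, 2 = circle.
N_STATES = 3 ** 9  # one entry per possible filling of the 9 cells


def board_to_index(board):
    """Read the 9 cells as the digits of a base-3 number, first cell most significant."""
    index = 0
    for cell in board:
        index = index * 3 + cell
    return index


def index_to_board(index):
    """Inverse of board_to_index: recover the 9 cells from the integer."""
    cells = []
    for _ in range(9):
        index, cell = divmod(index, 3)  # peel off the least significant digit
        cells.append(cell)
    return tuple(reversed(cells))


print(N_STATES)                      # 19683
print(board_to_index((2,) * 9))      # 19682, the largest index
```

Every board gets a unique index between 0 and 19,682, including boards that can never occur in a real game.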

However, many of these states are equivalent to each other, if you consider the symmetries of the board, such as rotations and reflections. In other words, two states are equivalent if you can obtain one from the other by rotating or flipping the board. 

If you want to count only the distinct states that can actually occur in a game, treating states that differ only by a rotation or reflection as one, the number is reduced to 765.

If instead you want to count all the legal states, that is, those that can be reached by following the rules of the game, without merging symmetric ones, the number is 5,478. This is because not all states are possible in a real game, for example those in which one player has two or more symbols more than the other, or those in which both players have a row of three equal symbols. The figure 255,168, which is often quoted in this context, is the number of possible games (sequences of moves), not the number of states.
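These counts can be checked by brute force. The sketch below is only an illustration, not the code used in the lab: it assumes crosses (encoded as 1) always move first, circles are 2, and play stops as soon as someone wins or the board is full. Under those assumptions it finds 5,478 reachable states, 765 once rotations and reflections are merged, and 255,168 complete games.

```python
# Illustrative sketch; assumes crosses (1) move first and play stops at a win or a full board.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

# Index permutations of the 3x3 grid: a 90-degree rotation and a left-right mirror.
ROTATE = (6, 3, 0, 7, 4, 1, 8, 5, 2)
MIRROR = (2, 1, 0, 5, 4, 3, 8, 7, 6)


def has_won(board, player):
    return any(all(board[i] == player for i in line) for line in LINES)


def is_terminal(board):
    return has_won(board, 1) or has_won(board, 2) or 0 not in board


def place(board, i, player):
    return board[:i] + (player,) + board[i + 1:]


def canonical(board):
    """Smallest of the 8 rotated/mirrored copies, used as the representative."""
    copies = []
    for _ in range(4):
        board = tuple(board[i] for i in ROTATE)
        copies.append(board)
        copies.append(tuple(board[i] for i in MIRROR))
    return min(copies)


def reachable_states():
    """All boards reachable from the empty board by legal play."""
    start = (0,) * 9
    seen, frontier = {start}, [start]
    while frontier:
        board = frontier.pop()
        if is_terminal(board):
            continue
        player = 1 if board.count(1) == board.count(2) else 2
        for i in range(9):
            if board[i] == 0:
                nxt = place(board, i, player)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
    return seen


def count_games(board=(0,) * 9, player=1):
    """Number of distinct move sequences that end in a win or a draw."""
    if is_terminal(board):
        return 1
    return sum(count_games(place(board, i, player), 3 - player)
               for i in range(9) if board[i] == 0)


states = reachable_states()
print(len(states))                          # 5478 legal states
print(len({canonical(s) for s in states}))  # 765 up to symmetry
print(count_games())                        # 255168 games
```

A table indexed by the canonical form would therefore need 765 rows instead of 19,683, at the cost of mapping each move back to the original orientation of the board.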

I didn't think of these forms of optimization at first, so I simply used a matrix covering all 19,683 possible states, but yes, you're basically right. Thank you for pointing it out.