Open ImBlurryF4c3 opened 9 months ago
Thank you for the feedback, Roberto! About testing my expert system against the optimal strategy proposed by the professor: I have just run some local tests and it seems to win 100% of the time. I appreciate all the advice on the evolution strategy too, have a good day!
Review of Lab2
made by Roberto Pulvirenti (s317704)
Hi Lorenzo,
I spent some time looking carefully at your code and this is my feedback!
First of all, the code is very clear and readable, also thanks to the very explanatory comments. This helps a lot for this kind of work, because others can easily follow the flow of your program.
Now I'm going to divide the review into two parts: one on the expert system and the other on the evolutionary strategy you implemented.
Expert system
There's not much to say here: the video explains everything and your code is well written. The only thing I would have recommended is to show how your system competes against the optimal one proposed by the professor, since it is the most difficult opponent to defeat. Moreover, I think it's more accurate to report a win-rate statistic for your agent over a certain number of games (like you did in the evolutionary strategy with the games() function), because a single win doesn't tell us much about the strength of your agent: even the _purerandom player is able to defeat the optimal opponent around 27% of the time.
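To make the suggestion concrete, here is a minimal sketch of such an evaluation harness. The function names (`optimal_move`, `random_move`, `win_rate`) and the starting position are my own assumptions, not the names from your code; the idea is just to play many games, alternating who moves first, and report the win fraction instead of a single result.

```python
import random

def nim_sum(rows):
    """XOR of all row sizes; zero means the position is losing for the mover."""
    x = 0
    for n in rows:
        x ^= n
    return x

def optimal_move(rows):
    """Standard nim-sum strategy: zero the nim-sum if possible, else take 1."""
    for r, n in enumerate(rows):
        target = nim_sum(rows) ^ n
        if target < n:
            return r, n - target
    r = next(i for i, n in enumerate(rows) if n > 0)
    return r, 1

def random_move(rows):
    """Pure random player: any legal (row, amount) pair."""
    r = random.choice([i for i, n in enumerate(rows) if n > 0])
    return r, random.randint(1, rows[r])

def play(first, second, rows=(1, 3, 5, 7)):
    """Play one game of normal-play Nim; return 0 if `first` wins, 1 otherwise."""
    rows = list(rows)
    players = (first, second)
    turn = 0
    while any(rows):
        r, k = players[turn](rows)
        rows[r] -= k
        if any(rows):
            turn = 1 - turn
    return turn  # the player who took the last object wins

def win_rate(agent, opponent, n_games=1000):
    """Fraction of games won by `agent`, alternating who moves first."""
    wins = 0
    for g in range(n_games):
        if g % 2 == 0:
            wins += play(agent, opponent) == 0
        else:
            wins += play(opponent, agent) == 1
    return wins / n_games
```

With a harness like this, `win_rate(my_expert_system, optimal_move)` gives a much more informative number than the outcome of one game.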
(1+Lambda) Evolution Strategy
There are 2 main points that I would like to underline: 1) The mutation of the weights in the mutate() function is drawn from a Gaussian distribution centered at 0 (loc = 0) and not at the current value of the weights (loc = weights). This means your weights will always stay within roughly -sigma and +sigma, while each generation they should instead be perturbed by that amount around the current weight values, so I don't think it's 100% correct. 2) You "train" your agent always playing first; I think you should let your agent play one game as the first player and the next as the second, for better generalization.
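A minimal sketch of the fix for point 1, assuming NumPy and a weight vector (the names `mutate` and `es_step` mirror your functions only loosely; `fitness`, `lam`, and `sigma` are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)

def mutate(weights, sigma=0.1):
    # Perturb AROUND the current weights: loc=weights, not loc=0.
    # Equivalent to: weights + rng.normal(0.0, sigma, size=weights.shape)
    return rng.normal(loc=weights, scale=sigma)

def es_step(parent, fitness, lam=10, sigma=0.1):
    # One (1+lambda) generation: keep the best of the parent and lam offspring.
    candidates = [parent] + [mutate(parent, sigma) for _ in range(lam)]
    return max(candidates, key=fitness)
```

With loc=0 the offspring are samples near the origin regardless of the parent, so the search can never drift away from the [-sigma, +sigma] box; centering the Gaussian on the parent lets the weights accumulate changes across generations.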
I hope you will find this feedback useful; overall, it seems like very good code to me!