Hello there, Lorenzo!
Your code was very well written and well documented in the report.
I liked how you used different agents trained against each other. I think the reason rl_base outperforms rl_RL_trained is that the former, being trained only against random agents, chooses more "reckless" moves that favor it against a random player; by contrast, I would expect rl_RL_trained to be better against a real player.
Counter-intuitively, the_ROCK outperforms even rl_base. I think this is because it retains some important knowledge obtained while playing against rl_RL_trained, which helps it win more consistently by applying rules of thumb learned during training (like prioritizing corners over sides).
I think it would be interesting to see how many moves it takes, on average, for each agent to win, to determine how aggressive or defensive each one is.
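Just to make the suggestion concrete, here is a minimal sketch of how that metric could be collected. Note that `play_game` here is a hypothetical stub (it returns random results) standing in for the project's actual match loop, and the agent names are just labels:

```python
import random
from collections import defaultdict

def play_game(agent_a, agent_b):
    # Hypothetical stub standing in for the project's real game loop:
    # it should return (winner, number_of_moves) for one match.
    n_moves = random.randint(5, 16)
    winner = random.choice([agent_a, agent_b])
    return winner, n_moves

def average_moves_to_win(agents, n_games=200):
    """For each agent, collect the number of moves it needed in the
    games it won, across all ordered pairings, and average them."""
    moves_in_wins = defaultdict(list)
    for a in agents:
        for b in agents:
            if a == b:
                continue
            for _ in range(n_games):
                winner, moves = play_game(a, b)
                moves_in_wins[winner].append(moves)
    return {agent: sum(m) / len(m) for agent, m in moves_in_wins.items() if m}

random.seed(0)
stats = average_moves_to_win(["rl_base", "rl_RL_trained", "the_ROCK"])
for agent, avg in sorted(stats.items(), key=lambda kv: kv[1]):
    print(f"{agent}: {avg:.1f} moves per win on average")
```

A lower average would suggest a more aggressive agent; a higher one, a more defensive or opportunistic style.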
I hope this review was useful to you and good luck on the final project. 😊