beatrice-occhiena closed this 1 month ago
Hello Beatrice 👋🏻. Thank you for your detailed review of my work.
I appreciate your suggestion about adding visual representations of the training process; I'll try to include them in future work or in the final project.
The reason I decided to include a negative reward for invalid moves was to see how well the agent could learn not to play them, and how long it took to do so. Implementing your solution would certainly make the final agent better, but I wanted to do it this way for experimental reasons.
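For concreteness, here is a minimal sketch of what this penalty-based scheme can look like in tabular Q-learning (all names, constants, and the board representation below are hypothetical, not taken from my actual notebook):

```python
from collections import defaultdict

INVALID_MOVE_REWARD = -1.0   # hypothetical penalty for an illegal action
ALPHA, GAMMA = 0.1, 0.9      # learning rate and discount factor

q_table = defaultdict(float)  # maps (state, action) -> Q-value

def step(board, move):
    """Apply `move` to `board` (a tuple with None for empty squares).
    Occupied squares are NOT filtered out up front: the agent gets a
    penalty and the state does not advance, so it has to learn legality
    from the reward signal alone."""
    if board[move] is not None:                   # invalid move
        return board, INVALID_MOVE_REWARD, False  # penalize, same state
    new_board = list(board)
    new_board[move] = "X"                         # agent's mark
    # Done when the board is full (win detection omitted for brevity).
    return tuple(new_board), 0.0, None not in new_board

def q_update(state, action, reward, next_state, next_actions):
    """Standard tabular Q-learning update; invalid moves flow through it
    with their negative reward like any other transition."""
    best_next = max((q_table[(next_state, a)] for a in next_actions), default=0.0)
    q_table[(state, action)] += ALPHA * (reward + GAMMA * best_next - q_table[(state, action)])
```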
I hope I have addressed your perplexity. Good luck with your final project and exam!
Hi Davide 👋,
I just finished reviewing your project, and I must say, I am thoroughly impressed! Here are my thoughts:
Theoretical Introduction
Your introduction to Q-Learning and Monte Carlo strategies is commendable. It provides a solid foundation for understanding the rest of your project.
Code Organization
I really appreciated the idea of using an abstract class for different player strategies. As I stated in my notebook, I was so inspired by this approach that I've adopted the same organization in my own project!
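For readers who haven't opened the notebook, the pattern is roughly the following (class and method names here are illustrative, not the ones from Davide's code):

```python
from abc import ABC, abstractmethod
import random

class Player(ABC):
    """Common interface for all strategies, so the game loop can treat a
    random player, a Q-learning agent, or a Monte Carlo agent uniformly."""

    @abstractmethod
    def choose_move(self, state, valid_moves):
        """Return one element of `valid_moves` for the given state."""

class RandomPlayer(Player):
    def choose_move(self, state, valid_moves):
        return random.choice(valid_moves)
```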
Comprehensive Comments
Your comments make it easy to understand the purpose and functionality of each section, which is incredibly helpful not just for reviewers like me, but for anyone who wishes to learn from or build upon your work.
Statistical Analysis and Visualization
The function you implemented to collect game statistics is very useful. It provides valuable insights into the performance of the strategies over time.
However, I suggest adding a graph to visually represent the training trend over time, including wins, losses, and draws. This would not only add to the visual appeal but also make the learning process and performance trends more immediately evident.
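Something along these lines would do (a sketch assuming the statistics are stored as one outcome string per training game; the function name and data layout are my own, not from the project):

```python
import matplotlib.pyplot as plt

def plot_training_trend(outcomes, window=100):
    """outcomes: list of 'win'/'loss'/'draw' strings, one per training game.
    Plots the rolling fraction of each outcome over a sliding window."""
    for label in ("win", "loss", "draw"):
        rates = [
            sum(o == label for o in outcomes[i - window:i]) / window
            for i in range(window, len(outcomes) + 1)
        ]
        plt.plot(range(window, len(outcomes) + 1), rates, label=label)
    plt.xlabel("training games")
    plt.ylabel(f"rate over last {window} games")
    plt.legend()
    plt.show()
```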
Impressive Results
The results you've achieved with both players are fantastic. It's clear that your strategies are effective and well-implemented.
Comparative Analysis
Your final comparison between the two strategies (Q-Learning and Monte Carlo) is a great way to wrap up your project. It gives a comprehensive view of the strengths and weaknesses of each approach and provides valuable insights into their practical applications.
A Small Perplexity
The only area where I have some reservations is the negative reward for invalid moves. I understand the reasoning behind it, but I wonder whether it might be more efficient to avoid invalid moves altogether with a preliminary check. Of course, this would be a more deterministic, rule-based approach, but I think it could streamline the learning process by preventing the agent from exploring obviously undesirable actions, given the well-defined rules of the game. However, I acknowledge that I might be missing some aspects of your strategy here.
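Concretely, I was imagining something like the following epsilon-greedy selection restricted to legal moves (the board representation, Q-table layout, and exploration rate are assumptions on my part):

```python
import random

EPSILON = 0.1  # assumed exploration rate

def choose_move_masked(q_table, state, board):
    """Select an action among *legal* moves only: illegal squares are
    filtered out before selection, so the agent never needs a penalty
    to learn the rules of the game."""
    valid_moves = [i for i, cell in enumerate(board) if cell is None]
    if random.random() < EPSILON:
        return random.choice(valid_moves)
    return max(valid_moves, key=lambda a: q_table.get((state, a), 0.0))
```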
Overall, your project demonstrates your skills and understanding of the topic. Awesome job! ✨