The code is very well written and easy to follow, thanks to the accurate comments and descriptions.
Handling terminal conditions with a hybrid of ES and rule-based AI is a smart approach that I believe improves the performance of the algorithm.
Also, the linear convergence of the tweak_factor should help the search reach a better solution.
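To make this point concrete, here is a minimal sketch of what a linearly changing tweak_factor could look like. It is not taken from your code: the function name `linear_tweak_factor` and the `start` and `end` values are assumptions chosen only for illustration.

```python
def linear_tweak_factor(generation, n_generations, start=0.5, end=0.05):
    """Interpolate linearly from `start` (first generation) to `end` (last).

    `start` and `end` are hypothetical values, not the ones used in the lab.
    """
    if n_generations <= 1:
        return end
    # Clamp the generation index so out-of-range values stay within the schedule.
    g = min(max(generation, 0), n_generations - 1)
    progress = g / (n_generations - 1)
    return start + (end - start) * progress


if __name__ == "__main__":
    for gen in (0, 5, 10):
        print(gen, linear_tweak_factor(gen, n_generations=11))
```

Large tweaks early on encourage exploration, while the smaller tweaks near the end let the search refine the individuals it has already found.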
Selecting among a set of simple strategies seems like a good idea, but I think it could be improved by assigning separate selection probabilities to different stages of the game (for example, when the nim_sum is zero or non-zero, or, at a higher computational cost, for each game state).
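A rough sketch of the stage-dependent idea, under my own assumptions: the three strategies below, the `(row, count)` move format, and the genome layout (one weight vector per stage) are hypothetical and only illustrate the suggestion, they are not your implementation.

```python
import random
from functools import reduce
from operator import xor


def nim_sum(rows):
    """XOR of all row sizes."""
    return reduce(xor, rows, 0)


# Hypothetical simple strategies; each returns a move as (row_index, count).
# All of them assume the game is not over (at least one non-empty row).
def take_one_from_largest(rows, rng):
    row = max(range(len(rows)), key=lambda r: rows[r])
    return row, 1


def empty_smallest(rows, rng):
    non_empty = [r for r in range(len(rows)) if rows[r] > 0]
    row = min(non_empty, key=lambda r: rows[r])
    return row, rows[row]


def random_move(rows, rng):
    row = rng.choice([r for r in range(len(rows)) if rows[r] > 0])
    return row, rng.randint(1, rows[row])


STRATEGIES = [take_one_from_largest, empty_smallest, random_move]


def choose_move(rows, genome, rng=random):
    """Pick a strategy using the weight vector evolved for the current stage."""
    stage = "zero" if nim_sum(rows) == 0 else "nonzero"
    strategy = rng.choices(STRATEGIES, weights=genome[stage], k=1)[0]
    return strategy(rows, rng)


if __name__ == "__main__":
    # One weight vector per stage; the ES would evolve these numbers.
    genome = {"zero": [0.2, 0.3, 0.5], "nonzero": [0.6, 0.3, 0.1]}
    print(choose_move([1, 3, 5, 7], genome))
```

The genome grows from one weight vector to two, so the search space stays small; the per-state variant would need one vector per reachable state, which is where the extra computational cost comes from.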
The high selective pressure may favor exploitation over exploration.
Hi Vincenzo,
Thank you for your feedback! I will certainly try to follow your tips and put more effort into developing and benchmarking variations in my future work!
Review regarding Lab02 - Nim:
Best Regards, Vincenzo Micciche'