Review of the lab n.2 (Nim Game) - made by Ivan Magistro Contenta, s314356

Hi Matteo! I want to give you my opinions and advice about your work (lab 2, on the Nim game). Sorry, they are in no particular order.

Regarding the README.md:
- It is very detailed and really useful for understanding the reasoning and the code. You studied the game thoroughly, including its strategies and mathematical concepts such as the Nim sum (computed with the XOR operator).
- I had some trouble understanding what Player 0 and Player 1 do, but the code helped me see what you mean: Player1 is your adaptive player, with its set of rules and their corresponding weights, while Player0 switches every so many games among optimal, pure_random and gabriele. So the match is between an already implemented player (Player0) and the player you implemented (Player1).
- I appreciate that you focused mainly on one Evolution Strategy rather than implementing all the variants, in order to understand how (1+lambda)-ES, self-adaptation and other aspects work.
- I didn't understand why you used this type of mutation step. As you said, you begin with exploration, and then a function takes the maximum of two values. But it seems to be a purely decreasing function, so you only move from exploration to exploitation, with no possibility of going back to exploration. Moreover, it seems to be a function of the number of epochs, so you assume that after a high number of epochs you never need to return to exploration.
- I appreciate the plots showing how your best individual improves over the epochs.
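As a side note, this is how I read the Nim sum idea from your README. It is only a minimal sketch: the function names are mine, not taken from your code, and it assumes normal play (whoever takes the last object wins) with no limit on how many objects can be removed per move, so your lab's rules may differ.

```python
from functools import reduce
from operator import xor


def nim_sum(piles):
    """XOR of all pile sizes (the Nim sum)."""
    return reduce(xor, piles, 0)


def winning_moves(piles):
    """Return (pile_index, objects_to_remove) moves that leave a Nim sum of 0.

    Assumption: normal play, any number of objects may be taken from one pile.
    """
    total = nim_sum(piles)
    moves = []
    for i, size in enumerate(piles):
        target = size ^ total  # size this pile must shrink to
        if target < size:      # a move can only remove objects
            moves.append((i, size - target))
    return moves


if __name__ == "__main__":
    print(nim_sum([1, 3, 5]))        # Nim sum of the position
    print(winning_moves([1, 3, 5]))  # moves that bring the Nim sum to 0
```

Under these assumptions, an empty list from winning_moves means that every available move leaves a non-zero Nim sum for the opponent.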
Regarding the code:
- The flow of the code is clear and easy to follow.
- I appreciate that you use the Roulette Wheel. I have a piece of advice (or a question): if you shuffled the list of actions, you might select different actions. As it stands, it seems that the first ones are chosen most often, because the voting() function picks the action whose weight, added to the current_sum_weight, is higher than the random threshold (between 0 and 1).
- The references you included could be very useful. Great!

I hope you can reply to my message, so that I can better understand your approach and implementation. :)

Hi Ivan, thanks for your review. So, how am I using the mutation rate?

Starting from this line of code: offspring = [np.clip(np.random.normal(loc=0, scale=σ, size=len(current_solution)) + best_solution, 0.01, None) for _ in range(λ)]

As you can see, I start from the current best solution to generate the offspring. Yes, the mutation step decreases linearly over the epochs, so I start with a higher value for sigma and take small steps towards the end. Why do I want this kind of behavior? I think it is important to explore at the beginning, and once the algorithm is converging on a region of promising solutions, it is no longer appropriate to take large steps. At that point it is probably better to search around that area by taking small steps.
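To make the idea concrete, here is a minimal sketch of a (1+λ)-ES with a linearly decreasing σ. It is not the exact code from the lab: the function name, the toy fitness, and the parameters sigma_start and sigma_end are assumptions for illustration, and it uses a seeded NumPy Generator instead of np.random.normal so that runs are reproducible. Only the offspring line mirrors the one quoted above.

```python
import numpy as np


def one_plus_lambda_es(fitness, x0, n_epochs=100, lam=20,
                       sigma_start=1.0, sigma_end=0.01, seed=0):
    """(1+lambda)-ES sketch with a linearly decaying mutation step.

    All parameter names and default values are hypothetical.
    """
    rng = np.random.default_rng(seed)
    best = np.asarray(x0, dtype=float)
    best_fit = fitness(best)
    history = []  # (sigma, best fitness) for each epoch

    for epoch in range(n_epochs):
        # Linear decay: large steps early (exploration), small steps late.
        frac = epoch / max(n_epochs - 1, 1)
        sigma = sigma_start + (sigma_end - sigma_start) * frac

        # Offspring are Gaussian perturbations of the current best,
        # clipped so that every weight stays at or above 0.01.
        offspring = [
            np.clip(best + rng.normal(0.0, sigma, size=best.shape), 0.01, None)
            for _ in range(lam)
        ]
        fits = [fitness(child) for child in offspring]

        # "Plus" selection: the parent survives unless a child is at least as good.
        i = int(np.argmax(fits))
        if fits[i] >= best_fit:
            best, best_fit = offspring[i], fits[i]
        history.append((sigma, best_fit))

    return best, best_fit, history


if __name__ == "__main__":
    # Toy fitness (an assumption): negative squared distance to a target vector.
    target = np.array([2.0, 0.5, 1.0])
    best, best_fit, _ = one_plus_lambda_es(
        lambda w: -np.sum((w - target) ** 2), np.ones(3))
    print(best, best_fit)
```

In this sketch σ never increases again, which matches the behavior described in the review. A schedule that can return to exploration would need a different rule, for example one that adapts σ to the observed success rate.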

About the roulette wheel: why did I choose this mechanism? Roulette wheel selection can facilitate the exploration of different solutions, since even individuals with relatively low fitness have a non-zero chance of being selected. I saw many implementations of the roulette wheel online (the link in the README is an example), and in my implementation I wrote this line of code: current_weight_sum += normalized_weight. Why? In short, current_weight_sum helps determine which action to select by simulating the rotation of the roulette wheel. Its progressive growth ensures that an action with a higher normalized weight has a higher probability of being selected.
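The mechanism can be sketched as follows. This is a simplified illustration, not the voting() function from the lab: roulette_pick, its arguments, and the example weights are hypothetical, and only the accumulation of normalized weights into current_weight_sum follows the line quoted above.

```python
import random
from collections import Counter


def roulette_pick(actions, weights, rng=random):
    """Pick one action with probability proportional to its weight."""
    total = sum(weights)
    threshold = rng.random()  # uniform in [0, 1)
    current_weight_sum = 0.0
    for action, weight in zip(actions, weights):
        current_weight_sum += weight / total  # add the normalized weight
        if current_weight_sum > threshold:
            return action
    return actions[-1]  # guard against floating-point round-off


if __name__ == "__main__":
    rng = random.Random(0)
    draws = 20_000
    counts = Counter(
        roulette_pick(["a", "b", "c"], [0.1, 0.3, 0.6], rng)
        for _ in range(draws)
    )
    # Observed frequencies should be close to the normalized weights.
    print({action: count / draws for action, count in counts.items()})
```

In this sketch each action occupies a slice of the [0, 1) interval as wide as its normalized weight, and the threshold is uniform, so the selection probability depends on the weight and not on the position in the list. Shuffling the actions together with their weights should therefore leave the selection frequencies unchanged.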
