This repository contains all the materials and documentation related to my experiences and projects in the Computational Intelligence course. As a student, I was deeply engaged with the course material, which explored various techniques and approaches for creating intelligent systems. The sections below collect the feedback received on the four lab tasks.
Task 1
I love the trial-and-error way in which you developed your fixed-rules solution. I am curious how the final version of count_and_decide would perform against the optimal strategy.
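Assuming Task 1 refers to the Nim game used in the later tasks, the "optimal strategy" would be the nim-sum (XOR) player. The sketch below is a minimal, hypothetical version of such an opponent that count_and_decide could be matched against; the function name and the heap-tuple state representation are assumptions, not code from this repository.

```python
# Minimal sketch of a nim-sum (XOR) optimal strategy for normal-play Nim,
# the kind of opponent count_and_decide could be matched against.
# The heap-tuple state representation is an assumption.
import random
from functools import reduce
from operator import xor

def optimal_strategy(heaps):
    nim_sum = reduce(xor, heaps)
    if nim_sum != 0:
        # Winning position: move to a state whose nim-sum is zero.
        for i, h in enumerate(heaps):
            target = h ^ nim_sum
            if target < h:
                return i, h - target
    # Losing position: every move loses, so fall back to a random legal move.
    i = random.choice([i for i, h in enumerate(heaps) if h > 0])
    return i, random.randint(1, heaps[i])
```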
Task 2
Your solution immediately reached a high fitness value and then stopped improving: the typical early-convergence problem. This can be easily solved by increasing the population size and using a (mu, lambda) EA with lambda > 5*mu. If you want even more exploration, you could also implement a quick diversity-promotion strategy such as extinction.
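A minimal sketch of what a (mu, lambda) scheme with lambda = 5*mu could look like is shown below. The genome encoding and the one-max fitness are placeholders rather than the actual Task 2 problem; they only illustrate comma selection, where parents are discarded and survivors are drawn from the offspring alone.

```python
# Sketch of a (mu, lambda) evolutionary algorithm with lambda = 5 * mu.
# Fitness and genome are illustrative placeholders (one-max on a bit string).
import random

MU = 20                 # number of parents kept each generation
LAMBDA = 5 * MU         # offspring per generation (lambda > 5*mu rule of thumb)
GENOME_LEN = 30
GENERATIONS = 100

def fitness(genome):
    # Placeholder fitness: count of ones; replace with the real evaluation.
    return sum(genome)

def mutate(genome, rate=1 / GENOME_LEN):
    # Bit-flip mutation with a small per-locus probability.
    return [1 - g if random.random() < rate else g for g in genome]

population = [[random.randint(0, 1) for _ in range(GENOME_LEN)] for _ in range(MU)]
for generation in range(GENERATIONS):
    # Each offspring is a mutated copy of a randomly chosen parent.
    offspring = [mutate(random.choice(population)) for _ in range(LAMBDA)]
    # Comma selection: parents are discarded, only the best mu offspring survive.
    offspring.sort(key=fitness, reverse=True)
    population = offspring[:MU]

best = max(population, key=fitness)
print(f"best fitness after {GENERATIONS} generations: {fitness(best)}")
```

Extinction-style diversity promotion could be layered on top of this loop, for example by replacing part of the population with freshly generated random individuals whenever the best fitness stagnates for several generations.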
Task 3
Great implementation, nothing to add. The strange phenomenon related to the number of heaps might be due to the horizon effect: try implementing a Monte Carlo tree search when the minmax algorithm reaches interesting nodes.
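One way to read this suggestion: when the depth-limited minmax reaches its horizon, estimate the node value with random playouts instead of a static heuristic. The hypothetical sketch below does exactly that for normal-play Nim (a simplified Monte Carlo evaluation rather than a full MCTS with a tree policy); the state representation and function names are assumptions, not code from the reviewed solution.

```python
# Depth-limited minmax for Nim where the cutoff value is estimated by random
# playouts, mitigating the horizon effect. Player 0 is the maximizer; the
# player taking the last object wins (normal play).
import random

def legal_moves(heaps):
    # A move removes 1..h objects from a non-empty heap i.
    return [(i, take) for i, h in enumerate(heaps) for take in range(1, h + 1)]

def apply_move(heaps, move):
    i, take = move
    return tuple(h - take if j == i else h for j, h in enumerate(heaps))

def random_playout(heaps, to_move):
    # Play random moves to the end; return +1 if player 0 wins, -1 otherwise.
    while any(heaps):
        heaps = apply_move(heaps, random.choice(legal_moves(heaps)))
        to_move = 1 - to_move
    return 1 if to_move == 1 else -1  # the previous player took the last object

def monte_carlo_value(heaps, to_move, n_playouts=20):
    return sum(random_playout(heaps, to_move) for _ in range(n_playouts)) / n_playouts

def minmax(heaps, to_move, depth):
    if not any(heaps):
        return 1 if to_move == 1 else -1
    if depth == 0:
        # Horizon reached: fall back to Monte Carlo rollouts instead of a heuristic.
        return monte_carlo_value(heaps, to_move)
    values = (minmax(apply_move(heaps, m), 1 - to_move, depth - 1)
              for m in legal_moves(heaps))
    return max(values) if to_move == 0 else min(values)

print(minmax((3, 4, 5), to_move=0, depth=3))
```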
Task 4
Given how the evaluate function works, the first player to move always uses the strategy passed as the first argument. Thus, if your RL agent always plays the best move, the optimal_strategy agent only ever faces losing positions: it can do nothing but fall back to a random action and will perform similarly to the pure_random agent. Congrats, your RL agent learned the optimal strategy!
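An evaluate function with the behaviour described above might look like the following hypothetical sketch: the strategy passed as the first argument always makes the first move, so the order of the arguments decides which agent gets the first-move advantage. The function names, starting position, and win convention are assumptions, not the reviewed code.

```python
# Sketch of an evaluate() helper in which the first argument always moves first.
import random

def pure_random(heaps):
    # Pick a non-empty heap and remove a random number of objects from it.
    i = random.choice([i for i, h in enumerate(heaps) if h > 0])
    return i, random.randint(1, heaps[i])

def evaluate(strategy_a, strategy_b, n_games=100, start=(3, 4, 5)):
    """Return the fraction of games won by strategy_a, which always moves first."""
    wins = 0
    for _ in range(n_games):
        heaps, players, turn = list(start), (strategy_a, strategy_b), 0
        while any(heaps):
            i, take = players[turn](tuple(heaps))
            heaps[i] -= take
            turn = 1 - turn
        # The player who just moved took the last object and wins (normal play).
        wins += 1 if turn == 1 else 0
    return wins / n_games

print(evaluate(pure_random, pure_random))
```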