As I put in the title, I was wondering if there is a simple way of having one of the agents just playing as best as possible (i.e. taking always the most likely action), with no exploration (and even no learning). This would be useful to check the real skill level that the agent has achieved so far.
As I put in the title, I was wondering if there is a simple way of having one of the agents just playing as best as possible (i.e. taking always the most likely action), with no exploration (and even no learning). This would be useful to check the real skill level that the agent has achieved so far.
Thanks!