laermannjan / nip-deeprl-project

Student project in deep reinforcement learning with the OpenAI Gym. We evaluated and analyzed how different model architectures performed as agents in various games.

Save (pickle) agent with highest reward rating #7

Closed laermannjan closed 7 years ago

laermannjan commented 7 years ago

Save (pickle) the agent with the highest reward rating in addition to the regular one saved at the 'end of training'. This would let us investigate questions such as whether the reward (even as a mean over past episodes) gives a qualitative indication of the agent's performance. It might be interesting to see how the 'best agent' competes against the one from the (arbitrary) end of training in a test environment (one without learning or exploration). The comparison would probably be qualitative and relatively subjective: we could try to examine the complexity of the strategies or their similarity to human strategies (e.g. by playing the game ourselves with play.py).
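
A minimal sketch of the idea, not the code from the referenced commit: keep a rolling mean over recent episode rewards and pickle the agent whenever that mean reaches a new high. The names `BestAgentSaver`, `path`, and `window` are hypothetical, and the sketch assumes the agent object is picklable; an agent that wraps framework state (for example a session or graph) may need its own save routine instead.

```python
import pickle
from collections import deque


class BestAgentSaver:
    """Pickle the agent whenever the rolling mean reward reaches a new high.

    Hypothetical helper for illustration; not part of the project.
    """

    def __init__(self, path, window=100):
        self.path = path                      # assumed output file for the best agent
        self.rewards = deque(maxlen=window)   # most recent episode rewards
        self.best_mean = float("-inf")

    def update(self, agent, episode_reward):
        """Record one episode reward; return True if the agent was saved."""
        self.rewards.append(episode_reward)
        # Wait for a full window so a single lucky early episode does not win.
        if len(self.rewards) < self.rewards.maxlen:
            return False
        mean = sum(self.rewards) / len(self.rewards)
        if mean > self.best_mean:
            self.best_mean = mean
            with open(self.path, "wb") as f:
                pickle.dump(agent, f)
            return True
        return False
```

Calling `saver.update(agent, episode_reward)` once per finished episode would leave the best-so-far agent on disk next to the regular end-of-training save, so both can later be loaded into the test environment for comparison.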

laermannjan commented 7 years ago

Done in de979493ea4d84c23039ff87ca1623fe463a1d24