Closed justheuristic closed 7 years ago
Okay, so we gonna have experiment description like this minimal example and hopefully somewhat less crutchy.
There's some skeleton of the base player process you may want to start from:
Currently it only plays the base game with current weights, does not try to load new ones, does not log anything and probably may become much quicker.
I suggest starting with loading new weights once in a while (you may need a parameter for this).
Btw pls write something if you're still there :)
A process that
Basically do that but with an agent and all the trouble that comes with it. Example agent setup for mountaincar.