yandexdataschool / tinyverse

Universe RL trainer platform. Simple. Supple. Scalable.

Player process 1.0 #3

Closed by justheuristic 7 years ago

justheuristic commented 7 years ago

A process that

Basically do that but with an agent and all the trouble that comes with it. Example agent setup for mountaincar.

justheuristic commented 7 years ago

Okay, so we're going to have an experiment description like this minimal example, hopefully somewhat less crutchy.

There's some skeleton of the base player process you may want to start from:

Currently it only plays the base game with the current weights: it does not try to load new ones, does not log anything, and could probably be made much quicker.

I suggest starting with loading new weights once in a while (you may need a parameter for this).
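A rough sketch of what periodic reloading could look like, not the actual tinyverse code. Everything named here (`ToyEnv`, `ToyAgent`, `run_player`, `load_weights`, `reload_period`) is hypothetical and only illustrates the idea of a parameter that controls how often the player refreshes its weights:

```python
import random


class ToyEnv:
    """Stand-in environment (hypothetical); a real player would wrap the actual game."""

    def __init__(self, episode_len=5):
        self.episode_len = episode_len
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0  # initial observation

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_len
        return float(self.t), 1.0, done  # observation, reward, done


class ToyAgent:
    """Stand-in agent (hypothetical) that acts randomly and tracks a weights version."""

    def __init__(self):
        self.weights_version = 0

    def act(self, observation):
        return random.choice([0, 1])


def run_player(env, agent, load_weights, n_episodes, reload_period=10):
    """Play n_episodes, calling load_weights(agent) every reload_period episodes."""
    episode_rewards = []
    for episode in range(n_episodes):
        # Refresh weights before episodes 0, reload_period, 2 * reload_period, ...
        if episode % reload_period == 0:
            load_weights(agent)
        observation, done, total = env.reset(), False, 0.0
        while not done:
            observation, reward, done = env.step(agent.act(observation))
            total += reward
        episode_rewards.append(total)
    return episode_rewards
```

Here `load_weights` is any callable that takes the agent and updates it in place, so where the weights actually come from (a file, a database, another process) stays outside the loop. Reloading per episode rather than per step is just one possible choice for the sketch.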

Btw pls write something if you're still there :)

justheuristic commented 7 years ago

Merged https://github.com/yandexdataschool/tinyverse/commit/c92a8d0bce48a7c33aa74633028138fc9059d290