Player process 1.0

justheuristic commented 7 years ago

A process that

- compiles the agent
- takes the stored DQN params every once in a while
  - if there are no params, puts them there
- interacts :) (cpu)
- stores the interactions in the database
- all interactions are of the same length (e.g. 10 ticks)
- loads new weights every once in a while
- does not break everything if restarted at a random point in time
- does not erase old sessions
- does not intentionally break down if there are several such processes running in parallel :)
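A rough sketch of how the requirements above might fit together, not the actual tinyverse code. Everything here is an assumption made for illustration: `InMemoryStore` is a hypothetical dict-backed stand-in for the shared database, `play_session` is a toy placeholder for real agent-environment interaction, and the parameter vector is just a list of floats.

```python
import random
import uuid

SESSION_LENGTH = 10  # ticks per stored session, as in the "10 ticks" example


class InMemoryStore:
    """Hypothetical stand-in for the shared database (dict-backed)."""

    def __init__(self):
        self.params = None
        self.sessions = {}

    def get_or_init_params(self, default_params):
        # If there are no params yet, put the local ones there.
        if self.params is None:
            self.params = list(default_params)
        return list(self.params)

    def add_session(self, session):
        # A fresh random key per session: nothing is overwritten, so restarts
        # and parallel players do not erase old sessions.
        key = uuid.uuid4().hex
        self.sessions[key] = session
        return key


def play_session(params, rng, n_ticks=SESSION_LENGTH):
    """Toy interaction: one (observation, action, reward) tuple per tick."""
    session = []
    for _ in range(n_ticks):
        observation = rng.random()
        action = int(observation * params[0] > 0.5)  # placeholder policy
        reward = float(action)
        session.append((observation, action, reward))
    return session


def run_player(store, initial_params, n_sessions, seed=0):
    """Fetch (or initialise) params, then play and store fixed-length sessions."""
    rng = random.Random(seed)
    params = store.get_or_init_params(initial_params)
    for _ in range(n_sessions):
        store.add_session(play_session(params, rng))
    return params
```

Calling `run_player` twice on the same store imitates a restart: the second call reuses the stored params and only adds new sessions. With a real shared database, the "put params if missing" step would presumably need to be an atomic set-if-absent so that parallel players do not race on it.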

Basically do that but with an agent and all the trouble that comes with it. Example agent setup for mountaincar.

justheuristic commented 7 years ago

Okay, so we're going to have an experiment description like this minimal example, hopefully somewhat less crutchy.

There's some skeleton of the base player process you may want to start from:

https://github.com/yandexdataschool/tinyverse/blob/master/player.py
how to run it
see the gpu server location in gitter

Currently it only plays the base game with the current weights, does not try to load new ones, does not log anything, and could probably be made much faster.

I suggest starting with loading new weights once in a while (you may need a parameter for this).
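One possible shape for this, as a sketch only: `play_with_reloads`, `load_weights`, `play_once` and the `reload_period` parameter are hypothetical names, not part of the existing player.py. The weights are fetched once at startup and then again every `reload_period` iterations.

```python
def play_with_reloads(load_weights, play_once, n_iterations, reload_period=10):
    """Call play_once(weights) n_iterations times, refreshing the weights
    via load_weights() every reload_period iterations."""
    if reload_period < 1:
        raise ValueError("reload_period must be a positive integer")

    weights = load_weights()  # initial load before the first game
    for i in range(n_iterations):
        # Reload at iterations reload_period, 2 * reload_period, ...
        if i > 0 and i % reload_period == 0:
            weights = load_weights()
        play_once(weights)
```

Passing the loader and the game step in as callables keeps the reload schedule independent of where the weights actually live, so the same loop works against a file, a database, or a test stub.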

Btw pls write something if you're still there :)

justheuristic commented 7 years ago

Merged https://github.com/yandexdataschool/tinyverse/commit/c92a8d0bce48a7c33aa74633028138fc9059d290

yandexdataschool / tinyverse

Player process 1.0 #3