nottombrown / rl-teacher

Code for Deep RL from Human Preferences [Christiano et al.], plus a webapp for collecting human feedback.
MIT License

Changes to webapp to allow debugging of upload problems and providing feedback out of sequence #39

Closed mixuala closed 6 years ago

mixuala commented 6 years ago

I'm still having problems getting the agent to complete a learning task from human feedback. Right now I'm stuck on https://github.com/nottombrown/rl-teacher/issues/38, which keeps me from offering feedback after about 3000 sec.

But I made a few changes to the webapp to make it easier to debug uploads and to provide feedback out of sequence.

mixuala commented 6 years ago

I added a new args param to serve videos from the local server. By skipping the Google Cloud uploads, I was able to complete a training run with 600 feedbacks before my video rendering processes died.
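
For readers who want a feel for this kind of switch, here is a minimal sketch of how a flag could choose between a locally served video URL and a Google Cloud Storage URL. It is not the code from this PR: the flag names `--serve-local`, `--local-media-url`, and `--bucket`, the default values, and the helper `video_url` are all hypothetical, and the actual upload step is left out.

```python
import argparse
import os


def build_parser():
    """Build a parser with a hypothetical switch for local video serving."""
    parser = argparse.ArgumentParser(description="Video URL selection sketch")
    # Hypothetical flag names; the real parameter added in the PR is not shown in this thread.
    parser.add_argument("--serve-local", action="store_true",
                        help="serve rendered clips from the local server, skipping cloud uploads")
    parser.add_argument("--local-media-url", default="http://localhost:8000/media",
                        help="assumed base URL where the local server exposes clips")
    parser.add_argument("--bucket", default="example-bucket",
                        help="assumed Google Cloud Storage bucket name")
    return parser


def video_url(local_path, args):
    """Return the URL the webapp would use to play the clip at local_path."""
    filename = os.path.basename(local_path)
    if args.serve_local:
        # Local mode: no upload, the clip is served straight from disk.
        return f"{args.local_media_url.rstrip('/')}/{filename}"
    # Cloud mode: public object URL; the upload itself is omitted from this sketch.
    return f"https://storage.googleapis.com/{args.bucket}/{filename}"


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(video_url("/tmp/clips/segment-0001.mp4", args))
```

Keeping the decision in one small function means the rest of the pipeline only ever sees a URL, so the cloud upload can be bypassed while debugging without touching the feedback-collection code.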

nottombrown commented 6 years ago

Hey @mixuala, I'd like to keep this version of the code simple and not add new features to it.

However, I'd bet that many people would find your code useful. Perhaps you could put together a description of the fork that you're working on, and then we can link to it from the Extensions section of the README.