google-deepmind / lab

A customisable 3D platform for agent-based AI research

record human demonstration and use it in RL #47

Closed paive closed 6 years ago

paive commented 7 years ago

Hi: I'm trying to use human input to control the agent, and I want to record this process, including the state of the environment, the agent's actions, and the rewards. What should I do with DeepMind Lab?

ghost commented 7 years ago

You can do this pretty straightforwardly if you use one of the API interfaces for DeepMind Lab. Just build an interface that a human can use to navigate within the environment. At each step, your human agent will output an action vector (the 7-element ndarray detailed in the /docs directory); from the environment, you'll receive a state representation (with contents depending on the observation types you requested when initializing the environment) and a reward. All you need to do at that point is store this information (along with the time-step, or just the step index) somewhere as you receive it. You shouldn't have any timing issues here (unless you use a really slow storage method), since with a human agent navigating the environment you'll most likely only have one process running at a time.

Without knowing the specifics of your imitation learning/inverse RL/etc. model, I'd venture that this would be sufficient data for your model to learn from.
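The recording loop described above can be sketched roughly as follows. The `Trajectory` class and its `record`/`save` methods are illustrative names, not part of the DeepMind Lab API, and the commented-out environment loop assumes a built `deepmind_lab` Python module and some `read_human_action()` input handler you'd write yourself:

```python
import numpy as np

class Trajectory:
    """Hypothetical recorder for (step, observation, action, reward) tuples."""

    def __init__(self):
        self.steps = []

    def record(self, step, observation, action, reward):
        # Copy the arrays so in-place mutation by the env loop
        # doesn't corrupt earlier entries.
        self.steps.append({
            'step': step,
            'observation': {k: np.copy(v) for k, v in observation.items()},
            'action': np.copy(action),
            'reward': reward,
        })

    def save(self, path):
        # Store the whole list of dicts as a single object array.
        np.savez_compressed(path, steps=np.array(self.steps, dtype=object))

# Sketch of the human-demonstration loop (requires a built DeepMind Lab):
#
#   import deepmind_lab
#   env = deepmind_lab.Lab('seekavoid_arena_01', ['RGB_INTERLEAVED'])
#   env.reset()
#   traj = Trajectory()
#   step = 0
#   while env.is_running():
#       action = read_human_action()   # your input handler ->
#                                      # 7-element integer action vector
#       obs = env.observations()
#       reward = env.step(action, num_steps=1)
#       traj.record(step, obs, action, reward)
#       step += 1
#   traj.save('demo.npz')
```

With a human controlling the agent the loop runs at interactive speed, so appending to an in-memory list and compressing to disk at the end is usually fast enough.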

tkoeppe commented 6 years ago

knyte's answer sounds reasonable. You can use the API to create a human interface and also save a trace of actions and rewards.