On-Policy Robot Imitation Learning from a Converging Supervisor

arXiv: https://arxiv.org/abs/1907.03423
citations: https://scholar.google.com/scholar?oi=bibs&hl=en&cites=17339703262979218293
authors:

http://www.jonathannlee.com
https://abalakrishna123.github.io

6.2 Physical Robot Experiments

We also test CSF with a neural network policy on a physical da Vinci Surgical Robot (dVRK) [32] to evaluate its performance on multi-goal tasks, where the end effector must be controlled to desired positions in the workspace. We evaluate the CSF learner/supervisor and PETS on the physical robot for both single-arm and double-arm versions of this task, and find that the CSF learner is able to track the PETS supervisor effectively (Figure 2) and provide up to a 22x speedup in policy query time (Table 1). We expect the CSF learner to demonstrate significantly greater speedups relative to standard deep MBRL for higher-dimensional tasks and for systems where higher-frequency commands are possible.
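The policy query time comparison in the excerpt above can be illustrated with a toy sketch. This is not the paper's implementation. It assumes a random-shooting planner as a stand-in for the PETS supervisor, a linear least-squares learner in place of the neural network policy, and made-up point-mass dynamics (`next_state = state + 0.1 * action`). All function names (`supervisor_action`, `fit_learner`, `mean_query_time`) are hypothetical.

```python
import time
import numpy as np

rng = np.random.default_rng(0)
STEP = 0.1  # assumed toy dynamics: next_state = state + STEP * action


def supervisor_action(state, goal, n_candidates=256, horizon=5):
    """Toy random-shooting planner (stand-in for a planning-based supervisor)."""
    # Sample candidate action sequences with shape (n_candidates, horizon, dim).
    seqs = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon, state.shape[0]))
    # Roll every candidate through the toy dynamics.
    states = state + STEP * np.cumsum(seqs, axis=1)
    # Cost: summed distance to the goal over the planning horizon.
    costs = np.linalg.norm(states - goal, axis=2).sum(axis=1)
    # Return the first action of the cheapest sequence.
    return seqs[np.argmin(costs), 0]


def fit_learner(n_iters=3, episodes=5, steps=10, dim=3):
    """On-policy imitation: roll out the learner, label visited states with the supervisor."""
    W = np.zeros((2 * dim, dim))
    X, Y = [], []
    for _ in range(n_iters):
        for _ in range(episodes):
            state = rng.uniform(-1.0, 1.0, dim)
            goal = rng.uniform(-1.0, 1.0, dim)
            for _ in range(steps):
                feat = np.concatenate([state, goal])
                X.append(feat)
                Y.append(supervisor_action(state, goal))  # supervisor label
                # The learner's own action drives the rollout (on-policy data).
                state = state + STEP * np.clip(feat @ W, -1.0, 1.0)
        # Refit the linear policy on all data aggregated so far.
        W, *_ = np.linalg.lstsq(np.asarray(X), np.asarray(Y), rcond=None)
    return W


def mean_query_time(fn, n=50):
    """Average wall-clock time of one call to fn, in seconds."""
    start = time.perf_counter()
    for _ in range(n):
        fn()
    return (time.perf_counter() - start) / n


W = fit_learner()
state, goal = np.zeros(3), np.full(3, 0.5)
t_sup = mean_query_time(lambda: supervisor_action(state, goal))
t_learn = mean_query_time(lambda: np.concatenate([state, goal]) @ W)
print(f"supervisor: {t_sup * 1e6:.1f} us, learner: {t_learn * 1e6:.1f} us, "
      f"speedup: {t_sup / t_learn:.1f}x")
```

The printed speedup depends on the machine and on the planner settings (`n_candidates`, `horizon`), so it should not be compared with the numbers in Table 1. The sketch only shows the mechanism: the supervisor plans at every query, while the learner needs a single forward evaluation.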
