6.2 Physical Robot Experiments
We also test CSF with a neural network policy on a physical da Vinci Surgical Robot (dVRK) [32] to
evaluate its performance on multi-goal tasks where the end effector must be controlled to desired
positions in the workspace. We evaluate the CSF learner/supervisor and PETS on the physical robot
for both single-arm and double-arm versions of this task, and find that the CSF learner tracks the
PETS supervisor effectively (Figure 2) and provides up to a 22x speedup in policy query time
(Table 1). We expect the CSF learner to demonstrate significantly greater speedups relative to standard deep
MBRL for higher-dimensional tasks and for systems where higher-frequency commands are possible.
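To make the query-time comparison concrete, the following is a minimal sketch of how such a speedup could be measured. It is not the measurement procedure used in these experiments, which this section does not describe. The functions `learner_policy` and `supervisor_policy` are hypothetical toy stand-ins (a single cheap pass versus a planner that scores many candidates per query), not CSF or PETS, and the choice to report a ratio of mean wall-clock times is an assumption.

```python
import time
from statistics import mean


def time_queries(policy, states):
    """Return the wall-clock time in seconds of each call policy(state)."""
    times = []
    for state in states:
        start = time.perf_counter()
        policy(state)
        times.append(time.perf_counter() - start)
    return times


def speedup(supervisor_times, learner_times):
    """Ratio of mean supervisor query time to mean learner query time."""
    return mean(supervisor_times) / mean(learner_times)


def learner_policy(state):
    # Hypothetical stand-in for a learned policy: one cheap pass that
    # commands a step toward the goal (assumed to be the origin).
    return [-0.5 * x for x in state]


def supervisor_policy(state, n_candidates=200, horizon=10):
    # Hypothetical stand-in for a planning-based supervisor: score many
    # candidate gains by rolling out toy dynamics x <- x - k * x and
    # return the action given by the lowest-cost gain.
    best_gain, best_cost = 0.0, float("inf")
    for i in range(n_candidates):
        gain = i / n_candidates
        x = list(state)
        cost = 0.0
        for _ in range(horizon):
            x = [xi - gain * xi for xi in x]
            cost += sum(abs(xi) for xi in x)
        if cost < best_cost:
            best_gain, best_cost = gain, cost
    return [-best_gain * xi for xi in state]


# Arbitrary 3-D query positions, chosen only for illustration.
states = [[0.1 * i, -0.05 * i, 0.02 * i] for i in range(1, 51)]
supervisor_times = time_queries(supervisor_policy, states)
learner_times = time_queries(learner_policy, states)
print(f"mean speedup: {speedup(supervisor_times, learner_times):.1f}x")
```

The printed ratio depends on the machine and on the toy planner's candidate count and horizon, so it says nothing about the values in Table 1. The sketch only illustrates why a planner that evaluates many rollouts per query is slower to query than a policy that needs a single pass.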
On-Policy Robot Imitation Learning from a Converging Supervisor
arXiv: https://arxiv.org/abs/1907.03423
citations: https://scholar.google.com/scholar?oi=bibs&hl=en&cites=17339703262979218293