Suggestion to add how to implement pre-trained policies.

rlworkgroup / garage

A toolkit for reproducible reinforcement learning research.

MIT License

I would recommend comparing the observation distributions of your real-world environment and your simulated environment. The most obvious difficulties in sim2real transfer come from a mismatch between the two. Note that the docs already describe how to use a pre-trained policy. Continuing to train after a sim2real transfer is an active area of research, and I don't have a firm recommendation for how to achieve it. In particular, a Q function describing a policy's behavior in simulation is likely to over-estimate the performance of that policy in the real environment.
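As a rough sketch of that comparison, assuming you have logged observations from both environments as arrays of shape `(num_samples, obs_dim)`, something like the following could flag the dimensions that differ most. It uses only NumPy; `observation_shift` and `ks_statistic` are hypothetical helper names and not part of garage, and the data here is synthetic, standing in for real logs.

```python
import numpy as np


def observation_shift(sim_obs, real_obs, eps=1e-8):
    """Per-dimension mean shift of real observations vs. simulated ones.

    Both inputs have shape (num_samples, obs_dim). The result has shape
    (obs_dim,): the absolute difference of the means, expressed in units
    of the simulated standard deviation.
    """
    sim_obs = np.asarray(sim_obs, dtype=float)
    real_obs = np.asarray(real_obs, dtype=float)
    sim_mean = sim_obs.mean(axis=0)
    sim_std = sim_obs.std(axis=0)
    return np.abs(real_obs.mean(axis=0) - sim_mean) / (sim_std + eps)


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic for 1-D samples a and b.

    This is the largest gap between the two empirical CDFs, between 0
    (identical samples) and 1 (completely separated samples).
    """
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))


# Synthetic stand-in data (an assumption for illustration): the "real"
# observations match simulation except for an offset in dimension 2.
rng = np.random.default_rng(0)
sim = rng.normal(0.0, 1.0, size=(5000, 3))
real = rng.normal(0.0, 1.0, size=(5000, 3))
real[:, 2] += 1.0

shift = observation_shift(sim, real)
ks = [ks_statistic(sim[:, d], real[:, d]) for d in range(sim.shape[1])]
worst = int(np.argmax(shift))
print("per-dimension shift:", shift)
print("per-dimension KS statistic:", ks)
print("most shifted dimension:", worst)
```

The mean shift is cheap and easy to read, while the KS statistic also picks up differences in spread or shape that leave the mean unchanged. Both only look at one dimension at a time, so they will miss mismatches that show up only in correlations between dimensions.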

Suggestion to add how to implement pre-trained policies. #2325