Open benelot opened 6 years ago
Agreed. I am also looking for implementations of the A2C algorithm on MuJoCo environments. It would be very helpful if someone could share the pre-trained models.
It also seems that the baselines algorithms are not compatible with action spaces that are not defined by integers (e.g. the continuous action space of the FetchReach robot).
@jeremyf21 most of the baselines algorithms are applicable to both continuous and discrete action spaces, but the policies themselves (such as CnnPolicy, MlpPolicy, etc.) may not be. This is partially addressed in this PR: https://github.com/openai/baselines/pull/385/files which makes the policies in the ppo2 submodule compatible with both continuous action spaces (gym.spaces.Box) and discrete action spaces (gym.spaces.Discrete). We have not implemented similar logic for multi-discrete action spaces yet, but it is coming. Back to the original question, though: I think it's a good idea to publish those models if we still have them somewhere, but I am not sure we'll publish them specifically in the baselines repo. I'll post here if I find out more.
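To illustrate the distinction above between an algorithm and its policy, here is a minimal sketch of how a policy head could branch on the action space type. It is not the code from the PR. `Discrete` and `Box` are stand-in classes that only mirror the names of `gym.spaces.Discrete` and `gym.spaces.Box`, and `output_size` and `sample_action` are hypothetical helpers. The sketch assumes a categorical distribution for discrete actions and a diagonal Gaussian (means followed by log standard deviations) for continuous ones.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Sequence, Union


@dataclass
class Discrete:
    """Stand-in for gym.spaces.Discrete: n integer actions 0..n-1."""
    n: int


@dataclass
class Box:
    """Stand-in for gym.spaces.Box: a real vector with per-dimension bounds."""
    low: List[float]
    high: List[float]


def output_size(space: Union[Discrete, Box]) -> int:
    """Number of outputs the policy network needs for this space."""
    if isinstance(space, Discrete):
        return space.n                 # one logit per action
    if isinstance(space, Box):
        return 2 * len(space.low)      # a mean and a log-std per dimension
    # Other spaces (e.g. multi-discrete) are deliberately left unhandled.
    raise NotImplementedError(f"unsupported action space: {space!r}")


def sample_action(space: Union[Discrete, Box],
                  outputs: Sequence[float],
                  rng: random.Random):
    """Turn raw policy outputs into an action valid for the given space."""
    if len(outputs) != output_size(space):
        raise ValueError("policy output size does not match the action space")
    if isinstance(space, Discrete):
        # Softmax weights (shifted by the max for numerical stability),
        # then a categorical draw.
        top = max(outputs)
        weights = [math.exp(x - top) for x in outputs]
        return rng.choices(range(space.n), weights=weights, k=1)[0]
    # Box: diagonal Gaussian, clipped to the bounds of the space.
    dim = len(space.low)
    means, log_stds = outputs[:dim], outputs[dim:]
    raw = [rng.gauss(mu, math.exp(ls)) for mu, ls in zip(means, log_stds)]
    return [min(max(a, lo), hi) for a, lo, hi in zip(raw, space.low, space.high)]
```

With this split, the training algorithm only ever calls `sample_action`, so supporting a new kind of action space means extending the policy side rather than the algorithm.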
As part of the code quality improvement effort, it was decided (putting @joschu in the loop) to include the hyperparameters and to document all the tips and tricks needed to reproduce state-of-the-art results, but not to include the trained models themselves. That holds at least for the first pass, until we figure out good APIs, because as it stands the different policy / model APIs make transfer learning experiments difficult.
Hello,
I think it would be very helpful to have agents that are pre-trained on different gym environments. I am working on some transfer learning examples, and having some baselines to compare against would help a lot. Does anyone have pre-trained agents to experiment with?