rail-berkeley / softlearning

Softlearning is a reinforcement learning framework for training maximum entropy policies in continuous domains. It includes the official implementation of the Soft Actor-Critic algorithm.
https://sites.google.com/view/sac-and-applications

Progress on DeepMind Control Suite? #69

Closed quanvuong closed 5 years ago

quanvuong commented 5 years ago

Hi!

I noticed that a recent commit was pushed to the repo to support running SAC on the DeepMind Control Suite.

I was wondering if the current codebase is ready to run on the DeepMind Control Suite and, if not, what else remains to be done? Maybe I could help. Thanks!

hartikainen commented 5 years ago

The control suite is actually already ready to run! Just follow the instructions in the main README and replace the universe, domain, and task with values suitable for dm_control, for example: --universe=dm_control --domain=cartpole --task=swingup. Let me know, or feel free to reopen this issue, if anything isn't working as expected.
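For reference, a full invocation might look like the sketch below. Only the --universe/--domain/--task flags come from this thread; the `run_example_local` entry point follows the README's run command, and other details may differ between versions:

```
softlearning run_example_local examples.development \
    --universe=dm_control \
    --domain=cartpole \
    --task=swingup
```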

quanvuong commented 5 years ago

The code exits with an error when I try to run the humanoid run task, because of this line: https://github.com/rail-berkeley/softlearning/blob/fb70abc6a43a0ef794c3df7e744500d58bca9416/softlearning/environments/adapters/dm_control_adapter.py#L123

The reason is that the 'head_height' entry of the observation is zero-dimensional, so calling np.concatenate fails because all the items in the tuple being concatenated must have the same number of dimensions.
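A minimal reproduction of the failure, assuming the observation dict mixes a 0-d 'head_height' with 1-d fields as in the dm_control humanoid task (the values here are made up for illustration):

```python
import numpy as np

# Simplified shape of the dm_control humanoid observation:
# 'head_height' is a 0-d array, the other entries are 1-d.
observation = {
    'head_height': np.array(1.3),   # 0-d: cannot be concatenated as-is
    'joint_angles': np.zeros(21),   # 1-d
}

# This mirrors the adapter's flattening step and raises
# "ValueError: zero-dimensional arrays cannot be concatenated".
flat = np.concatenate(list(observation.values()))
```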

hartikainen commented 5 years ago

Oh, interesting. Thanks for reporting this! I opened another issue about this case. I'll try to take a look at it at some point. Or, if you happen to find a reasonable solution in the meantime, feel free to submit a PR :)
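For anyone picking this up, one plausible fix (a sketch, not the maintainers' solution; `flatten_observation` is a hypothetical helper name) is to promote 0-d entries with np.atleast_1d before concatenating:

```python
import numpy as np

def flatten_observation(observation):
    """Flatten a dm_control observation dict into a single 1-d array.

    np.atleast_1d promotes 0-d entries such as the humanoid's
    'head_height' so that np.concatenate accepts them alongside
    the 1-d fields.
    """
    return np.concatenate([
        np.atleast_1d(np.asarray(value)).ravel()
        for value in observation.values()
    ])
```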