From Junchao
dm_control does not have a viewer for watching the training process, but it does return the rendered image pixels. I tried to re-use pyglet, which OpenAI Gym uses, to show these pixels, but it doesn't seem to work.
I posted the code below. When I call render, the images should be shown in a window. However, if I call switch_to in the render function to draw to the window's back buffer, some global state seems to change: the results become zero after the step() call in the main loop. If I remove the switch_to call, the results are correct, but the images are not shown.
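(The code referred to above is not included in this thread. For context, a minimal, hypothetical sketch of the kind of pyglet viewer being described might look like the following; the class and method names are illustrative, and pixels is assumed to be the (height, width, 3) uint8 array returned by dm_control's render.)

```python
import pyglet


class PygletViewer:
    def __init__(self, width, height):
        self.window = pyglet.window.Window(width=width, height=height)

    def render(self, pixels):
        # pixels: (height, width, 3) uint8 array from dm_control.
        # Negative pitch tells pyglet the rows run top-to-bottom.
        image = pyglet.image.ImageData(
            pixels.shape[1], pixels.shape[0], 'RGB',
            pixels.tobytes(), pitch=-pixels.shape[1] * 3)
        self.window.switch_to()  # the call that changes the GL context
        self.window.clear()
        image.blit(0, 0)
        self.window.flip()
```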
I've never used pyglet before, but I have a theory.
dm_control renders its frames in an off-screen OpenGL context. So it has a hidden OpenGL window and returns the rendered pixels from that.
pyglet also uses OpenGL. Your Window has an associated OpenGL Context.
OpenGL has the notion of a rendering context. Only one rendering context can be current on a given thread at a time. In order to render to the user, pyglet needs its context to be current. Similarly, in order to render the scene off-screen, dm_control also needs its context to be current.
When you call window.switch_to(), pyglet changes the thread's current OpenGL context to its own. If dm_control does not set its own context to current before it tries to render the next frame, then the rendering may be corrupted.
Programs which use OpenGL should always set their own context to current before rendering, but I have noticed a common bug in RL visualizers where the author forgets to do this.
I will look at the dm_control codebase to see if they have this bug.
Some suggestions to solve your problem:
Here is the relevant code in dm_control:
https://github.com/deepmind/dm_control/blob/master/dm_control/render/glfw_renderer.py#L65
https://github.com/deepmind/dm_control/blob/master/dm_control/mujoco/engine.py#L406
https://github.com/deepmind/dm_control/blob/master/dm_control/mujoco/engine.py#L412
https://github.com/deepmind/dm_control/blob/master/dm_control/mujoco/engine.py#L560
It seems to me that dm_control is probably handling the context switch properly, as long as you are using the latest version from their GitHub repository. A previous version did not handle it properly.
Are you using the latest version of dm_control from GitHub? If so, I am not sure why it's broken, but you could save the current (e.g. dm_control) context before you render with pyglet, and then restore it after you render, using pyglet.gl. This is the same thing dm_control does to avoid such bugs.
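To make that concrete, here is a minimal sketch of the save/restore idea, written against the glfw bindings that dm_control's GLFW renderer is built on (pyglet.gl could be used analogously). The wrapper function is illustrative, not an existing dm_control or pyglet API:

```python
import glfw


def render_with_context_restore(window, draw_fn):
    # Remember whichever context is current (e.g. dm_control's
    # off-screen context) before pyglet takes over.
    previous = glfw.get_current_context()
    window.switch_to()  # make the pyglet window's context current
    draw_fn()           # do the pyglet drawing
    window.flip()
    # Hand the context back so dm_control's next render is not corrupted.
    glfw.make_context_current(previous)
```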
Other notes: you should not need to access env._physics, because Environment already has a public property physics.
I used pygame instead and it works. Thanks for the help.
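(For reference, a pygame version of such a viewer might look like the sketch below; the class name is illustrative, and pixels is again assumed to be a (height, width, 3) uint8 array.)

```python
import numpy as np
import pygame


class PygameViewer:
    def __init__(self, width, height):
        pygame.init()
        self.screen = pygame.display.set_mode((width, height))

    def render(self, pixels):
        # pygame surfaces are indexed (width, height), so transpose the
        # (height, width, 3) array before wrapping it in a Surface.
        surface = pygame.surfarray.make_surface(
            np.transpose(pixels, (1, 0, 2)))
        self.screen.blit(surface, (0, 0))
        pygame.display.flip()
```

By default pygame draws through plain SDL surfaces rather than an OpenGL context of its own, which is presumably why it sidesteps the context clash described above.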
However, I ran into some trouble building the rllab project. When I import envs.base, some files seem to be missing: base.py imports cached_property, but I could not find this file or directory. Is a file missing?
Did you set up rllab using conda, as described here? It is difficult to get all the dependencies right using pip.
Sorry, my mistake. I thought it was a module of rllab. I am using conda, but this library was actually not installed. I have now installed it correctly with pip.
It seems everything is done. Should I push my code to the integration repository so you can review it? It seems that I don't have permission for this repository, though.
That is really odd. It should definitely have been installed per environment.yml. Can you file an issue with steps to reproduce?
From above:
You can find examples of how to launch rllab in examples and sandbox/rocky/tf/launchers. Note that everything must run using the run_experiment_lite wrapper.
For example, in trpo_gym_tf_cartpole.py I should be able to replace
env = TfEnv(normalize(GymEnv("CartPole-v0", force_reset=True)))
with
env = TfEnv(normalize(DmControlEnv(domain_name="cartpole", task_name="swingup")))
and see a 3D plot of the cartpole. It should also still train the cartpole to swing up :).
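Following the structure of rllab's TRPO launcher examples, the full swap might look like the sketch below. DmControlEnv and its import path are assumptions (the class this task asks for); a Gaussian policy replaces the example's categorical one because dm_control actions are continuous. Everything else is existing rllab/sandbox API:

```python
from rllab.baselines.linear_feature_baseline import LinearFeatureBaseline
from rllab.envs.normalized_env import normalize
from rllab.misc.instrument import run_experiment_lite
from sandbox.rocky.tf.algos.trpo import TRPO
from sandbox.rocky.tf.envs.base import TfEnv
from sandbox.rocky.tf.policies.gaussian_mlp_policy import GaussianMLPPolicy
# from rllab.envs.dm_control_env import DmControlEnv  # hypothetical path


def run_task(*_):
    env = TfEnv(normalize(
        DmControlEnv(domain_name="cartpole", task_name="swingup")))
    policy = GaussianMLPPolicy(name="policy", env_spec=env.spec)
    baseline = LinearFeatureBaseline(env_spec=env.spec)
    algo = TRPO(env=env, policy=policy, baseline=baseline,
                batch_size=4000, max_path_length=100, n_itr=100, plot=True)
    algo.train()


# Everything must run inside the run_experiment_lite wrapper.
run_experiment_lite(run_task, n_parallel=1, snapshot_mode="last", plot=True)
```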
To submit your code, push it to your own fork and then open a pull request against the integration branch of this repository. See https://help.github.com/articles/creating-a-pull-request/
Thanks, that is a good case to check. I also found a problem when I use the normalize class.
The normalize class uses flat_dim and shape to normalize the data. For Gym, the observation and action each have their own space: a discrete array or a Box matrix. However, for dm_control, the observation space consists of three arrays (position, velocity, rgb), and the action space is an array with min/max bounds. It seems hard to convert between these two kinds of spaces.
I am not sure how to convert these spaces so that I can reuse the normalize class. Do you have any suggestions?
Remember that the interface you are implementing is the rllab.envs.base.Env interface, not the gym.Env interface. Take a look at other classes implementing rllab.envs.base.Env. There are many of them. Most of them use spaces.Box for observation and action spaces. Take a look at how they use spaces.
rllab has no notion of labeled subspaces like dm_control does, so it's sufficient to just concatenate them into one larger space for rllab. We do not have image support yet, so you can ignore the RGB subspace for now.
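To make that concrete, here is a minimal sketch of the concatenation, assuming dm_control's observation_spec() returns an OrderedDict of array specs (it does) and skipping image-shaped entries for now. The helper names are illustrative, not existing rllab API:

```python
import numpy as np
from rllab.spaces import Box


def flat_observation_space(observation_spec):
    # Total size of all non-image observation arrays (skip anything
    # with 3 dimensions, i.e. RGB entries, for now).
    dim = sum(int(np.prod(spec.shape))
              for spec in observation_spec.values()
              if len(spec.shape) < 3)
    return Box(low=-np.inf, high=np.inf, shape=(dim,))


def flatten_observation(observation):
    # Concatenate the dict's arrays, in key order, into one flat vector.
    return np.concatenate([np.asarray(v).ravel()
                           for v in observation.values()
                           if np.asarray(v).ndim < 3])
```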
Did that answer your question?
Yes, I have checked some of the other env code. The other envs hard-code the observation space dimensions, except for Gym. I think Gym is the most similar to dm_control, so I used it as my example.
Since the observation space is used for training, I am not sure which type of data should be returned. I just want to make sure that the algorithm is correct and that the data carries no ambiguous meaning.
For now I will concatenate all the data, so that nothing needs to change later if image support is added.
But I still have a question: how do you distinguish the three types of data (position, velocity, rgb) without any shape information? I just return one large space without any structure. As you said, I can ignore the rgb data for now. However, if rllab supports image training in the future, how would you remove the rgb data from the space in order to train an old model?
Thanks.
I have opened a pull request, please take a look.
Fixed in #48
DeepMind have released a set of RL environments called dm_control. We would like to use these environments in rllab.
This task is to add dm_control to the rllab conda environment, and implement a class (similar to GymEnv, e.g. DmControlEnv) which allows any rllab algorithm to learn against dm_control environments. You will also need to implement the plot interface for dm_control, which shows the user a 3D animation of the environment. This is conceptually the same as GymEnv, which allows rllab users to import any OpenAI Gym environment and learn against them.
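A hypothetical skeleton of such a class, assuming dm_control's suite.load API and the flattening approach discussed earlier in this thread; this is a sketch of the shape of the solution, not the reviewed implementation from the pull request:

```python
import numpy as np
from dm_control import suite
from rllab.envs.base import Env, Step
from rllab.spaces import Box


class DmControlEnv(Env):
    def __init__(self, domain_name, task_name):
        self._env = suite.load(domain_name=domain_name, task_name=task_name)

    @property
    def action_space(self):
        # dm_control action specs carry explicit min/max bounds.
        spec = self._env.action_spec()
        return Box(low=np.array(np.broadcast_to(spec.minimum, spec.shape)),
                   high=np.array(np.broadcast_to(spec.maximum, spec.shape)))

    @property
    def observation_space(self):
        dim = sum(int(np.prod(s.shape))
                  for s in self._env.observation_spec().values())
        return Box(low=-np.inf, high=np.inf, shape=(dim,))

    def reset(self):
        return self._flatten(self._env.reset().observation)

    def step(self, action):
        time_step = self._env.step(action)
        return Step(observation=self._flatten(time_step.observation),
                    reward=time_step.reward,
                    done=time_step.last())

    def _flatten(self, observation):
        # dm_control observations are an OrderedDict of arrays.
        return np.concatenate(
            [np.asarray(v).ravel() for v in observation.values()])

    # (The plot/render interface is omitted from this sketch.)
```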
Consider this a professional software engineering task, and provide a high-quality solution which does not break existing users, minimizes change, and is stable. Please always use PEP8 style in your code, and format it using YAPF (with the PEP8 setting). Submit your pull request against the integration branch of this repository.
Some notes:
- You can find examples of how to launch rllab in examples and sandbox/rocky/tf/launchers. Note that everything must run using the run_experiment_lite wrapper.
- The original (Theano-based) tree lives in rllab/. The tree sandbox/rocky/tf re-implements classes from the original tree using TensorFlow, and is backwards-compatible with the Theano tree. We are working towards using only one NN library soon, but for now your implementation needs to work in both trees.