FluxML / Gym.jl

Gym environments in Julia
MIT License

Support for generic rendering #9

Closed: darsnack closed this issue 5 years ago

darsnack commented 5 years ago

Currently, the environments provide a simple state space. For example, the cart pole environment provides a position and angle. This is fine and appropriate for getting up and running, but any real environment (e.g. Atari) will not provide such precisely encoded information about the current state.

Would you be amenable to returning raw RGB arrays as a rendering "mode"? So, instead of using WebIO, I can choose to render in RGB array mode. In which case, the environment returns an RGB array representing the current game screen to me, and I can choose to plot that or not. I feel that we should develop the infrastructure for this model of rendering now with the simpler environments. This way it is ready as we move to more complex environments that require such support.

Moreover, I believe the Plots package is fairly powerful, so we should encourage users to utilize that for displaying animations. WebIO is great, but more difficult to get working in varying user setups. Plots would be more straightforward and require less upkeep and debugging for the maintainers of this repo.

tejank10 commented 5 years ago

It'd be great to have both WebIO and Plots based renderings. As you pointed out, having a mode to specify in the render function would be awesome! Also, there's some development going on around RayTracer.jl. If we are able to use this for rendering then it can make the whole pipeline differentiable.

darsnack commented 5 years ago

Yes that would be most appropriate long term for 3D environments. RGB arrays should be sufficient for 2D environments, which I can start working on.

kraftpunk97-zz commented 5 years ago

@darsnack I'm a little uncertain about Plots.jl. I went through the tutorials and I'm not entirely sure if that's a good idea, but that may just be my inexperience. I was leaning towards the use of Gtk.jl and Luxor.jl. What are your thoughts?

darsnack commented 5 years ago

Gtk.jl would be if we wanted to render windows containing environments, right? My thoughts (for now) are that displaying the environment is the least performant part of any RL program. We could add Gtk in addition to WebIO for rendering, but I want to have an option where nothing is "rendered." Instead, we simply create the RGB array that represents the view of the environment and pass that to the user. The user can then display this RGB array in a window/plot/etc. as they choose. More importantly, they can choose to only update the window/plot/etc. at whatever rate is suitable to them. This way, training can be made faster for simple 2D environments.
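To illustrate the "update at whatever rate is suitable" idea, here is a minimal sketch in plain Julia. Everything in it is hypothetical and for illustration only: `run_with_sparse_display`, the toy `step!` and `get_frame` functions, and the frame layout (height x width x 3, `UInt8`) are assumptions, not part of Gym.jl.

```julia
# Hypothetical helper: advance the environment `nsteps` times, but only build
# and draw a frame every `every` steps. Returns how many frames were drawn.
function run_with_sparse_display(step!, get_frame, draw; nsteps::Int, every::Int)
    shown = 0
    for t in 1:nsteps
        step!()
        if t % every == 0
            draw(get_frame())
            shown += 1
        end
    end
    return shown
end

# Toy environment: one bright pixel that moves one column per step and wraps.
pos = Ref(1)
step!() = (pos[] = mod1(pos[] + 1, 16))

# Build the RGB array for the current state (8 rows, 16 columns, 3 channels).
function get_frame()
    frame = zeros(UInt8, 8, 16, 3)
    frame[4, pos[], :] .= 0xff
    return frame
end

# The "display" here just collects frames; a real one might update a plot or window.
frames = Array{UInt8,3}[]
shown = run_with_sparse_display(step!, get_frame, f -> push!(frames, f);
                                nsteps = 100, every = 25)
```

With `every = 25`, only 4 of the 100 steps pay the cost of building and drawing a frame; the other steps run without any rendering work.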

For 3D environments, it isn't feasible to create a view of the environment without actually rendering it. So, we'd probably need to use something like RayTracer.jl. For windowing, Gtk.jl would be appropriate here, and I agree that Plots.jl is probably not the correct solution. I don't want to add Plots.jl code to this package. I just want to provide users with a "roll-your-own" rendering option.

Out of curiosity, what didn't you like about Plots.jl? It supports similar syntax to matplotlib, and most researchers are extremely familiar with that plotting syntax. I've toyed with the idea of adding plotting recipes to environments so users could just call plot(env) and leverage the built-in animation tools in Plots.jl. This would be in addition to the RGB array solution.

kraftpunk97-zz commented 5 years ago

Computer graphics isn't my forte, so I may be wrong about this, but I felt that the support for animations was somewhat weak. Plus, Plots.jl is a graph-plotting library, not a computer graphics library, isn't it? Seems like the wrong tool for the job.

I also wanted to make sure that I understood what you're suggesting here. Correct me if I'm wrong, but are you suggesting that we have multiple modes available, one of them being an RGB array representation of what the human would see if the render function is called?

EDIT - I have updated my comment above. Gtk.jl is for interactive graphics, which will probably be overkill for our case. I think I'm more interested in another package called Luxor.jl, which runs on Cairo.

darsnack commented 5 years ago

In any RL library, whenever render is called, a window pops up displaying a view of the environment. In the case of cart pole, for example, you see the front-facing view of the pole balanced on the cart. This window is just showing you a single frame, which is just an RGB array. Since invoking the windowing library (Gtk, WebIO, etc.) is usually much slower than the rest of the code, I suggest we have a mode optional arg to render that allows the user to choose "RGB mode." In this mode, instead of invoking a window to display the view of the environment, we simply return the RGB array to the user. The user can then display that array (or not) as they please.
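A rough sketch of what such a mode argument could look like, in plain Julia with no drawing library. The names here (`CartPoleState`, `rgb_frame`, the `display_fn` callback, and the `:rgb` / `:human` symbols) are assumptions made up for illustration; they are not Gym.jl's actual API, and the "drawing" is only a black rectangle for the cart.

```julia
# Hypothetical stand-in for a cart-pole state; not Gym.jl's actual type.
struct CartPoleState
    x::Float64      # cart position, assumed to lie in [-2.4, 2.4]
    theta::Float64  # pole angle in radians (not drawn in this simplified sketch)
end

# Rasterize the state into a height x width x 3 array of UInt8 (RGB).
function rgb_frame(s::CartPoleState; width::Int = 120, height::Int = 80)
    frame = fill(0xff, height, width, 3)          # white background
    # Map the cart position to a pixel column, clamped to the frame.
    cx = clamp(round(Int, (s.x + 2.4) / 4.8 * (width - 1)) + 1, 1, width)
    cols = max(cx - 10, 1):min(cx + 10, width)
    rows = (height - 20):(height - 10)
    frame[rows, cols, :] .= 0x00                  # black cart
    return frame
end

# `mode = :rgb` returns the array; `mode = :human` passes it to a display callback.
function render(s::CartPoleState; mode::Symbol = :human, display_fn = frame -> nothing)
    frame = rgb_frame(s)
    if mode == :rgb
        return frame
    elseif mode == :human
        display_fn(frame)   # a windowing backend (Gtk, WebIO, ...) would go here
        return nothing
    else
        throw(ArgumentError("unknown render mode: $mode"))
    end
end

frame = render(CartPoleState(0.0, 0.05); mode = :rgb)
```

In `:rgb` mode nothing is displayed; the caller gets the array back and decides whether, when, and how to show it.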

I haven't had the chance to get into the weeds of implementing this option yet. And you are right that Plots.jl isn't the right tool for the job. Looking at Luxor.jl, I think it might be a promising route. We could use it to render to an image buffer, then pass that out as an RGB array. The same code could also be leveraged to have a "windowed" mode that displays to a Quartz or Xlib window.

~Another option is Makie.jl. It might be overkill since we don't need interactivity, but with GPU support and the option to precompile, it might be more performant.~ Makie would really just be a swap-in for Plots, so I take back what I said.

gpgjoe commented 5 years ago

I was gonna open an issue, but then I found this discussion...

What is currently the status on getting any animations at all? For me, running the README code on Juno fails on this line: display(ctx.s):

UndefVarError: HandlerFunction not defined
in top-level scope at base/none
in display at base/multimedia.jl:287
in display at Atom/W03fL/src/display/showdisplay.jl:102
in displayinplotpane at Atom/W03fL/src/display/showdisplay.jl:41
in show at base/multimedia.jl:79
in show at Atom/W03fL/src/display/webio.jl:68
in setup_server at base/none 
in #WebIOServer#86 at WebIO/iI6jE/src/providers/generic_http.jl:88

darsnack commented 5 years ago

@gpgjoe This looks like an error with your WebIO setup. I can confirm that it worked for me in a Jupyter Notebook, but I haven’t tried it in Juno. The WebIO.jl GitHub page doesn’t mention any specific instructions for Juno, so I imagine it would work right out of the box.

As an aside, these are exactly the issues I’m trying to address with an RGB mode. I want a rendering mode that isn’t as hard to set up as WebIO. I wanted to stay away from opening Quartz or Xlib windows, since that is slow, but it seems that Cairo is capable of drawing directly to an image buffer sans window, which is exactly the simplicity and performance we want.

kraftpunk97-zz commented 5 years ago

@gpgjoe Yeah, I have encountered this error. It's one of the issues that I'm working on currently. Right now, instead of using display(), you can just use Blink, as demonstrated in the package Readme.md.

@darsnack Yes, Cairo is in fact the way to go. I initially expressed interest in Luxor, but after talking to the owner of that package, it became clear that, due to its lack of interactivity, Luxor is also not fit for what we want to achieve. Luckily, Luxor runs on top of Cairo, and we also have Cairo.jl, which brings Cairo to Julia, so all doors are not yet closed. Hopefully, Cairo.jl will be the answer to our problem.

kraftpunk97-zz commented 5 years ago

I think we should have Cairo-based rendering within the next week.

darsnack commented 5 years ago

Let's say a Cairo rendering framework and a working example with the cart pole. I'd like to submit the commits within the week, and we can discuss whether we are satisfied with the approach. If so, we can move to render all existing environments with Cairo.

kraftpunk97-zz commented 5 years ago

Sounds like a plan.

kraftpunk97-zz commented 5 years ago

@darsnack @tejank10 I have cooked up something for rendering the CartPole environment using Cairo. I haven't really gotten far into RL, so apart from checking the renderer against custom environment states, I tested it against the DQN implementation for solving CartPole, which is included in the examples directory of the package. Here is a little demonstration of that. A few things,

EDIT - All the code is in the rendering branch of my fork of Gym.jl. I ask as many people as possible to go through it and provide me with suggestions and pointers (especially about increasing the frame rate), so that we can finalise these basic implementation details and then move on to build the renderer for other environments.

darsnack commented 5 years ago

This is great! Can you comment with a gist of the code so we can play around with it too?

kraftpunk97-zz commented 5 years ago

Sure. I think I forgot to mention it in the post above, so I'll do it here. All of it is in the rendering branch of my fork of Gym.jl. I was hoping someone could look through it and maybe give me performance pointers, and suggest the various ways I can test it, because I don't think CI will be able to do that.

darsnack commented 5 years ago

The "sticking" of the rendering after the first episode isn't related to the rendering. If you run DQN.jl with no rendering, there will still be a pause after the first episode. Perhaps something to do with DataStructures or Flux? Anyway, it runs fast after the initial stick, so I don't think this is a performance issue to worry about.

I submitted a PR that implements what you had but within the Ctx framework that already exists within the package. Also added the RGB mode.