FluxML / Gym.jl

Gym environments in Julia
MIT License
54 stars 19 forks source link

Adding support for Cairo and RGB rendering #18

Closed darsnack closed 5 years ago

darsnack commented 5 years ago

Fixes #9

Additions:

  1. Took some of the great work by @kraftpunk97 and integrated it into the package in a way that mimics what we already have for WebIO rendering
  2. Added support for "RGB mode" rendering which uses Cairo under the hood, but returns a raw RGB array to the user
darsnack commented 5 years ago

I've tested this with Jupyter Lab

kraftpunk97-zz commented 5 years ago

@darsnack Painting in Juno's plot pane is fine, but how do you render when a person is not using Juno?

I suppose we can add another view mode, where if Juno is not running, then we can use a windowing library (I had Gtk.jl in mind) to call a window and render into that.

darsnack commented 5 years ago

The way I designed it is that no matter what mode you use, the user calls display(render(env, ctx)) and it shows up. In a generic user environment (let’s assume they run from the terminal), could we had a helper function that sets up Gtk so that display paints to that window?

kraftpunk97-zz commented 5 years ago

@darsnack That is what I had in mind. Gtk is a GUI toolkit. The way I imagined it, maybe there's a way a Julia session can detect if it is running in Juno or via terminal. We can then devise a smart system that automatically chooses if it should display in the plot pane, or in a Gtk-created window. The user can also be explicit about whether they want to render in Juno's plot pane or if they want a Gtk window, if they are already using Juno.

One more thing, when I was looking for Cairo manuals, I came across this article. It seems that setting up animations with Gtk may be slightly more complicated than what we've achieved here and may also require some multithreading. What do you think?

darsnack commented 5 years ago

Going directly to ccall and following that article might get messy with multithreading.

This is what Gtk.jl has in terms of multithreading.

Automatic rendering to windows should be a separate issue and PR. I think it is more important to apply what we have now to other environments and to introduce more environments to the package. These issues are a greater stopping point for the package’s usefulness than automated windowed rendering.

kraftpunk97-zz commented 5 years ago

True true. Can't have us worrying about the color of the paint, before we've started building the wall.

darsnack commented 5 years ago

How do I update the Project.toml to reflect the addition of Cairo as a dependency? I assume we don't manually write in the commit hash.

kraftpunk97-zz commented 5 years ago

When I included the Spaces, I just updated it manually.

darsnack commented 5 years ago

Updated to reflect new project structure. Dependencies that I added are now in Project.toml.

I suggest we merge this PR, then issue separate ones to create rendering for the other environments. Then we can delegate the work for different environments to different contributors (and get the results faster).

kraftpunk97-zz commented 5 years ago

Hey @darsnack. I am working on getting the windowed rendering up and running using Gtk.jl. Between this, a few other features I'm working on, and preparing for my upcoming tests, I'm spread a little too thin. I'll be away for a week, and getting the functionality running should not take more than 3 days after that. In favor of shipping a complete feature at a time, I was hoping we could keep this PR in limbo for some time and when the changes have been made, then we ship it? What do you think? Or if you've had the chance to experiment with Gtk.jl, you can implement the feature yourself?

darsnack commented 5 years ago

That's fine. I'll start by porting the rendering to other environments. Then I'll take a look at Gtk.jl. We can ship later if I don't get it implemented by then.

I am also working on a couple different research deadlines right now, so this isn't my priority.

kraftpunk97-zz commented 5 years ago

Hey @darsnack. As promised, I've added the functionality for rendering in a GUI window for the CartPole Environment. All changes are present in the rendering branch of my fork of the repository, and I ask you to check it out whenever convenient. Here is a demo.. A few things of note...

If this is to your satisfaction, then please incorporate it into your PR and we should be good to add more of the classic control environments. Let's get this done!

darsnack commented 5 years ago

I incorporated your changes into the PR. Thanks for doing this work.

Couple notes:

kraftpunk97-zz commented 5 years ago

Nice. Although, do we really need WebIO at this point? Gtk is far superior and way better. If @tejank10 approves, I think we can phase out the use of WebIO in this package.

darsnack commented 5 years ago

Yeah I see no point to WebIO either. It would clean up the package too (no more assets).

darsnack commented 5 years ago

Added rendering for PendulumEnv as well.

I found a bug with the step! logic for Pendulum, which I documented in #25.

kraftpunk97-zz commented 5 years ago

~9q¹1~

Please ignore. This message is a mistype.

darsnack commented 5 years ago

I don’t think I know what that means

darsnack commented 5 years ago

@tejank10 any thoughts on this PR?

kraftpunk97-zz commented 5 years ago

Hey @darsnack , sorry for the really late response; I am in the middle of my exams and had to look after other things. I just had the opportunity to look at your work and try it out. Just a few suggestions...

Thanks for the great work!

darsnack commented 5 years ago

I agree with everything you said, but I am going to push back on not creating a Ctx object. Not all environments require rendering, and I don't think the env itself should carry around the rendering objects. This also gives us the flexibility with environments that do require rendering as part of step! to use whatever context we want under the hood. The user can separately instantiate their own context if they wish to display the environment.

With respect to drawcanvas going inside the environment file, my personal preference is to keep it separate. include("vis/cartpole.jl") in cleaner to me in terms of code organization.

kraftpunk97-zz commented 5 years ago

I see your point. What if we set the context as a part of the environment itself, and then define a utility function, something like rendermode!(env::AbstractEnv, mode::Symbol) to change context, which can be a ctx field in the definition of the environment? Can that serve as middle ground?

kraftpunk97-zz commented 5 years ago

By default, the ctx can be set to, say, :nothing or :human-pane and then if we want we can change the context by calling the utility func. If an environment doesn't require rendering, it simply won't have a ctx field

darsnack commented 5 years ago

I am thinking something like that if the env requires rendering. But I don't see the need to pass around a data structure that isn't needed for most functions related to an environment. If I need to call rendermode! to choose my context, then isn't that tantamount to calling Ctx? Either way I need a line of code to set my context.

Maybe a better middle ground is to change make so that it accepts the env name and a rendering mode. It returns the env and the ctx, and we can make the rendering mode an optional parameter?

kraftpunk97-zz commented 5 years ago

Yes. I can agree to that. The EnvWrapper can hold the ctx object. That is fine by me.

kraftpunk97-zz commented 5 years ago

I do have one question, why do you feel that the environment shouldn't carry the context object?

darsnack commented 5 years ago

The context object is only related to rendering in my view. Running the environment and rendering it are two orthogonal problems to me.

Take for example, the CartPole. On a headless server, I should be able to run my code without ever needing to render the environment. But if the code tries to make a Gtk window set as the context object, then my program crashes because I have no display. So, we could make the default context something that doesn't crash (e.g. Cairo/RGB), but what if changes to Cairo depend on a display and result in crashes in the future. Setting a default context that works is a bandaid; the real issue is that rendering and simulating the environment should be separated unless they need to be merged (e.g. Atari).

I come from a functional background, so this is also a question of codebase design. A contributor should be able to make changes to render! without breaking step!. But if both use the same env object that contains my context, then that invites a contributor to use fields in the data structure meant for rendering to do the simulation. Now, it's more likely that if I change something related to rendering, the whole environment breaks. This puts more burden on the test code to identify all these corner cases of overlap. I could have just avoided the issue by keeping the rendering functions and data structures separate from the simulation functions and data structures.

Personally, I wouldn't have the context within the EnvWrapper either. Keeping it in the wrapper isolates possible code issues like the one above, so I think it is a good compromise. But I think graphics is one of those things that never "just works." Forcing people to call Ctx explicitly makes them think about whether they have the correct dependencies installed for rendering. This is a subjective reason of course, so that's why I didn't push it too hard 😄.

kraftpunk97-zz commented 5 years ago

I understand. I'll admit I am a little biased to the idea of the context being with the environment definition, because that is how they've done it in the python gym and we wanted to make the interface as close to theirs. It also made sense for everything that pertains to an environment to be together in one file, so that the user isn't jumping from one place to another when tracking the flow of control. But your reasoning makes sense as well, and I'm okay with it being either way. Let's have @tejank10 review this PR bring rendering to Gym.jl!

darsnack commented 5 years ago

I think my latest changes bring the package closer to the OpenAI Gym Python interface (e.g. render! no longer requires a context to be passed in), but it does this by isolating the context to the EnvWrapper. I think we get the best of both worlds this way.

With respect to everything being in the same file, I have no strong preference and will implement whatever is the final consensus.

tejank10 commented 5 years ago

Awesome work @darsnack! I am fine with your decision of having Ctx in the EnvWrapper. Let environment deal with only controlling the state. EnvWrapper is made to track everything else pertaining to the environment, at the same not losing the OpenAI Gym interface.

tejank10 commented 5 years ago

Thanks for great work @darsnack and @kraftpunk97 for valuable inputs!

kraftpunk97-zz commented 5 years ago

Running the environment and rendering it are two orthogonal problems to me. Take for example, the CartPole. On a headless server, I should be able to run my code without ever needing to render the environment. But if the code tries to make a Gtk window set as the context object, then my program crashes because I have no display.

I just got to experience first hand what you meant by this, @darsnack. I am prepping my first report for the JSoC project right now, and I am considering adding a demo. I am preparing the report as a nextjournal.com article, which is very similar to a Jupyter notebook. When I was writing a short example that uses this package, I could install the package and its dependencies, but when it was time to execute using Gym, I couldn't do so because there's no output device and Gtk.jl is refusing connection. And I remembered us having this talk. I understand what you meant back then. Right now, I'm going to create a new branch that doesn't use the Gtk.jl dependency and run the examples off of that, but it does bring a design flaw into light , so thanks for the explanation you gave back then. You totally called it, and I will try to fix it and keep it in mind moving forward...

darsnack commented 5 years ago

Yeah, one simple solution is to add to the docs that you can use xvfb-run julia [1] to get a virtual display. Haven’t tested this though, and I don’t know if it is cross-platform.

[1] https://linux.die.net/man/1/xvfb

darsnack commented 5 years ago

Though I guess it isn't an issue if it is Linux-only. There are no headless macOS environments, and I don't know about Windows, but I suspect not.