google / dopamine

Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.
https://github.com/google/dopamine
Apache License 2.0

Dopamine duplicates Gym functionality #82

Open jarlva opened 5 years ago

jarlva commented 5 years ago

I spent a lot of time trying to understand the colab CartPole Gym example so I could apply it to a custom discrete Gym environment, one that is similar to the CartPole environment and works fine with a Keras RL agent. I noticed that Dopamine uses gym_lib.py in addition to the actual Gym environment. For example, gym_lib.py contains variables already defined in the Gym CartPole environment file, such as:

dopamine/discrete_domains/gym_lib.py:

CARTPOLE_MIN_VALS = np.array([-2.4, -5., -math.pi/12., -math.pi*2.])
CARTPOLE_MAX_VALS = np.array([2.4, 5., math.pi/12., math.pi*2.])
gin.constant('gym_lib.CARTPOLE_OBSERVATION_SHAPE', (4, 1))
gin.constant('gym_lib.CARTPOLE_OBSERVATION_DTYPE', tf.float32)

gym/envs/classic_control/cartpole.py:

self.x_threshold = 2.4
self.observation_space = spaces.Box(-high, high, dtype=np.float32)

This is confusing. OpenAI Gym already encompasses all the code necessary to create a complete environment object, with all the necessary plumbing, functions, variables, etc. This approach is difficult to understand and build on. Would it be possible to make a Gym environment a drop-in for Dopamine? This would greatly simplify and speed up Dopamine adoption.
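
To make this concrete, here is a minimal sketch, assuming a standard CartPole-v0 install (the variable names are mine, not Dopamine's), of how most of this metadata is already exposed by the environment itself:

import gym

env = gym.make('CartPole-v0')

obs_shape = env.observation_space.shape  # (4,); gym_lib's gin constant is (4, 1)
obs_dtype = env.observation_space.dtype  # float32
num_actions = env.action_space.n         # 2

# The Box bounds are also available, although gym_lib's
# CARTPOLE_MIN_VALS/MAX_VALS are hand-tuned normalization bounds,
# not a straight copy of observation_space.low/high:
low, high = env.observation_space.low, env.observation_space.high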

psc-g commented 5 years ago

thanks for your feedback! when we originally built dopamine for atari, we wrote the AtariPreprocessing class to have everything related to atari preprocessing contained in one place. there are a few preprocessing steps that are standard when training atari agents (see https://github.com/google/dopamine/blob/master/dopamine/discrete_domains/atari_lib.py#L240)

for consistency with that we created GymPreprocessing. we'll see if we can simplify the interface with non-atari gym environments so that it's easier to use.
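
for reference, a minimal sketch of what a pass-through wrapper in this spirit could look like (an illustration using the classic 4-tuple gym step API, not dopamine's actual GymPreprocessing implementation):

class MinimalGymWrapper(object):
  """Illustrative only: forwards the standard gym calls unchanged."""

  def __init__(self, environment):
    self.environment = environment
    self.game_over = False

  @property
  def observation_space(self):
    return self.environment.observation_space

  @property
  def action_space(self):
    return self.environment.action_space

  def reset(self):
    return self.environment.reset()

  def step(self, action):
    # gym's classic step contract: (observation, reward, done, info).
    observation, reward, done, info = self.environment.step(action)
    self.game_over = done
    return observation, reward, done, info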

jarlva commented 5 years ago

That would be very helpful. Thanks!

jarlva commented 5 years ago

Hello, is there a plan to remove the duplicate settings to make Dopamine more streamlined and easier to use with Gym?

fbbfnc commented 5 years ago

I'm doing the same thing as @jheffez and I find it difficult to adapt the code with all these duplicate settings. I've successfully adapted my custom discrete Gym environment to the RLlib library and it works fine; now I'm trying to do the same for Dopamine, struggling a little bit.

mgbellemare commented 5 years ago

Echoing @psc-g's earlier comment, I agree GymPreprocessing needs some streamlining. This is on our plate, but if you have a solution ready for GymPreprocessing in particular, a PR is also welcome.

jarlva commented 5 years ago

@mgbellemare, thanks for the update! Ideally, GymPreprocessing will make Dopamine a drop-in for Gym.

Can you please share your roadmap/plans with us? For example, adding the latest RL techniques (like SimPLe), and whether Dopamine and tensorflow/agents can benefit from synergy?

psc-g commented 5 years ago

hi, i've started looking into this. a few points:

  1. we could inherit things like min_vals and max_vals from gym to avoid the redundancy. is that mostly what you're after?
  2. the observation shapes and dtypes are a little trickier, as they need to be defined as gin constants so that we can inject them via the respective gin config files. i think there's a way to get this to fetch the values from gym, but it would add indirection, potentially at the risk of clarity (see the sketch after this list).
  3. with regards to roadmap/plans, we have lots we'd like to do, but limited time :)
  4. we are in discussions with tensorflow/agents to see how dopamine and tf/agents could be more compatible, stay tuned!
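
a rough illustration of point 2 (a hypothetical helper, not actual dopamine code): the constants could in principle be fetched from the env at registration time, at the cost of that extra indirection:

import gin
import gym

def register_gym_constants(env_name, prefix):
  # hypothetical helper: derive the gin constants from the env itself.
  env = gym.make(env_name)
  gin.constant(prefix + '_OBSERVATION_SHAPE', env.observation_space.shape)
  # note: the env reports a numpy dtype; dopamine's configs use tf.float32.
  gin.constant(prefix + '_OBSERVATION_DTYPE', env.observation_space.dtype)
  env.close()

# the raw env reports shape (4,), while dopamine's cartpole config uses
# (4, 1), so a reshape convention would still have to be chosen.
register_gym_constants('CartPole-v0', 'gym_lib.CARTPOLE')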

iteachmachines commented 5 years ago

Good luck! I am always with Dopamine and will try my best to contribute someday too.

jarlva commented 5 years ago

Hi @psc-g, regarding 1 and 2: OpenAI Gym exposes all of this. For example, the action space is easily accessible via env.env.action_space. To get the number of actions:

import gym
from gym import spaces
from gym.utils import seeding

action_dim = env.env.action_space.n

To get the state size: state_dim = env.get_state_size(). Take a look at this for more examples. So it would be beneficial, and simple, for Dopamine to grab all that good stuff directly from Gym.
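
Note that get_state_size appears to come from the linked examples rather than the core Gym API; on a plain Gym env with a Box observation space the equivalent would be something like:

state_dim = env.observation_space.shape[0]  # e.g. 4 for CartPole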

  3. Do share...
  4. Great! Synergies are a good thing!