ray-project / ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[RLlib] Include rainbow DQN example code #7035

Closed jkterry1 closed 4 years ago

jkterry1 commented 4 years ago

The documentation mentions that Rainbow DQN can be run in RLlib, though not all of its settings are on by default. However, nowhere does it say which settings to change, and it provides no examples. This is a problem.

@ericl If you can tell me what to do, I'll submit a PR for this this weekend after ICML

sven1977 commented 4 years ago

You basically set:

- dueling: True (Q-learning with the dueling layer)
- double_q: True (double Q-loss function)
- n_step: [some int > 1 and << 10] (n-step bootstrapping)
- batch_mode: "complete_episodes" (must be set to this when running with parameter noise)
- prioritized_replay: True (run with a prioritized replay buffer instead of a regular uniform buffer)
- num_atoms: [>1] (switches on distributional Q-outputs rather than a single Q-value per action)
- v_min: -10.0 and v_max: 10.0 (set these according to your expected returns; the range between them is split into num_atoms discrete bins)
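
To make the role of v_min, v_max and num_atoms concrete, here is a minimal sketch of the evenly spaced support used in the standard distributional (C51-style) formulation. The helper name `atom_support` is hypothetical and is not part of RLlib; it only illustrates how the return range is discretized.

```python
def atom_support(v_min, v_max, num_atoms):
    """Return num_atoms evenly spaced return values covering [v_min, v_max].

    Hypothetical helper for illustration only; not an RLlib function.
    """
    if num_atoms < 2:
        raise ValueError("distributional Q-learning needs num_atoms > 1")
    # Distance between neighbouring atoms.
    step = (v_max - v_min) / (num_atoms - 1)
    return [v_min + i * step for i in range(num_atoms)]


# 51 atoms over [-10, 10], the values quoted in this thread plus the
# atom count commonly used in the distributional RL literature.
support = atom_support(-10.0, 10.0, 51)
print(len(support), support[0], support[-1])
```

If the returns in your environment fall well outside [v_min, v_max], the predicted distribution gets clipped at the ends of this support, which is why the range should follow your expected returns.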

Alternatives:

- noisy: True (switches on noisy/stochastic layers) <- rainbow paper
- OR parameter_noise: True (adds parameter noise for better exploration) <- https://openai.com/blog/better-exploration-with-parameter-noise/
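
Put together, the settings above could be collected into a config dict along these lines. This is a sketch under assumptions, not the official recipe: the function name `rainbow_dqn_config` and its default values (n_step=3, num_atoms=51) are made up for illustration, and the double-Q switch is written as `double_q`, which I believe is the key name in RLlib's DQN config. Check the keys against the DQN defaults of your Ray version.

```python
def rainbow_dqn_config(n_step=3, num_atoms=51, v_min=-10.0, v_max=10.0,
                       use_parameter_noise=False):
    """Build a dict of DQN overrides that switch on the Rainbow components.

    Hypothetical helper; the default numbers are illustrative assumptions.
    """
    config = {
        "dueling": True,             # dueling Q-network head
        "double_q": True,            # double Q-loss (assumed key name)
        "n_step": n_step,            # n-step bootstrapping
        "prioritized_replay": True,  # prioritized instead of uniform replay
        "num_atoms": num_atoms,      # > 1 enables distributional Q-outputs
        "v_min": v_min,              # lower bound of expected returns
        "v_max": v_max,              # upper bound of expected returns
    }
    if use_parameter_noise:
        # Alternative exploration scheme; needs complete episodes per batch.
        config["parameter_noise"] = True
        config["batch_mode"] = "complete_episodes"
    else:
        # Noisy layers, as in the Rainbow paper.
        config["noisy"] = True
    return config


config = rainbow_dqn_config()
print(config)

# Assumed usage with the Tune API of that era (not executed here):
#   from ray import tune
#   tune.run("DQN", config={"env": "CartPole-v0", **config})
```

The two exploration options are kept mutually exclusive here because they were suggested as alternatives; nothing in the thread says whether combining them is supported.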

Will add this to the docs.

jkterry1 commented 4 years ago

So parameterizing the action space isn't required for any of that, right? You can just add these settings for any game?

On Tue, Feb 4, 2020 at 11:53 AM Sven Mika notifications@github.com wrote:

Closed #7035 https://github.com/ray-project/ray/issues/7035.


-- Thank you for your time, Justin Terry