google / dopamine

Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.
https://github.com/google/dopamine
Apache License 2.0
10.42k stars 1.36k forks

Will you provide QR-DQN code? #144

Closed GoingMyWay closed 3 years ago

GoingMyWay commented 3 years ago

Will you provide QR-DQN code?

psc-g commented 3 years ago

planning on it, stay tuned! :)

GoingMyWay commented 3 years ago

Great. Can I ask how you tune and find the optimal hyperparameters? By grid search?

psc-g commented 3 years ago

for the configs we've been releasing with dopamine we use the published settings. however, for DQN and C51 we used some of the settings from Rainbow when it was published (although we do provide configs with the settings used when each of those agents was published).

GoingMyWay commented 3 years ago

Thanks.

RylanSchaeffer commented 3 years ago

@GoingMyWay how'd you decide on Dopamine for distributional RL? What are your thoughts on DeepMind's acme?

GoingMyWay commented 3 years ago

@GoingMyWay how'd you decide on Dopamine for distributional RL? What are your thoughts on DeepMind's acme?

Hi, Dopamine already implements C51, IQN, and Rainbow, and it is stable now, so you can easily implement other Q-based distributional RL algorithms on top of it. As for ACME, I have not used it; it is very new, so it may not be stable yet and may have some undiscovered bugs.

There are also many mature frameworks similar to ACME; for example, Ray (ray.io) also provides RL APIs for fast RL algorithm implementation.

psc-g commented 3 years ago

coming back to the original question on this thread, we now have a JAX implementation of QR-DQN: https://github.com/google/dopamine/blob/master/dopamine/jax/agents/quantile/quantile_agent.py
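
For context, the core of QR-DQN is the quantile Huber loss. Below is a minimal NumPy sketch of that loss for a single transition. It is not taken from the linked quantile_agent.py; the function name, the argument names, and the choice to average over targets and sum over quantiles are assumptions made for illustration.

```python
import numpy as np


def quantile_huber_loss(pred_quantiles, target_quantiles, kappa=1.0):
    """Quantile Huber loss for one transition (hypothetical helper).

    pred_quantiles:   shape (N,), quantile estimates for the taken action.
    target_quantiles: shape (M,), Bellman targets built from next-state quantiles.
    kappa:            Huber threshold (must be > 0 in this sketch).
    """
    pred = np.asarray(pred_quantiles, dtype=np.float64)
    target = np.asarray(target_quantiles, dtype=np.float64)
    n = pred.shape[0]

    # Quantile midpoints tau_hat_i = (2i + 1) / (2N), for i = 0..N-1.
    tau_hat = (np.arange(n) + 0.5) / n

    # Pairwise TD errors u[i, j] = target[j] - pred[i].
    u = target[None, :] - pred[:, None]
    abs_u = np.abs(u)

    # Huber loss: quadratic inside [-kappa, kappa], linear outside.
    huber = np.where(abs_u <= kappa, 0.5 * u**2, kappa * (abs_u - 0.5 * kappa))

    # Asymmetric weight |tau_hat_i - 1{u < 0}|.
    weight = np.abs(tau_hat[:, None] - (u < 0.0).astype(np.float64))

    # Assumed reduction: mean over targets (axis 1), sum over quantiles.
    return float(np.sum(np.mean(weight * huber / kappa, axis=1)))
```

With a single quantile, `quantile_huber_loss([0.0], [1.0])` weights a Huber loss of 0.5 by 0.5, giving 0.25, and identical prediction and target give 0.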

GoingMyWay commented 3 years ago

Great. @RylanSchaeffer, you can try Dopamine.