google-deepmind / dqn_zoo

DQN Zoo is a collection of reference implementations of reinforcement learning agents developed at DeepMind based on the Deep Q-Network (DQN) agent.

Expectile Regression Implementation? #1

Closed: RylanSchaeffer closed this issue 4 years ago

RylanSchaeffer commented 4 years ago

The README doesn't mention an implementation of expectile regression (Statistics and Samples in Distributional Reinforcement Learning - http://proceedings.mlr.press/v97/rowland19a). Is one in the works, or planned for the near future?

GeorgOstrovski commented 4 years ago

Hi Rylan, thanks for your interest!

With this release we did not aim for full coverage of the space of DQN-based agents - many more variants published over the years could have been included here. Rather, we wanted to provide a set of classical agents representative of this space.

By including a variety of agents with features such as dueling networks, prioritized replay, distributional RL, and noisy networks, we were hoping to cover this space densely enough that other variants (like the above-mentioned expectile regression) could be developed reasonably easily by users themselves, by forking and modifying one of the existing agents.

Being selective and keeping the set relatively compact allowed us to carefully evaluate all included agents and provide reliable and complete evaluation data on 57 Atari games x 5 seeds, which was one of our main goals.

RylanSchaeffer commented 4 years ago

Ok thanks for letting me know!

RylanSchaeffer commented 4 years ago

@GeorgOstrovski are you open to PRs? I imagine Expectile-Regression-Naive would be quite simple to create by modifying the existing quantile regression implementation.
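
To make that concrete, here is a rough, untested sketch of the asymmetric-L2 expectile loss that would replace the quantile regression loss; the function and argument names are my own placeholders, not this repo's API:

```python
import jax.numpy as jnp


def expectile_regression_loss(dist_src, tau_src, dist_target):
  """Expectile (asymmetric L2) regression loss, sketched as a drop-in
  analogue of the quantile regression loss used by QR-DQN-style agents.

  Args:
    dist_src: predicted statistics for the chosen action, shape [K].
    tau_src: expectile levels in (0, 1), one per statistic, shape [K].
    dist_target: bootstrapped target values treated as samples, shape [M].

  Returns:
    Scalar loss.
  """
  # Pairwise TD errors: delta[i, j] = target_j - prediction_i.
  delta = dist_target[None, :] - dist_src[:, None]
  # Asymmetric squared-error weight |tau_i - 1{delta < 0}|.
  weight = jnp.abs(tau_src[:, None] - (delta < 0.).astype(jnp.float32))
  # Aggregation (mean here) is a choice; QR-DQN sums over statistics and
  # averages over target samples.
  return jnp.mean(weight * jnp.square(delta))
```

Since the 0.5-expectile of a distribution equals its mean, greedy-action selection could read that statistic off directly.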

jqdm commented 4 years ago

We're not open to PRs right now; see this part of the README.md for more detail.

RylanSchaeffer commented 4 years ago

In that case, can I leave this issue open? I'd like an expectile regression implementation, and the contribution.md says to open an issue if one is keen on a particular change. Many of the expectile regression authors are at DeepMind, and the algorithm uses the DQN torso on the Atari games, so it fits nicely within this class of DQN-based discrete-control agents. I also don't think it would be too much work to adapt from QR-DQN; a rough sketch of the target construction follows below.
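
For the naive variant specifically, the target could look something like the sketch below (again untested pseudocode with placeholder names and shapes, not this repo's API); it treats the target network's expectile estimates directly as bootstrap samples, skipping the imputation step from the paper:

```python
import jax.numpy as jnp


def naive_expectile_td_target(r_t, discount_t, dist_q_target_t, tau):
  """Naive expectile-regression target for a single transition.

  Args:
    r_t: scalar reward.
    discount_t: scalar discount (zero at episode termination).
    dist_q_target_t: target-network statistics, shape [K, num_actions].
    tau: expectile levels, shape [K], assumed to include a value near 0.5.

  Returns:
    Target values treated as samples, shape [K].
  """
  # Greedy action from the statistic closest to the 0.5-expectile,
  # which coincides with the mean return.
  mid = jnp.argmin(jnp.abs(tau - 0.5))
  a_star = jnp.argmax(dist_q_target_t[mid])
  # Bellman backup applied element-wise to each statistic.
  return r_t + discount_t * dist_q_target_t[:, a_star]
```

The online network's statistics for the taken action would then be regressed against these targets with an expectile loss like the one I sketched above, so, at least on paper, the update differs from QR-DQN mainly in the loss and the choice of statistics.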

jqdm commented 4 years ago

We're grateful for these suggestions and will consider them on a case-by-case basis. In this instance it is an explicit non-goal to incorporate as many DQN variants as we can. As Georg already explained above, being selective means that we can provide evaluation data on all 57 games. While it may not be much work to create an Expectile Regression DQN, the compute budget for a thorough evaluation is substantial.