Kaixhin / Atari

Persistent advantage learning dueling double DQN for the Arcade Learning Environment
MIT License

Fix bootstrapped DQN #7

Open Kaixhin opened 8 years ago

Kaixhin commented 8 years ago

The test on Beam Rider is failing badly, and does not look promising.

iassael commented 8 years ago

Hi @Kaixhin, did you try using my layer?

Kaixhin commented 8 years ago

@iassael A couple of questions about your layer. Can it use more complicated heads (like the dueling head)? How does it handle picking a new head for each new episode vs. taking the mode across heads in ensemble mode (during evaluation)? Is it possible to train with the "full" version of the bootstrap, where each head requires a separate experience replay memory?

iassael commented 8 years ago

Hey @Kaixhin, currently nope. For the former, we could pass the module as a parameter, and for the latter it should be super easy to extend it with an extra parameter for the episode id.
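
For readers following the thread, the behaviours asked about above can be sketched roughly as follows. This is not code from this repository or from @iassael's layer; it is a minimal pure-Python illustration, and every function name in it (`sample_head`, `train_action`, `ensemble_action`, `bootstrap_mask`) is hypothetical. It assumes each head outputs a plain list of Q-values. The per-transition Bernoulli mask is one common way to approximate separate per-head replay memories with a single shared memory, and is an assumption here rather than something the thread settles.

```python
import random
from collections import Counter


def greedy_action(q_values):
    """Return the index of the largest Q-value (ties go to the lowest index)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def sample_head(num_heads, rng):
    """Training: pick one head uniformly at random at the start of an episode."""
    return rng.randrange(num_heads)


def train_action(head_q_values, active_head):
    """Training: act greedily using only the head chosen for this episode."""
    return greedy_action(head_q_values[active_head])


def ensemble_action(head_q_values):
    """Evaluation: each head votes for its greedy action; return the mode."""
    votes = Counter(greedy_action(q) for q in head_q_values)
    # most_common orders by count; equal counts keep first-seen order.
    return votes.most_common(1)[0][0]


def bootstrap_mask(num_heads, p, rng):
    """Draw one Bernoulli(p) flag per head for a stored transition.

    Head k would train on the transition only if mask[k] == 1, which stands in
    for giving every head its own replay memory (hypothetical scheme).
    """
    return [1 if rng.random() < p else 0 for _ in range(num_heads)]
```

With three heads whose Q-values are `[0.1, 0.9, 0.0]`, `[0.2, 0.8, 0.1]` and `[0.7, 0.1, 0.2]`, two heads prefer action 1, so `ensemble_action` picks action 1, while `train_action` with the third head active picks action 0.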

Kaixhin commented 8 years ago

@iassael I'm focusing on some of the other components at the moment so I'm not sure I'll get to this any time soon, but feel free to give it a shot if you can.

iassael commented 8 years ago

@Kaixhin I'll keep you posted, and thanks for the awesome work. Cheers~