dgriff777 / rl_a3c_pytorch

A3C LSTM Atari with Pytorch plus A3G design
Apache License 2.0
563 stars, 119 forks

cnn layers #9

Closed: 404akhan closed this issue 7 years ago

404akhan commented 7 years ago

In your model you have 4 CNN layers and max pooling. 1) DQN (2015) used only 3 CNN layers without pooling. 2) A3C (2016) used only 2 CNN layers without pooling.

Questions: 1) Don't you think pooling loses spatial information about the RL scene, which IMO is important? Why did you decide to use pooling instead of increasing the stride to 2? 2) Why did you decide to use 4 CNN layers? Are the gym v0 environments possibly harder? 3) Any particular reason for such a specific weight init (final actor/critic linear weights)?

thanks.

dgriff777 commented 7 years ago

As I see it, since the agent has to learn both how to see and how to play the game, I wanted to make sure it had more than enough feature detectors for the images. The pooling was to reduce the overall size of the model, and it has proven very effective in vision models. It wasn't so much because v0 is harder; I just thought the original CNN layers were insufficient and wanted to see how much the agent could learn if the model had better tools to decipher the scene.

The weight init is from the universe-starter-agent, borrowed via the PyTorch implementation of it in the pytorch-a3c repo. It seemed sufficient, so I saw no need to change it.
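For readers unfamiliar with that init, here is a minimal sketch of the idea as I understand it: draw Gaussian weights, then rescale each output unit's weight vector to a fixed L2 norm. It uses only the standard library so it runs without PyTorch. The function name `normalized_rows`, the layer sizes, and the scale values 0.01 and 1.0 are illustrative assumptions, not values taken from this thread.

```python
import math
import random


def normalized_rows(out_features, in_features, std=1.0, rng=None):
    """Draw Gaussian weights, then rescale each output unit's
    weight vector (one row here) to have L2 norm equal to `std`."""
    rng = rng or random.Random()
    weights = []
    for _ in range(out_features):
        row = [rng.gauss(0.0, 1.0) for _ in range(in_features)]
        norm = math.sqrt(sum(w * w for w in row))
        weights.append([w * std / norm for w in row])
    return weights


def row_norm(row):
    """L2 norm of a single weight vector."""
    return math.sqrt(sum(w * w for w in row))


# Hypothetical sizes and scales, chosen only for illustration:
# a small norm for the actor head keeps initial logits close together,
# a larger one for the critic head lets value estimates move freely.
rng = random.Random(0)
actor_w = normalized_rows(6, 512, std=0.01, rng=rng)
critic_w = normalized_rows(1, 512, std=1.0, rng=rng)
```

With a small actor scale the initial policy is close to uniform over actions, which is one plausible reason to initialize the two heads differently.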

404akhan commented 7 years ago

Thanks for the reply. 1) Why not use stride=2 to reduce the size of the model (the pytorch-a3c repo did so) instead of pooling? Max pooling loses the spatial relation of where part of an object came from, which I think is important in an RL scene, though I haven't yet tested its influence. 2) I also saw that in environment.py you use the class NormalizedEnv. Do you also think that using batchnorm layers would help after each CNN layer and possibly after the FC layer? I will try it myself as well.
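To make the spatial-information point concrete, here is a toy sketch in plain Python (no PyTorch needed). Two images differ only in where a single bright pixel sits inside the same 2x2 window. Max pooling maps both to the same output, while a 2x2 convolution with stride 2 and unequal weights gives different outputs. The helper names and kernel values are made up for illustration. In a real network pooling follows multi-channel convolutions, so some position information can survive in the channels.

```python
def max_pool_2x2(img):
    """2x2 max pooling with stride 2 over a list-of-lists image."""
    h, w = len(img), len(img[0])
    return [[max(img[r][c], img[r][c + 1], img[r + 1][c], img[r + 1][c + 1])
             for c in range(0, w - 1, 2)]
            for r in range(0, h - 1, 2)]


def conv_2x2_stride2(img, kernel):
    """Single-channel 2x2 convolution with stride 2 and no padding."""
    h, w = len(img), len(img[0])
    return [[sum(kernel[i][j] * img[r + i][c + j]
                 for i in range(2) for j in range(2))
             for c in range(0, w - 1, 2)]
            for r in range(0, h - 1, 2)]


# Same bright pixel, two positions inside the top-left 2x2 window.
img_a = [[1, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
img_b = [[0, 0, 0, 0],
         [0, 1, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]

kernel = [[1, 2], [3, 4]]  # arbitrary illustrative weights

pooled_a, pooled_b = max_pool_2x2(img_a), max_pool_2x2(img_b)
conv_a, conv_b = conv_2x2_stride2(img_a, kernel), conv_2x2_stride2(img_b, kernel)

print(pooled_a == pooled_b)  # True: pooling cannot tell the two apart
print(conv_a == conv_b)      # False: the strided conv can
```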

dgriff777 commented 7 years ago

  1. It's hard to really compare the use of CNN layers with pytorch-a3c, as we don't have the same input. The image here is 80x80, whereas pytorch-a3c uses 42x42. But I thought using stride 2 would be more ambiguous for the model. I did try some configurations with stride 2 and didn't find them superior.
  2. I'm not sure. I mean, it could speed things up, but overall end performance wouldn't improve from doing that.
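
The input-size difference can be checked with the standard output-size formula for convolution and pooling layers, `floor((n + 2*padding - kernel) / stride) + 1`. The sketch below is plain Python. The layer stacks are assumptions for illustration (four 3x3 convolutions with stride 2 and padding 1, and a hypothetical stack of size-preserving 3x3 convolutions each followed by 2x2 max pooling). They are not the exact layers of either repository.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer along one dimension."""
    return (n + 2 * padding - kernel) // stride + 1


def stack_size(n, layers):
    """Apply a sequence of (kernel, stride, padding) layers to input size n."""
    for kernel, stride, padding in layers:
        n = out_size(n, kernel, stride, padding)
    return n


# Assumed stride-2 stack: four 3x3 convs, stride 2, padding 1.
strided = [(3, 2, 1)] * 4

# Hypothetical pooling stack: 3x3 conv (stride 1, padding 1) keeps the size,
# then a 2x2 max pool with stride 2 halves it; repeated four times.
pooled = [(3, 1, 1), (2, 2, 0)] * 4

print(stack_size(42, strided))  # 42 -> 21 -> 11 -> 6 -> 3
print(stack_size(80, strided))  # 80 -> 40 -> 20 -> 10 -> 5
print(stack_size(80, pooled))   # 80 -> 40 -> 20 -> 10 -> 5
```

Under these assumptions both approaches shrink an 80x80 frame to the same final map size, so the choice is about how the downsampling is done rather than how much.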

404akhan commented 7 years ago

Thanks, please close the issue if you want.