Closed 404akhan closed 7 years ago
Since the agent has to learn both to see and to play the game, I wanted to make sure it had more than enough feature detectors for the images. The pooling was to reduce the overall model size, and pooling is proven to be very effective in vision models. It wasn't so much because v0 is harder; I just thought the original conv layers were insufficient and wanted to see how much the agent could learn if the model had better tools to decipher the scene.
The weight init is from the universe-starter-agent, borrowed via its PyTorch port in the pytorch-a3c repo. It seemed sufficient, so I saw no need to change it.
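For reference, pytorch-a3c uses a normalized-columns init for the final actor/critic linear layers, something like the sketch below (function name from that repo; the layer sizes here are just illustrative):

```python
import torch

def normalized_columns_initializer(weights, std=1.0):
    # Scale each output row so its L2 norm equals `std`.
    # Small std for the actor head keeps initial policy logits near
    # uniform; std=1.0 is typically used for the critic head.
    out = torch.randn(weights.size())
    out *= std / torch.sqrt(out.pow(2).sum(1, keepdim=True))
    return out

# Hypothetical actor head: 6 actions, 256 hidden units.
w = normalized_columns_initializer(torch.empty(6, 256), std=0.01)
```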
Thanks for the reply.
1) Why not use stride=2 to reduce the model size (the pytorch-a3c repo does so) instead of pooling? Max pooling loses the spatial relation of where a part of an object came from, which I think is important in an RL scene, though I haven't tested its influence yet.
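To make the comparison concrete, here is a minimal sketch of the two downsampling options; both halve the spatial resolution, but the strided conv learns where to place its filters instead of discarding sub-window positions (the 42x42 input size is an assumption borrowed from the universe-starter-agent preprocessing):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 42, 42)  # hypothetical single-frame input

# Option A: conv at stride 1, then max pooling (what this repo does).
conv_pool = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1),
    nn.MaxPool2d(2),  # keeps only the max activation per 2x2 window
)

# Option B: fold the downsampling into the conv itself (pytorch-a3c style).
conv_stride = nn.Conv2d(1, 32, kernel_size=3, stride=2, padding=1)

print(conv_pool(x).shape)    # torch.Size([1, 32, 21, 21])
print(conv_stride(x).shape)  # torch.Size([1, 32, 21, 21])
```

Both produce the same output shape, so the choice is purely about what information survives the downsampling, not about model size.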
2) I also saw that in environment.py you use the class NormalizedEnv. Do you also think that adding batchnorm layers after each conv (and possibly after the fc) would help? I will try it myself as well.
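A minimal sketch of what such a block could look like (channel counts are hypothetical, not the repo's actual model); one caveat worth noting is that in A3C each worker often forwards a single observation at a time, so BatchNorm's batch statistics can be unreliable there:

```python
import torch
import torch.nn as nn

# One conv block with BatchNorm inserted between the conv and the
# nonlinearity; BatchNorm2d normalizes each channel over the batch.
block = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=3, stride=2, padding=1),
    nn.BatchNorm2d(32),
    nn.ReLU(),
)

y = block(torch.randn(4, 1, 42, 42))  # batch of 4 hypothetical frames
```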
Thanks, please close the issue if you want.
In your model you have 4 conv layers and max pooling, whereas 1) DQN (2015) used only 3 conv layers without pooling, and 2) A3C (2016) used only 2 conv layers without pooling.
Questions: 1) Don't you think pooling actually loses spatial information about the RL scene, which IMO is important? Why did you decide to use pooling instead of increasing the stride to 2? 2) Why did you decide to use 4 conv layers? Possibly because the gym v0 environments are harder? 3) Any particular reason for the specific weight init of the final actor/critic linear layers?
Thanks.