werner-duvaud / muzero-general

MuZero
https://github.com/werner-duvaud/muzero-general/wiki/MuZero-Documentation
MIT License

Average pooling after residual tower #15

Closed by fidel-schaposnik 4 years ago

fidel-schaposnik commented 4 years ago

Hi! Could you provide a reference for the average pooling layer inserted in the residual networks, after the tower and before the value, reward and policy heads? I can't seem to find any sign of it in the papers...

ahainaut commented 4 years ago

Hi, you are right, this layer is not in the papers. We decided to add an average pooling layer to reduce computation time when the hidden state is large. In any case, setting both pooling parameters (self.pooling_size and self.pooling_stride) to 1 turns the pooling into an identity operation, so the output of the tower is left unchanged.
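
To illustrate the point about the parameters, here is a minimal pure-Python sketch of 2D average pooling. It is not the repository's code: the function `avg_pool_2d` and the toy `hidden_state` are hypothetical names used only for this example, and it assumes a square window with no padding (the same arithmetic that a layer such as `torch.nn.AvgPool2d` performs per channel). With size 1 and stride 1 every window holds a single value, so the input comes back unchanged; with size 2 and stride 2 the spatial size is halved.

```python
def avg_pool_2d(grid, pooling_size, pooling_stride):
    """Average-pool a 2D list of floats with a square window and no padding.

    Hypothetical helper for illustration only, not part of muzero-general.
    """
    height, width = len(grid), len(grid[0])
    out_h = (height - pooling_size) // pooling_stride + 1
    out_w = (width - pooling_size) // pooling_stride + 1
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            top, left = i * pooling_stride, j * pooling_stride
            # Collect every value covered by the window at this position.
            window = [
                grid[top + di][left + dj]
                for di in range(pooling_size)
                for dj in range(pooling_size)
            ]
            row.append(sum(window) / len(window))
        pooled.append(row)
    return pooled


# Toy 4x4 "hidden state" for a single channel (made-up values).
hidden_state = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
    [13.0, 14.0, 15.0, 16.0],
]

# Size 1, stride 1: each window is one element, so pooling is the identity.
identity = avg_pool_2d(hidden_state, pooling_size=1, pooling_stride=1)

# Size 2, stride 2: each output is the mean of a 2x2 block, giving a 2x2 grid.
reduced = avg_pool_2d(hidden_state, pooling_size=2, pooling_stride=2)

print(identity == hidden_state)
print(reduced)
```

The second call shows the trade-off mentioned above: the heads receive four times fewer spatial positions, at the cost of averaging away some detail.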