Question about the pooling layer

facebookresearch / SlowFast

PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.

Apache License 2.0

Hi, thank you for sharing this great codebase. I am looking at the SlowFast model builder and found an extra pooling layer between the res2 and res3 stages. I couldn't find this layer in the paper.

https://github.com/facebookresearch/SlowFast/blob/master/slowfast/models/video_model_builder.py#L219-L225

In addition, both the kernel size and the stride are [1,1,1], so it seems that no pooling is actually performed: the output has the same shape as the input.
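To make the shape argument concrete, here is a minimal sketch of the usual pooling output-size formula, floor((n + 2p - k) / s) + 1 per dimension. It assumes no dilation and floor rounding. The helper name `pool_output_shape` and the (T, H, W) feature map size are made up for illustration and are not taken from the repository.

```python
def pool_output_shape(input_shape, kernel_size, stride, padding=(0, 0, 0)):
    """Return the (T, H, W) output shape of a 3D pooling layer.

    Applies floor((n + 2p - k) / s) + 1 to each dimension,
    assuming no dilation and floor rounding.
    """
    return tuple(
        (n + 2 * p - k) // s + 1
        for n, k, s, p in zip(input_shape, kernel_size, stride, padding)
    )


# Hypothetical (T, H, W) feature map size, chosen only for illustration.
feature_shape = (32, 56, 56)

# Kernel [1,1,1] with stride [1,1,1]: every dimension is preserved.
identity_shape = pool_output_shape(feature_shape, (1, 1, 1), (1, 1, 1))

# For contrast, kernel [2,1,1] with stride [2,1,1] halves the temporal dimension.
temporal_pool_shape = pool_output_shape(feature_shape, (2, 1, 1), (2, 1, 1))

print(identity_shape)
print(temporal_pool_shape)
```

Since the maximum over a 1x1x1 window is just the element itself, such a layer would also leave the values unchanged, not only the shape.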

Moreover, the paper says on page 3:

In our instantiations, we use no temporal downsampling layers (neither temporal pooling nor time-strided convolutions) throughout the Fast pathway, until the global pooling layer before classification.

So according to the paper this pooling layer shouldn't be there, at least for the Fast pathway. Could you clarify why this pooling layer is needed? Thank you.

Question about the pooling layer #11