roytseng-tw / Detectron.pytorch

A PyTorch implementation of Detectron. Both training from scratch and inference directly from pretrained Detectron weights are supported.

Why is there no ReLU in the FPN output? #114

Closed · bowenc0221 closed this issue 6 years ago

bowenc0221 commented 6 years ago

Hi @roytseng-tw ,

Thanks for the nice work. I have a question about the FPN code: why is there no ReLU activation after the post-hoc scale-specific 3x3 convolutions or the lateral connections?

roytseng-tw commented 6 years ago

First of all, there is no ReLU there in Detectron's FPN implementation either. Here's my opinion on why (see the sketch after this list):

  1. Post-hoc scale-specific 3x3 convs: their outputs are used to extract RoI features, so it's better for the feature maps to stay dense rather than have a rectifier zero out all negative values.
  2. Lateral convs: their outputs are fused by element-wise addition with the upsampled features from one level above, so you want the two operands to have the same kind of distribution.
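A minimal sketch of the pathway under discussion, assuming the standard FPN layout from the paper (the class and variable names here are illustrative, not identifiers from this repo). Note that no ReLU follows the lateral 1x1 convs, the top-down additions, or the post-hoc 3x3 convs:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleFPN(nn.Module):
    """Illustrative FPN top-down pathway with no non-linearities."""

    def __init__(self, in_channels_list, out_channels=256):
        super().__init__()
        # 1x1 lateral convs project each backbone stage to a common width.
        self.lateral_convs = nn.ModuleList(
            nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels_list
        )
        # 3x3 post-hoc convs smooth the merged maps; their outputs feed
        # RoI feature extraction, hence the preference for dense outputs.
        self.posthoc_convs = nn.ModuleList(
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
            for _ in in_channels_list
        )

    def forward(self, features):
        # `features` is ordered from the highest-resolution stage (e.g. C2)
        # down to the lowest (e.g. C5).
        laterals = [conv(f) for conv, f in zip(self.lateral_convs, features)]
        # Top-down pass: upsample the coarser map and add it to the lateral
        # output one level below. Addition (not concatenation) is why the
        # two operands should share the same kind of distribution.
        for i in range(len(laterals) - 2, -1, -1):
            laterals[i] = laterals[i] + F.interpolate(
                laterals[i + 1], size=laterals[i].shape[-2:], mode="nearest"
            )
        # Final 3x3 convs, again with no activation.
        return [conv(l) for conv, l in zip(self.posthoc_convs, laterals)]

# Hypothetical usage with ResNet-50-like stage widths:
feats = [torch.randn(1, c, 64 // 2 ** i, 64 // 2 ** i)
         for i, c in enumerate([256, 512, 1024, 2048])]
outs = SimpleFPN([256, 512, 1024, 2048])(feats)  # four 256-channel maps
```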

ppwwyyxx commented 6 years ago

This exact question is answered in the paper already:

> There are no non-linearities in these extra layers, which we have empirically found to have minor impacts.

bowenc0221 commented 6 years ago

Thanks!