zylo117 / Yet-Another-EfficientDet-Pytorch

A PyTorch re-implementation of the official EfficientDet, with SOTA real-time performance and pretrained weights.

Is there a redundant layer definition in BiFPN? #412

Open seekFire opened 4 years ago

seekFire commented 4 years ago

In the class BiFPN, the layer self.p4_down_channel has the same definition as the layer self.p4_down_channel_2, and the same holds for self.p5_down_channel and self.p5_down_channel_2. So why do you use self.p4_down_channel_2 to generate p4_in again, rather than reusing the p4_in already produced by self.p4_down_channel? The code in question is:

if self.first_time:
    # re-project P4/P5 for the output path with the separately defined "_2" layers
    p4_in = self.p4_down_channel_2(p4)
    p5_in = self.p5_down_channel_2(p5)
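For context, the duplicated definition looks roughly like the sketch below. This is a simplified stand-in, not the repo's exact code (model.py wraps the 1x1 conv in Conv2dStaticSamePadding); the point it illustrates is that two identically defined modules still carry two independent sets of weights:

    import torch
    import torch.nn as nn

    class DownChannelSketch(nn.Module):
        # Simplified sketch of the duplicated layer definitions in BiFPN.__init__
        # (the real repo uses Conv2dStaticSamePadding instead of plain nn.Conv2d).
        def __init__(self, in_channels=112, num_channels=64):
            super().__init__()
            self.p4_down_channel = nn.Sequential(
                nn.Conv2d(in_channels, num_channels, 1),
                nn.BatchNorm2d(num_channels, momentum=0.01, eps=1e-3),
            )
            # Identical definition, but a distinct module object,
            # so it holds its own independently trained parameters.
            self.p4_down_channel_2 = nn.Sequential(
                nn.Conv2d(in_channels, num_channels, 1),
                nn.BatchNorm2d(num_channels, momentum=0.01, eps=1e-3),
            )

    m = DownChannelSketch().eval()
    p4 = torch.randn(1, 112, 16, 16)
    # Same shapes out, but different weights, hence different outputs:
    print(torch.allclose(m.p4_down_channel(p4), m.p4_down_channel_2(p4)))  # False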
zylo117 commented 4 years ago

Can you understand the illustration of BiFPN here? https://github.com/zylo117/Yet-Another-EfficientDet-Pytorch/blob/master/efficientdet/model.py#L152

Every line in that illustration represents an individual input and output, so there are no shared weights within one BiFPN module. As for whether they could be shared, you would have to experiment. I think it's okay to share those two layers, but since the original author keeps them separate, I do the same.
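To make the separate-vs-shared distinction concrete, here is a short sketch (not the repo's code): sharing would simply mean calling the same module object from both paths, so one weight set serves both call sites and receives gradients from both.

    import torch
    import torch.nn as nn

    p4 = torch.randn(1, 112, 16, 16)

    # Separate (what this repo and the official implementation do):
    # two module objects, two independent weight sets.
    down = nn.Conv2d(112, 64, 1)
    down_2 = nn.Conv2d(112, 64, 1)
    p4_in_topdown = down(p4)     # feeds the top-down path
    p4_in_bottomup = down_2(p4)  # feeds the bottom-up path

    # Shared (the experiment suggested above): reuse one module object,
    # so both paths use, and update, the same weights.
    down_shared = nn.Conv2d(112, 64, 1)
    p4_in_topdown = down_shared(p4)
    p4_in_bottomup = down_shared(p4)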

seekFire commented 4 years ago

@zylo117 All right, but why do only p4_in and p5_in get individual layers, while p6_in is shared between the two paths?

zylo117 commented 4 years ago

They aren't shared; there are no down-channel ops on P6 and P7 at all, since those levels are generated from P5 inside the BiFPN rather than projected from backbone features.
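A rough sketch of the first_time data flow may help. The layer names mirror efficientdet/model.py, but the bodies here are placeholder convs and pooling, and all the fusion steps are elided: P6 and P7 are created from P5 inside the module, so there is no second projection to define for them, and p6_in/p7_in are simply the same tensors reused by both paths.

    import torch
    import torch.nn as nn

    class FirstTimeInputsSketch(nn.Module):
        # Placeholder ops; names mirror efficientdet/model.py, bodies do not.
        def __init__(self, c4=112, c5=320, num_channels=64):
            super().__init__()
            self.p4_down_channel = nn.Conv2d(c4, num_channels, 1)
            self.p5_down_channel = nn.Conv2d(c5, num_channels, 1)
            # Second projections exist only for P4 and P5.
            self.p4_down_channel_2 = nn.Conv2d(c4, num_channels, 1)
            self.p5_down_channel_2 = nn.Conv2d(c5, num_channels, 1)
            # P6/P7 are generated from P5, not projected from the backbone,
            # so there is nothing to "down-channel" a second time.
            self.p5_to_p6 = nn.Sequential(
                nn.Conv2d(c5, num_channels, 1), nn.MaxPool2d(3, 2, 1))
            self.p6_to_p7 = nn.MaxPool2d(3, 2, 1)

        def forward(self, p4, p5):
            p6_in = self.p5_to_p6(p5)
            p7_in = self.p6_to_p7(p6_in)
            # The top-down path consumes the first projections...
            p4_td, p5_td = self.p4_down_channel(p4), self.p5_down_channel(p5)
            # ...the bottom-up path re-projects with the "_2" layers,
            # while p6_in and p7_in are reused unchanged in both paths.
            p4_bu, p5_bu = self.p4_down_channel_2(p4), self.p5_down_channel_2(p5)
            return p4_td, p5_td, p4_bu, p5_bu, p6_in, p7_in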