zhanghang1989 / ResNeSt

ResNeSt: Split-Attention Networks
https://arxiv.org/abs/2004.08955
Apache License 2.0

Issue with PyTorch dropblock #100

Open samjkwong opened 4 years ago

samjkwong commented 4 years ago

Hello! In the PyTorch version of Bottleneck, it appears that self.dropblock2 is never used.

self.dropblock2 is initialized in lines 52-56 if radix == 1: https://github.com/zhanghang1989/ResNeSt/blob/master/resnest/torch/resnet.py#L52-L56

        if dropblock_prob > 0.0:
            self.dropblock1 = DropBlock2D(dropblock_prob, 3)
            if radix == 1:
                self.dropblock2 = DropBlock2D(dropblock_prob, 3)
            self.dropblock3 = DropBlock2D(dropblock_prob, 3)

But it is only called in lines 107-110 if radix == 0: https://github.com/zhanghang1989/ResNeSt/blob/master/resnest/torch/resnet.py#L107-110

        if self.radix == 0:
            out = self.bn2(out)
            if self.dropblock_prob > 0.0:
                out = self.dropblock2(out)

Meanwhile, in the Gluon version of Bottleneck, self.dropblock2 is used regardless of whether self.use_splat is true. Just wanted to point this out, and ask whether the drop in accuracy you mentioned for the PyTorch implementation relative to the Gluon implementation could be attributed to this inconsistency.
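
For reference, a minimal sketch (not the repo's actual fix) of one way to reconcile the two conditions, assuming the intent is the same as in the Gluon version, i.e. dropblock2 should exist whenever dropblock_prob > 0.0 so the radix == 0 branch in forward() can call it:

        if dropblock_prob > 0.0:
            self.dropblock1 = DropBlock2D(dropblock_prob, 3)
            # Hypothetical change: create dropblock2 unconditionally (as Gluon does)
            # rather than only when radix == 1, so the existing radix == 0 call
            # site in forward() does not reference a missing attribute.
            self.dropblock2 = DropBlock2D(dropblock_prob, 3)
            self.dropblock3 = DropBlock2D(dropblock_prob, 3)

The call site in forward() (lines 107-110 above) would then stay as it is.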

zhanghang1989 commented 4 years ago

Thanks for pointing out the issue in the PyTorch code! Will take a look.