kaiyuyue / cgnl-network.pytorch

Compact Generalized Non-local Network (NIPS 2018)
https://arxiv.org/abs/1810.13125
MIT License

Question about experiments #5

Closed lxtGH closed 5 years ago

lxtGH commented 5 years ago

Hi! Thanks for your code and paper. I have several questions about this work. (a) In your paper, the results are about 1% lower than this repo's — why? (b) In your paper, you also insert 5 NL blocks into ResNet; what are the specific positions of these blocks? (c) Did you also insert 5 NL/CGNL blocks when training on ImageNet? Many thanks!

lxtGH commented 5 years ago

Also, I only got 86.4 with ResNet-50 + CGNL across 4 training runs. How can I reach 87 on the val set?

kaiyuyue commented 5 years ago

Hi, thanks for your interest.

We add 1 block (to res4), 5 blocks (3 to res4 and 2 to res3, at every other residual block), and 10 blocks (after every residual block in res3 and res4) in ResNet-50; in ResNet-101 we add them at the corresponding residual blocks.
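To make the "every other residual block" placement concrete, here is a minimal sketch of how the insertion indices could be chosen. This is a hypothetical helper, not the repo's actual code; the exact starting index is an assumption.

```python
# Hypothetical sketch (not the repo's actual code): pick which residual
# blocks in a ResNet stage are followed by an NL/CGNL block, spacing the
# inserts at every other block as described above.

def every_other_positions(num_blocks, num_inserts):
    """Return indices of residual blocks to be followed by an NL/CGNL
    block, spaced every other block (starting index is an assumption)."""
    return list(range(0, num_blocks, 2))[:num_inserts]

# ResNet-50 example for the 5-block setting: res4 (conv4_x) has 6
# residual blocks and gets 3 inserts; res3 (conv3_x) has 4 and gets 2.
print(every_other_positions(6, 3))  # -> [0, 2, 4]
print(every_other_positions(4, 2))  # -> [0, 2]
```

In the actual model the chosen positions would be used when building the stage, wrapping each selected residual block together with an NL/CGNL module.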

lxtGH commented 5 years ago

Hi! Many thanks. I also have a question about training on ImageNet: did you use pretrained weights on ImageNet as well, like when training on CUB?

kaiyuyue commented 5 years ago

Yes, if I understand you correctly. I use the ImageNet pretrained weights to train the model with 1 CGNL block on ImageNet. I also use a warmup strategy.
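For reference, a common form of warmup is a linear ramp of the learning rate over the first steps. The sketch below is an assumption for illustration — the repo's actual warmup length and base learning rate are not stated in this thread.

```python
# Hypothetical linear warmup schedule (warmup length and base LR are
# assumptions; the repo's exact settings are not given in this thread).

def warmup_lr(step, warmup_steps, base_lr):
    """Linearly ramp the LR from base_lr/warmup_steps up to base_lr over
    warmup_steps, then hold it (a decay schedule would take over later)."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr

print(warmup_lr(0, 100, 0.1))   # first step: small LR
print(warmup_lr(99, 100, 0.1))  # last warmup step: full base LR
```

Warmup like this helps stabilize the early training of the newly added CGNL block while the pretrained backbone weights are being fine-tuned.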

lxtGH commented 5 years ago

Hi! Why did you use a pretrained model rather than training from scratch? @KaiyuYue

kaiyuyue commented 5 years ago

Hi~! Starting from a pre-trained model when training on ImageNet with CGNL/NL modules gives better accuracy, so we chose this training scheme from the very first experiments. We did not try training from scratch. You are welcome to share your results if you train from scratch.