Ugness / PiCANet-Implementation

Pytorch Implementation of PiCANet: Learning Pixel-wise Contextual Attention for Saliency Detection
MIT License

all of the conv kernels in DecoderCell(nn.Module) should be set to 1x1 #8

Closed (Sucran closed this issue 6 years ago)

Sucran commented 6 years ago

Hi @Ugness. From the paper, page 3094, Section 5.3 (Network structure): in the decoding modules, all of the convolutional kernels in Figure 3(b) are set to 1x1. But you set the kernel size of conv1 and conv2 to 3x3, with the comment "# not specified in paper". I think this part should be adjusted.

Besides, conv5 should have a dilation of 2 (page 3092, Section 4, paragraph 2), but the code just builds a VGG without dilation.
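
To make the first point concrete, here is a minimal sketch comparing a 3x3 and a 1x1 convolution in PyTorch. The channel counts and input size are hypothetical and are not taken from the repo's DecoderCell; the only point is that both layers keep the spatial size while the 1x1 kernel uses far fewer weights.

```python
import torch
import torch.nn as nn

# Hypothetical channel sizes, chosen only for illustration
# (not the actual DecoderCell configuration).
in_ch, out_ch = 64, 32

# 3x3 kernel with padding=1 keeps the spatial size.
conv3x3 = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
# 1x1 kernel needs no padding to keep the spatial size.
conv1x1 = nn.Conv2d(in_ch, out_ch, kernel_size=1)


def n_params(module):
    """Count all learnable parameters (weights and biases)."""
    return sum(p.numel() for p in module.parameters())


x = torch.randn(1, in_ch, 28, 28)
y3 = conv3x3(x)
y1 = conv1x1(x)

print(tuple(y3.shape), n_params(conv3x3))
print(tuple(y1.shape), n_params(conv1x1))
```

Switching the kernel size changes the weight tensor shape, so a checkpoint trained with 3x3 decoder kernels would not load into the 1x1 version without retraining those layers.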

Ugness commented 6 years ago

I didn't know that. I'll adjust the code and test it ASAP. Thanks a lot. But I'm afraid that if I fix the network structure, the existing pre-trained weights will no longer work with it. Since I don't know much about GitHub, do you have any idea how to keep both models available (for example, by making a branch)?

Sucran commented 6 years ago

Yes, the pre-trained weights will not work anymore if we still load torchvision's pre-trained vgg16 weights, because convolution kernels with dilation have a larger parameter size than those without dilation. I also have no idea how to convert a pre-trained conv kernel into a conv kernel with dilation. If I find an approach that works, I will fork your repo and submit a pull request.

Ugness commented 6 years ago

Thank you.

Sucran commented 6 years ago

Hi @Ugness, I found something! Using dilation 2 in conv5 and changing the stride of pool4 and pool5 from 2 to 1 both come from DeepLabV2; the authors of PiCANet directly load the caffemodel provided on the DeepLab website (the same goes for ResNet). So I looked for a PyTorch repo for DeepLabV2 and found a script that converts a caffemodel to a .pth file: https://github.com/kazuto1011/deeplab-pytorch/blob/master/convert.py If you have time, you can add this module to your repo and obtain the pre-trained weights for the correct network.
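
As a rough illustration of what these two changes do, the sketch below computes the overall downsampling factor of a VGG16-style backbone and the effective extent of a dilated kernel. The helper names are hypothetical, and the stride lists assume the standard five 2x pooling layers of VGG16 with only pool4 and pool5 changed, as described above.

```python
from math import prod


def output_stride(pool_strides):
    """Overall downsampling factor: the product of all pooling strides."""
    return prod(pool_strides)


def effective_kernel(kernel_size, dilation):
    """Spatial extent covered by a dilated kernel along one axis."""
    return dilation * (kernel_size - 1) + 1


# pool1..pool5 of a standard VGG16 (assumed: every pool has stride 2).
vgg16_strides = [2, 2, 2, 2, 2]
# Same backbone with pool4 and pool5 changed from stride 2 to stride 1.
modified_strides = [2, 2, 2, 1, 1]

print(output_stride(vgg16_strides))     # downsampling of the plain backbone
print(output_stride(modified_strides))  # downsampling after the stride change
print(effective_kernel(3, 2))           # extent of a 3x3 kernel with dilation 2
```

With the two strides set to 1 the feature map stays four times larger per side, and the dilation in conv5 widens the area each 3x3 kernel covers on that larger map.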

Ugness commented 6 years ago

I adjusted the code and started training. I hope the results will be better. You can find the adjusted code on another branch. I did not fully understand your DeepLabV2 advice, though. Did you mean loading a pre-trained ResNet backbone from DeepLabV2? I can load a ResNet backbone from torchvision.models.resnet, but I have not been able to start on the ResNet backbone because of my schedule. Anyway, thank you so much.

Ugness commented 6 years ago

Ah, I see your point. I hadn't noticed the backbone problem, but PyTorch automatically matched the weights (dilation 1 -> dilation 2). 👍
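
A minimal sketch of why the weights match: in PyTorch the weight tensor of nn.Conv2d has shape (out_channels, in_channels, kH, kW), which does not depend on dilation, so a state dict from an undilated layer loads into a dilated one unchanged. The channel count and input size below are hypothetical, and this says nothing about how well the reused weights perform.

```python
import torch
import torch.nn as nn

# Hypothetical small channel count to keep the example fast.
channels = 8

# Undilated 3x3 conv, as in a stock VGG block.
plain = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
# Same layer with dilation 2; padding=2 keeps the spatial size.
dilated = nn.Conv2d(channels, channels, kernel_size=3, padding=2, dilation=2)

# Dilation changes where the kernel samples, not how many weights it has,
# so the shapes agree and the state dict loads without error.
dilated.load_state_dict(plain.state_dict())

x = torch.randn(1, channels, 14, 14)
out_plain = plain(x)
out_dilated = dilated(x)

print(tuple(plain.weight.shape), tuple(dilated.weight.shape))
print(tuple(out_plain.shape), tuple(out_dilated.shape))
```

The two layers share identical weights after loading but produce different outputs, since the dilated kernel samples input positions two pixels apart.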

Thanks a lot.

Sucran commented 6 years ago

@Ugness So PyTorch can automatically match the weights when the dilation is changed from 1 to 2? Amazing~ I hope the new model can reach the authors' performance.

Ugness commented 6 years ago

I hope so too. I may be able to report the results of the new model this weekend or next week.

Sucran commented 6 years ago

Thank you so much. Maybe it is time to close this issue.