utkuozbulak / pytorch-cnn-visualizations

Pytorch implementation of convolutional neural network visualization techniques
MIT License

ResNet: "unfolding" the model to access conv2d layers #82

Closed mbary closed 4 years ago

mbary commented 4 years ago

First of all thank you for the amazing work!

This is more of a question than an issue; I can't seem to add the "question" label to it.

I am using your Score-CAM implementation on both ResNet50 and 101.

As you know, and as has been mentioned in a couple of threads, you set it up on VGG, which is conveniently divided into a feature extractor and a classifier, and each module is easily accessible from within the network. With ResNet, however, it becomes slightly trickier due to the bottleneck architecture.

Could you please advise me whether my approach is correct?

    def forward_pass_on_convolutions(self, x):
        conv_output = None
        for module_pos, module in self.model._modules.items():
            print(module_pos)
            x = module(x)  # Forward
            print(f"x shape after {module_pos}", x.shape)
            if module_pos == self.target_layer:
                conv_output = x  # Save the convolution output on that layer
                return conv_output, x
        return conv_output, x  # Fallback if the target layer was never matched

Here, module in module(x) is not a single Conv2d layer but a Sequential of bottlenecks. It is possible to pass an input through a Sequential like that, but I am uncertain whether that is the correct approach.

Does it simply mean less granularity, as it limits us to 5 convolutional layers (conv1, layer1-4) rather than ~50 convolutional layers (given we "unfold" the model)?

It is possible and relatively easy to access the inner layers and bottlenecks, but is it necessary? It certainly seems to yield good results!

Conv2d output from layer2: [image]

Conv2d output from layer3: [image]

For those interested, other changes included:

    def forward_pass(self, x):
        # Forward pass on the convolutions
        conv_output, x = self.forward_pass_on_convolutions(x)

        # Forward pass on the classifier:
        # adaptive average pooling first
        x = self.model.avgpool(x)

        # Redefine the FC to match the conv layer output and num of classes
        fc_in_features = x.shape[1]
        self.model.fc = nn.Linear(fc_in_features, 10)

        x = x.view(x.size(0), -1)  # Flatten
        x = self.model.fc(x)
        return conv_output, x
utkuozbulak commented 4 years ago

Hey,

Indeed, there have been many requests (by email, GitHub, and even Facebook) for a ResNet example, and I have been reluctant to provide one. The reason is not that it is particularly hard, but that there are new architectures with funky layers every other month, not to mention the custom models people come up with. I fear that the moment there is an example for a non-standard model (non-standard in the sense that the layers are not simply separated into features -> classifier as in AlexNet/VGG, not that ResNet/Inception/DenseNet etc. are used any less than AlexNet or VGG), people who do not want to invest time into learning what is going on underneath will just ask for more. Not to mention problems with the torch version (I'm still using torch 0.4.1).

The trick with ResNet, or any other architecture that contains nested layers, is targeting the correct layer. If I remember correctly, the fastest way I did it was by using Module.modules() or Module.children() and changing a couple of lines of code. This way I was able to target the last convolution layer of the ResNet bottleneck. You can use a similar approach to target whichever layer you want within the bottleneck.

Also, evaluating the visualizations is a tricky task. It is surprisingly easy to fall into selection bias.

Good luck

mbary commented 3 years ago

Hi Richard!

For whatever reason, I cannot view your question on GitHub; did you delete it? Also, out of curiosity, how exactly did you manage to email me? I was certain my email was private on GitHub.

The package was built around the VGG network, which is explicitly divided into two parts: a feature-extraction network and a classification network. The ResNet architecture does not have such a distinction, so the package does not 'recognise' these layers in the network.

The "manual" part refers to simply passing the activations through these layers (red):

    # AdaptiveAveragePooling
    x = self.model.avgpool(x)

    # Redefine the FC to match the conv layer output and num of classes
    fc_in_features = x.shape[1]
    self.model.fc = nn.Linear(fc_in_features, 10)

    x = x.view(x.size(0), -1)
    x = self.model.fc(x)
    return conv_output, x

If you compare it to the original code, these lines are not there: https://github.com/utkuozbulak/pytorch-cnn-visualizations/blob/master/src/scorecam.py

Redefining the in/out_features (green) has to be done because, depending on your inputs and how they are processed, the number of features varies. That way, no matter what the input dimensionality is, the features will always match.

Apart from that, I've found this solution (Score-CAM in general) quite problematic. Whenever I was extracting the visualizations, I found that it would completely change all of my weights (despite using torch.no_grad()), ruining the trained weights and breaking the model. I was pressed for time, so I never fully solved this issue.

I hope that clarifies things for you.

Regards.

On Wed, Nov 11, 2020 at 11:08 AM Richard Vijgen notifications@github.com wrote:

Hi mbary,

I am trying to replicate your approach to apply Score-CAM to resnet101 and replaced forward_pass_on_convolutions and forward_pass as described. You also describe two additional steps:

  • 'manually' passing it through the classification layers (avgpool and fc)
  • dynamically redefining the fully connected layer so that the in_features match the out_features of the previous layer (avgpool)

I don't fully understand these steps. Could you perhaps elaborate or share some code to point me in the right direction?


rvijgen commented 3 years ago

Hi,

Thank you for taking the time to answer me. Yes, I did post the question on Github and then delete it because I wasn’t sure I phrased it right (so I don’t have your email :-)). Thanks again for elaborating, this is very helpful.

All the best,

Richard
