Pangoraw / PaDiM

:dolphin: Re-implementation of PaDiM and code for the article "Weakly Supervised Detection of Marine Animals in High Resolution Aerial Images"
https://arxiv.org/abs/2011.08785
MIT License

How is the embedding_size for the backbones calculated? #6

Closed: SDJustus closed this issue 3 years ago

SDJustus commented 3 years ago

Hi,

I am trying to implement EfficientNet-B5 from the paper in my fork, and I am wondering how you calculated the embedding_size for each of the backbones you have implemented so far.

Best regards Justus

SDJustus commented 3 years ago

This is my implementation (I have not fetched your latest backbone refactoring yet):

from typing import Tuple

from torch import Tensor
from torch.nn import Module
from efficientnet_pytorch import EfficientNet

class EfficientNetB5(Module):
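    # NOTE: copied from ResNet-18; this value is wrong for EfficientNet-B5
    # (see the discussion below)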
    embeddings_size = 448
    num_patches = 32 * 32

    def __init__(self) -> None:
        super().__init__()
        self.efficientnetb5 = EfficientNet.from_pretrained('efficientnet-b5')

    def forward(self, x: Tensor) -> Tuple[Tensor, Tensor, Tensor]:
        """Return the three intermediary layers from the EfficientNetB5
        pre-trained model.
        Params
        ======
            x: Tensor - the input tensor of size (b * c * w * h)
        Returns
        =======
            feature_1: Tensor - the residual from layer 1
            feature_2: Tensor - the residual from layer 2
            feature_3: Tensor - the residual from layer 3
        """
        x = self.efficientnetb5._conv_stem(x)
        x = self.efficientnetb5._bn0(x)

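        # Only the outputs of blocks 1, 3 and 4 are returned as features;
        # blocks 0 and 2 are evaluated just to feed the following block.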
        feature_not_using_1 = self.efficientnetb5._blocks[0](x)
        feature_1 = self.efficientnetb5._blocks[1](feature_not_using_1)
        feature_not_using_2 = self.efficientnetb5._blocks[2](feature_1)
        feature_2 = self.efficientnetb5._blocks[3](feature_not_using_2)
        feature_3 = self.efficientnetb5._blocks[4](feature_2)

        return feature_1, feature_2, feature_3

Sadly, the embedding size is wrong (it was copied from ResNet-18).

Pangoraw commented 3 years ago

The embedding size is the sum of the channel depths of feature_1, feature_2 and feature_3.

l1 = feature_1.size(1)
l2 = feature_2.size(1)
l3 = feature_3.size(1)
num_embeddings = l1 + l2 + l3
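
For example, a quick way to check this for the EfficientNetB5 backbone above is to run a dummy batch through it and sum the channel depths of the returned feature maps (a minimal sketch; the 416x416 input resolution is an arbitrary choice, the channel depths do not depend on it):

import torch

# Instantiate the backbone defined above and run a dummy batch through it.
backbone = EfficientNetB5()
backbone.eval()

with torch.no_grad():
    # The spatial resolution is arbitrary; channel depths are fixed.
    dummy = torch.randn(1, 3, 416, 416)
    f1, f2, f3 = backbone(dummy)

# Sum the channel depths of the three feature maps.
embedding_size = f1.size(1) + f2.size(1) + f3.size(1)
print("embeddings_size =", embedding_size)

Whatever this prints is the value to use for the embeddings_size class attribute.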

SDJustus commented 3 years ago

Sorry for the late response. Closing this issue. Thank you very much for answering!