polarisZhao / PFLD-pytorch

PFLD pytorch Implementation

PFLD 0.25X implementation #23

Open · in-die-nibelungen opened 4 years ago

in-die-nibelungen commented 4 years ago

I'd like to try out PFLD 0.25X performance, but it's not provided. I made some changes to __init__() of pfld.py as follows:

class PFLDInference(nn.Module):
    def __init__(self, width_mult=1.0):
        super(PFLDInference, self).__init__()
        assert width_mult in [0.25, 1.0]
        # Scale the backbone width; the head (conv6_1 onward) keeps its
        # fixed channel counts, so the fc input stays 16 + 32 + 128 = 176.
        layer_channels = int(64 * width_mult)
        layer_channels2 = layer_channels * 2
        self.conv1 = nn.Conv2d(
            3, layer_channels, kernel_size=3, stride=2, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(layer_channels)
        self.relu = nn.ReLU(inplace=True)

        self.conv2 = nn.Conv2d(
            layer_channels, layer_channels, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(layer_channels)

        self.conv3_1 = InvertedResidual(layer_channels, layer_channels, 2, False, 2)

        self.block3_2 = InvertedResidual(layer_channels, layer_channels, 1, True, 2)
        self.block3_3 = InvertedResidual(layer_channels, layer_channels, 1, True, 2)
        self.block3_4 = InvertedResidual(layer_channels, layer_channels, 1, True, 2)
        self.block3_5 = InvertedResidual(layer_channels, layer_channels, 1, True, 2)

        self.conv4_1 = InvertedResidual(layer_channels, layer_channels2, 2, False, 2)

        self.conv5_1 = InvertedResidual(layer_channels2, layer_channels2, 1, False, 4)
        self.block5_2 = InvertedResidual(layer_channels2, layer_channels2, 1, True, 4)
        self.block5_3 = InvertedResidual(layer_channels2, layer_channels2, 1, True, 4)
        self.block5_4 = InvertedResidual(layer_channels2, layer_channels2, 1, True, 4)
        self.block5_5 = InvertedResidual(layer_channels2, layer_channels2, 1, True, 4)
        self.block5_6 = InvertedResidual(layer_channels2, layer_channels2, 1, True, 4)

        self.conv6_1 = InvertedResidual(layer_channels2, 16, 1, False, 2)  # [16, 14, 14]

        self.conv7 = conv_bn(16, 32, 3, 2)  # [32, 7, 7]
        self.conv8 = nn.Conv2d(32, 128, 7, 1, 0)  # [128, 1, 1]
        self.bn8 = nn.BatchNorm2d(128)

        self.avg_pool1 = nn.AvgPool2d(14)
        self.avg_pool2 = nn.AvgPool2d(7)
        self.fc = nn.Linear(176, 196)

I'm thinking that I can get PFLD 0.25X with width_mult=0.25. Is this correct?
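As a sanity check (a rough sketch only; it assumes the modified PFLDInference above, the repo's 112x112 input size, and that forward returns the auxiliary features plus the landmarks as in pfld.py), one could compare the parameter counts of the two widths:

import torch

# Sketch: instantiate both widths and compare parameter counts.
net_1x = PFLDInference(width_mult=1.0)
net_quarter = PFLDInference(width_mult=0.25)

n_params_1x = sum(p.numel() for p in net_1x.parameters())
n_params_quarter = sum(p.numel() for p in net_quarter.parameters())
print(f"1X: {n_params_1x:,} params, 0.25X: {n_params_quarter:,} params")

# A forward pass on a dummy 112x112 crop also checks that the scaled
# channel counts still line up end to end.
net_quarter.eval()
with torch.no_grad():
    features, landmarks = net_quarter(torch.randn(2, 3, 112, 112))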

Thanks in advance.

cscribano commented 4 years ago

I'm interested too. I implemented width_mult following torchvision's MobileNetV2; this also changes the 1X version, since under that scheme conv5_1 (stride 1, equal input and output channels) should use a residual connection. In addition, the change in the number of channels in block3_5's output (16 instead of 64), which is used as the input to the AuxiliaryNet pose branch, must be dealt with.
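For reference, torchvision's MobileNetV2 enables the skip connection whenever a block preserves both resolution and width, which is why conv5_1 would become residual under that rule. A minimal sketch of the check:

# Torchvision-style rule: residual iff resolution and width are preserved.
def use_res_connect(inp: int, oup: int, stride: int) -> bool:
    return stride == 1 and inp == oup

# conv5_1 maps layer_channels2 -> layer_channels2 at stride 1, so it
# qualifies, unlike the hard-coded False in the current pfld.py.
assert use_res_connect(128, 128, 1)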

My intuition so far was to redefine AuxiliaryNet as follows:

class AuxiliaryNet(nn.Module):
    def __init__(self, c=64):
        super(AuxiliaryNet, self).__init__()
        # c matches the backbone's block3_5 output width:
        # c=64 for the 1X model, c=16 for 0.25X.
        self.conv1 = conv_bn(c, c * 2, 3, 2)
        self.conv2 = conv_bn(c * 2, c * 2, 3, 1)
        self.conv3 = conv_bn(c * 2, c // 2, 3, 2)
        self.conv4 = conv_bn(c // 2, c * 2, 7, 1)
        self.max_pool1 = nn.MaxPool2d(3)
        self.fc1 = nn.Linear(c * 2, c // 2)
        self.fc2 = nn.Linear(c // 2, 3)
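With the widths lined up, the wiring would look roughly like this (a sketch only; it assumes the width_mult backbone from the first comment and that block3_5's features feed AuxiliaryNet as in train.py):

import torch

width_mult = 0.25
backbone = PFLDInference(width_mult=width_mult)
# c must match block3_5's output width: 64 * 0.25 = 16 for the 0.25X model.
auxiliarynet = AuxiliaryNet(c=int(64 * width_mult))

backbone.eval()
auxiliarynet.eval()
with torch.no_grad():
    features, landmarks = backbone(torch.randn(2, 3, 112, 112))
    euler_angles = auxiliarynet(features)  # pose head output: [batch, 3]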

I am currently training this version, though I don't have high hopes that it will work.