Failure in implementation using PyTorch (sorry to open an issue here...)

When I implement this in PyTorch, the accuracy rises to 35% in the first stage, where only the FC layer is pretrained (around epoch 10). In the second stage, however, the accuracy drops to 20%. The key operations are the outer product, average pooling, signed square root, and L2 normalization. The code is as follows:

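import torch
import torch.nn as nn
import torch.nn.functional as F

# The definition of Bilinear CNN
# input: [batch, channel, height, width]
class VggBasedNet_bilinear(nn.Module):
    def __init__(self, originalModel):
        super(VggBasedNet_bilinear, self).__init__()
        # feature extraction from Conv5_3 with relu (drops the last layer of the VGG16 feature stack)
        self.features = nn.Sequential(*list(originalModel.features)[:-1])

        # args.numClasses is defined elsewhere in the fine-tuning script
        self.classifier = nn.Linear(512 * 512, args.numClasses)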

    def forward(self, x):
        # feature extraction from Conv5_3 with relu
        x = self.features(x).view(-1,512,784)

        # outer product of the features at each position over height*width; average pooling
        x = torch.matmul(x, x.permute(0,2,1)).view(-1,512*512)/784.0

        # signed sqrt
        x = torch.mul(torch.sign(x),torch.sqrt(torch.abs(x)+1e-12))

        # L2 normalization
        x = F.normalize(x, p=2, dim=1)

        # final FC layer
        x = self.classifier(x)

        return x

I am sure that nothing is wrong in the rest of the code, because I only changed the network structure in an existing VGG16 fine-tuning script. Is anyone here familiar with PyTorch? Is there a problem with the code above? Does each operation do what it is supposed to do?
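
One way to check the last question is to compare the matmul-based pooling against an explicit loop that averages the outer product at every position. The following is only a minimal sketch under stated assumptions: the function names `bilinear_pool` and `bilinear_pool_reference` are hypothetical, random non-negative tensors stand in for the real Conv5_3 features, and the sizes are much smaller than 512 x 784 so that it runs quickly.

```python
import torch
import torch.nn.functional as F


def bilinear_pool(feats):
    """Matmul-based pooling, using the same operations as forward() above."""
    b, c, n = feats.shape
    x = torch.matmul(feats, feats.permute(0, 2, 1)).view(b, c * c) / n
    x = torch.sign(x) * torch.sqrt(torch.abs(x) + 1e-12)  # signed sqrt
    return F.normalize(x, p=2, dim=1)                     # L2 normalization


def bilinear_pool_reference(feats):
    """Explicit reference: average of the outer products over all positions."""
    b, c, n = feats.shape
    pooled = torch.zeros(b, c, c, dtype=feats.dtype)
    for i in range(n):
        v = feats[:, :, i]                         # [b, c] features at position i
        pooled += v.unsqueeze(2) * v.unsqueeze(1)  # [b, c, c] outer product
    pooled = (pooled / n).view(b, c * c)
    x = torch.sign(pooled) * torch.sqrt(torch.abs(pooled) + 1e-12)
    return F.normalize(x, p=2, dim=1)


torch.manual_seed(0)
# Assumed stand-in for post-ReLU features: batch 2, 8 channels, 7 * 7 positions.
feats = torch.relu(torch.randn(2, 8, 49, dtype=torch.float64))
fast = bilinear_pool(feats)
slow = bilinear_pool_reference(feats)
print("max abs difference:", (fast - slow).abs().max().item())
```

If the two results agree up to floating-point error, the forward computation of the pooling step matches its description. This checks only the forward pass and says nothing about the training behaviour in the second stage.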

abhaydoke09 / Bilinear-CNN-TensorFlow
