abhaydoke09 / Bilinear-CNN-TensorFlow

This is an implementation of Bilinear CNN for fine-grained visual recognition using TensorFlow.

Failure in implementation using PyTorch (sorry to open an issue here...) #18

Open JingyunLiang opened 6 years ago

JingyunLiang commented 6 years ago

When I try to implement it in PyTorch, the accuracy rises to 35% in the first stage, where only the FC layer is trained (around epoch 10). In the second stage, however, the accuracy drops to 20%. The key operations are the outer product, average pooling, signed square root, and L2 normalization. The code is as follows:

import torch
import torch.nn as nn
import torch.nn.functional as F

# The definition of Bilinear CNN
# input: [batch, channel, height, width]
class VggBasedNet_bilinear(nn.Module):
    def __init__(self, originalModel):
        super(VggBasedNet_bilinear, self).__init__()
        # feature extraction from Conv5_3 with relu
        # (use the model passed in; [:-1] drops the final max-pooling layer)
        self.features = nn.Sequential(*list(originalModel.features)[:-1])

        self.classifier = nn.Linear(512 * 512, args.numClasses)

    def forward(self, x):
        # feature extraction from Conv5_3 with relu
        x = self.features(x).view(-1,512,784)

        # outer product of features at each position, summed over height*width (784 = 28*28), then average pooling
        x = torch.matmul(x, x.permute(0,2,1)).view(-1,512*512)/784.0

        # signed sqrt
        x = torch.mul(torch.sign(x),torch.sqrt(torch.abs(x)+1e-12)) 

        # L2 normalization
        x = F.normalize(x, p=2, dim=1)

        # final FC layer
        x = self.classifier(x)

        return x

I am fairly sure the rest of the code is correct, because I only changed the network structure in a VGG16 fine-tuning script. Does anyone familiar with PyTorch see a problem with the code above? Does it actually implement the intended operations?
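
One way to check the forward pass in isolation is to compare it against a framework-free reference on a tiny input. The following is only a minimal sketch of the four operations named above (outer product, average pooling, signed square root, L2 normalization) for a single example. The function name `bilinear_pool_reference` and the list-of-lists input layout are assumptions made for illustration, and the `1e-12` term inside the square root is left out because no gradients are involved here.

```python
import math

def bilinear_pool_reference(features):
    """Reference bilinear pooling for one example.

    features: list of C channels, each a list of N spatial values
    (hypothetical layout, matching a [C, H*W] feature map).
    Returns a flat list of length C*C.
    """
    num_channels = len(features)
    num_positions = len(features[0])

    pooled = []
    for i in range(num_channels):
        for j in range(num_channels):
            # Entry (i, j) of the sum of per-position outer products,
            # divided by the number of positions (average pooling).
            total = sum(features[i][n] * features[j][n]
                        for n in range(num_positions))
            pooled.append(total / num_positions)

    # Signed square root: sign(v) * sqrt(|v|)
    rooted = [math.copysign(math.sqrt(abs(v)), v) for v in pooled]

    # L2 normalization, guarding against a zero norm
    norm = math.sqrt(sum(v * v for v in rooted))
    return [v / max(norm, 1e-12) for v in rooted]
```

Feeding the same small feature map through the `matmul`-based forward pass (before the final FC layer) and through this function should give matching values up to floating-point error. If they match, the drop in accuracy more likely comes from the training setup than from the pooling code.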

theFool32 commented 6 years ago

I ran into a similar result. Have you found a solution? Thanks.