yangsenius / TransPose

PyTorch Implementation for "TransPose: Keypoint localization via Transformer", ICCV 2021.
https://github.com/yangsenius/TransPose/releases/download/paper/transpose.pdf
MIT License
353 stars, 56 forks

Fine-tuning model doesn't learn while training #32

Closed: mukeshnarendran7 closed this 2 years ago

mukeshnarendran7 commented 2 years ago

I am using these parameters from the main repository, but the fine-tuned model doesn't seem to learn from the new data.

criterion = torch.nn.MSELoss(reduction="mean")
pretrain_part = [param for name, param in model.named_parameters() if 'final_layer' not in name]
optimizer = torch.optim.Adam([{'params': pretrain_part, 'lr': 1e-5},
                              {'params': model.final_layer.parameters(), 'lr': 1e-4}])

model loading

#Load pre-trained model and fine-tune
model = torch.hub.load('yangsenius/TransPose:main', 
                       'tpr_a4_256x192',
                       pretrained=True)
for param in model.parameters():
    param.requires_grad = False

model.final_layer =  torch.nn.Sequential(torch.nn.Conv2d(256, 18,1))                                    
model = model.to(device)

#Get model summary
summary(model,
          input_size=(3, 256, 192),
          batch_size=1
          )

Loss:
Epoch:0, time 22.861s, loss 0.49620020389556885
Epoch:1, time 23.730s, loss 0.28467278741300106
Epoch:2, time 23.186s, loss 0.19562865188345313
Epoch:3, time 23.089s, loss 0.15849934425204992
Epoch:4, time 23.006s, loss 0.1431811167858541
Epoch:5, time 22.985s, loss 0.1366584706120193
Epoch:6, time 23.165s, loss 0.13364548794925213
Epoch:7, time 23.167s, loss 0.13212952483445406
Epoch:8, time 22.938s, loss 0.13107420597225428
Epoch:9, time 22.993s, loss 0.13025714457035065
Epoch:10, time 22.988s, loss 0.12950784899294376
Epoch:11, time 22.960s, loss 0.12873529829084873
Epoch:12, time 22.971s, loss 0.12812337931245565
Epoch:13, time 23.206s, loss 0.12742456002160907
Epoch:14, time 23.180s, loss 0.12706445762887597

yangsenius commented 2 years ago
for param in model.parameters():
    param.requires_grad = False

Why set requires_grad = False? This will freeze all parameters.
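
The effect of that loop can be checked on a small stand-in network. The following is a minimal sketch, not the TransPose model: the toy torch.nn.Sequential network and the helper count_trainable are hypothetical, and the last layer only plays the role of final_layer. It shows that freezing every parameter leaves nothing to train, and that a head created afterwards is the only trainable part, since new modules default to requires_grad=True.

```python
import torch

# Hypothetical stand-in network; the last layer plays the role of final_layer.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, kernel_size=3, padding=1),
    torch.nn.ReLU(),
    torch.nn.Conv2d(8, 17, kernel_size=1),
)


def count_trainable(module):
    """Number of scalar parameters that will receive gradients."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)


# Freeze everything, as in the snippet from the issue.
for param in model.parameters():
    param.requires_grad = False
frozen_trainable = count_trainable(model)

# Replace the head after freezing: its parameters are created fresh,
# so they are trainable while the rest of the network stays frozen.
model[2] = torch.nn.Conv2d(8, 18, kernel_size=1)
head_only_trainable = count_trainable(model)

print(frozen_trainable, head_only_trainable)
```

With this setup only the new 1x1 convolution (8 * 18 weights plus 18 biases) can change during training, whatever learning rate is assigned to the other parameters.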

mukeshnarendran7 commented 2 years ago

I was trying to train only the final layer and use the rest of the network as a fixed feature extractor, and this was causing the problem. After removing the freezing (for param in model.parameters(): param.requires_grad = True), the result is still the same with the new dataset.

When I train for many epochs, the network's loss goes down, but the predictions are still off, even on the training set. I have trained for more than 100 epochs and it is still the same.
Epoch:0, loss 0.7756728883832693, time taken: 14.116s
Epoch:1, loss 0.7343646492809057, time taken: 13.354s
Epoch:2, loss 0.703302716370672, time taken: 13.306s
Epoch:3, loss 0.6708760429173708, time taken: 13.363s
Epoch:4, loss 0.6438880995847285, time taken: 13.364s
Epoch:5, loss 0.6167034558020532, time taken: 13.387s
Epoch:6, loss 0.5937230880372226, time taken: 13.265s
Epoch:7, loss 0.569108291529119, time taken: 13.304s
Epoch:8, loss 0.5474147114437073, time taken: 13.370s
Epoch:9, loss 0.5261419415473938, time taken: 13.291s
Epoch:10, loss 0.5088018793612719, time taken: 13.319s
Epoch:11, loss 0.49403429706580937, time taken: 13.259s
Epoch:12, loss 0.47912345826625824, time taken: 13.361s
Epoch:13, loss 0.46565791056491435, time taken: 13.328s
Epoch:14, loss 0.4500332986935973, time taken: 13.257s
Epoch:15, loss 0.439005748834461, time taken: 13.261s
Epoch:16, loss 0.4257299837190658, time taken: 13.302s

Thanks

yangsenius commented 2 years ago
  1. Nope. I made a very simple modification: I just replaced the final layer, 1x1 Conv(d, 17), with 1x1 Conv(d, 16). The previous layers are not frozen, but the pretrained layers use a fixed learning rate of 1e-5.
  2. 100 epochs.
  3. Just like this: https://github.com/yangsenius/TransPose/issues/31#issuecomment-1041048153. I made very simple changes.
  4. Make sure that you have correctly loaded the COCO pretrained weights for the pretrained parts (excluding the final layer) and that the learning rate settings are correct.
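
The recipe in the list above (new final layer, nothing frozen, a small fixed learning rate for the pretrained layers) can be sketched as follows. This is an illustration under stated assumptions rather than the maintainer's training script: TinyPoseNet is a hypothetical stand-in so the example runs without downloading weights, the 1e-4 learning rate for the new head is taken from the issue author's snippet and not confirmed by the maintainer, and the tensors are random placeholders.

```python
import torch


class TinyPoseNet(torch.nn.Module):
    """Hypothetical stand-in for a pretrained pose network with a final_layer head."""

    def __init__(self, num_joints=17):
        super().__init__()
        self.backbone = torch.nn.Conv2d(3, 256, kernel_size=3, padding=1)
        self.final_layer = torch.nn.Conv2d(256, num_joints, kernel_size=1)

    def forward(self, x):
        return self.final_layer(torch.relu(self.backbone(x)))


model = TinyPoseNet()

# Swap the 17-channel head for a 16-channel one; no layer is frozen.
model.final_layer = torch.nn.Conv2d(256, 16, kernel_size=1)

# Two parameter groups: pretrained layers at 1e-5, new head at 1e-4 (assumed).
pretrained_params = [p for name, p in model.named_parameters()
                     if not name.startswith("final_layer")]
head_params = list(model.final_layer.parameters())
optimizer = torch.optim.Adam([
    {"params": pretrained_params, "lr": 1e-5},
    {"params": head_params, "lr": 1e-4},
])
criterion = torch.nn.MSELoss(reduction="mean")

# One training step on random placeholder data.
images = torch.randn(2, 3, 16, 12)
targets = torch.zeros(2, 16, 16, 12)
backbone_before = model.backbone.weight.detach().clone()

output = model(images)
loss = criterion(output, targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# The backbone is updated too, because nothing was frozen.
backbone_changed = not torch.equal(backbone_before, model.backbone.weight)
print(backbone_changed)
```

Building the parameter groups after replacing final_layer matters here: the optimizer only updates the tensors it was given, so a head swapped in later would not be trained.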
mukeshnarendran7 commented 2 years ago

Hey, thanks for getting back.

100% 22.9M/22.9M [00:00<00:00, 84.4MB/s]

Successfully loaded model (on cpu) with pretrained weights

mukeshnarendran7 commented 2 years ago

Just to share what the outcomes look like when I train the model on my data. Any idea why this might be? image

Some of my target heatmaps look like this, within the frame, but the predictions contain very small non-zero values, whereas the minimum of my heatmaps should be zero. image

image

yangsenius commented 2 years ago

Are the positions of the GT keypoints right? From the target heatmap, it seems they are not matched with the input image.

mukeshnarendran7 commented 2 years ago

Hey, the above images are not related to each other, but in this one I have plotted the keypoints over the image and they match. They match in all of my images. image

I tried with another dataset and you can see after training for 100 epochs it's not able to generalize to my data. image image

I also tried loading the model with the pre-trained weights and running it directly on an image from the internet, and you can see that it is not able to predict the keypoints.

#Load pre-trained model and fine-tune
model_tp = torch.hub.load('yangsenius/TransPose:main', 
                       'tpr_a4_256x192',
                       pretrained=True)
for name, param in model_tp.named_parameters():
    param.requires_grad = True

model = model_tp
#training params
criterion = torch.nn.MSELoss(reduction="mean")
pretrain_part = [param for name, param in model.named_parameters() if 'final_layer' not in name]
optimizer = torch.optim.Adam([{'params': pretrain_part, 'lr': 1e-5},
                              {'params': model.final_layer.parameters(), 'lr': 1e-4}])

#pre-trained mean/std
mean=[0.485, 0.456, 0.406]
std=[0.229, 0.224, 0.225]

trfm = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize(mean=mean,
                            std=std)])
model.eval()
with torch.no_grad():

    image = cv2.imread("/content/RN.jpg",1) #image resize
    image = cv2.resize(image, (256, 192))
    temp  = image #for plotting purposes
    image = trfm(image) # normalize, to tensor
    image = image[None].float()
    output = model(image)
    #Get keypoints (used function from yangsenius github repo)
    preds, maxvals = get_max_preds(output.clone().cpu().numpy())

    # im = image.squeeze(0).permute(1,2,0), (64, 48)
    x = preds[0][:,0] *(256/64)
    y = preds[0][:,1] *(192/48)
    plt.figure(figsize=(15, 8))
    plt.imshow(temp)
    plt.scatter(x, y, c="r")
    plt.show()

Predictions: image image
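
The decoding step in the snippet above (take the arg-max of each heatmap, then scale the location to image pixels) can be sketched in plain NumPy. This is an illustrative stand-in, not the repository's get_max_preds: decode_heatmaps is a hypothetical helper, and the 64x48 (height x width) heatmap for a 256x192 (height x width) input is an assumption based on the comment in the snippet.

```python
import numpy as np


def decode_heatmaps(heatmaps, image_height, image_width):
    """Arg-max decoding of heatmaps shaped (N, K, Hh, Wh).

    Returns (x, y) coordinates in image pixels, shaped (N, K, 2),
    and the peak value of each heatmap, shaped (N, K).
    """
    n, k, hm_height, hm_width = heatmaps.shape
    flat = heatmaps.reshape(n, k, -1)
    idx = flat.argmax(axis=2)
    maxvals = flat.max(axis=2)

    # A flat index unravels into a column (x) and a row (y).
    xs = idx % hm_width
    ys = idx // hm_width

    # x scales with the width ratio, y with the height ratio.
    coords = np.stack(
        [xs * (image_width / hm_width), ys * (image_height / hm_height)],
        axis=2,
    ).astype(np.float32)
    return coords, maxvals


# Synthetic heatmaps with known peaks (row, column).
heatmaps = np.zeros((1, 2, 64, 48), dtype=np.float32)
heatmaps[0, 0, 10, 5] = 1.0
heatmaps[0, 1, 63, 47] = 0.5

coords, maxvals = decode_heatmaps(heatmaps, image_height=256, image_width=192)
print(coords[0], maxvals[0])
```

Keeping the height and width ratios separate makes the decoding independent of the input orientation, which helps when checking whether the image and the heatmap agree on which axis is which.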

Am I loading the model wrong? I referred to the PyTorch docs and my loading code seems okay. It would be helpful if you could write some demo code showing how to load the model from torch.hub for fine-tuning. Thanks

Here's mine

#Load pre-trained model and fine-tune
model_tp = torch.hub.load('yangsenius/TransPose:main', 
                       'tpr_a4_256x192',
                       pretrained=True)
for name,param in model_tp.named_parameters():
    param.requires_grad = True

model_tp.final_layer =  torch.nn.Sequential(torch.nn.Conv2d(256, 16, kernel_size=1))                                    
model = model_tp.to(device)

#Get model summary
summary(model,
          input_size=(3, 256, 192),
          batch_size=1
          )

For fine-tuning I referred to this tutorial: https://pytorch.org/tutorials/beginner/finetuning_torchvision_models_tutorial.html

mukeshnarendran7 commented 2 years ago

Hey, sorry to spam you so many times, but I have a good update.

#Load pre-trained model and fine-tune
model_tp = torch.hub.load('yangsenius/TransPose:main',
                          'tph_a4_256x192',
                          pretrained=True)
# for name,param in model_tp.named_parameters():
#     param.requires_grad = True

model = model_tp

#pre-trained mean/std
mean=[0.485, 0.456, 0.406]
std=[0.229, 0.224, 0.225]

trfm = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize(mean=mean,
                            std=std)])
model.eval()
with torch.no_grad():

    image = cv2.imread("/content/messi.jpg",1) #image resize
    image = cv2.resize(image, (256, 192))
    temp  = image #for plotting purposes
    image = trfm(image) # normalize, to tensor
    image = image[None].float()
    output = model(image.to('cpu'))
    #Get keypoints (used function from yangsenius github repo)
    preds, maxvals = get_max_preds(output.clone().cpu().numpy())

    # im = image.squeeze(0).permute(1,2,0), (64, 48)
    x = preds[0][:,0] *(256/64)
    y = preds[0][:,1] *(192/48)
    plt.figure(figsize=(15, 8))
    plt.imshow(temp)
    plt.scatter(x, y, c="r")
    plt.show()

Predictions from the other model you have uploaded: image

From my dataset: trained for only 30 epochs, and you can see it also does well on new images. image

I am starting to think the weights from the a3 might be off. Could you check on your side? Thanks

yangsenius commented 2 years ago

It seems that your input images have size (H=192, W=256), but the pretrained models were trained with size (H=256, W=192). It may be better to make the height of the image larger than the width.

mukeshnarendran7 commented 2 years ago

Hey, I tried that initially, but the model throws an error if I load (256, 192) input images.