Fine-tuning model doesn't learn while training

mukeshnarendran7 commented 2 years ago

I am using these parameters found in the main repo, but the fine-tuned model doesn't seem to learn from new data.

criterion = torch.nn.MSELoss(reduction="mean")
pretrain_part = [param for name, param in model.named_parameters() if 'final_layer' not in name]
optimizer = torch.optim.Adam([{'params': pretrain_part, 'lr': 1e-5},
                              {'params': model.final_layer.parameters(), 'lr': 1e-4}])

model loading

#Load pre-trained model and fine-tune
model = torch.hub.load('yangsenius/TransPose:main', 
                       'tpr_a4_256x192',
                       pretrained=True)
for param in model.parameters():
    param.requires_grad = False

model.final_layer =  torch.nn.Sequential(torch.nn.Conv2d(256, 18,1))                                    
model = model.to(device)

#Get model summary
summary(model,
          input_size=(3, 256, 192),
          batch_size=1
          )

Loss:

Epoch:0, time22.861s, loss0.49620020389556885
Epoch:1, time23.730s, loss0.28467278741300106
Epoch:2, time23.186s, loss0.19562865188345313
Epoch:3, time23.089s, loss0.15849934425204992
Epoch:4, time23.006s, loss0.1431811167858541
Epoch:5, time22.985s, loss0.1366584706120193
Epoch:6, time23.165s, loss0.13364548794925213
Epoch:7, time23.167s, loss0.13212952483445406
Epoch:8, time22.938s, loss0.13107420597225428
Epoch:9, time22.993s, loss0.13025714457035065
Epoch:10, time22.988s, loss0.12950784899294376
Epoch:11, time22.960s, loss0.12873529829084873
Epoch:12, time22.971s, loss0.12812337931245565
Epoch:13, time23.206s, loss0.12742456002160907
Epoch:14, time23.180s, loss0.12706445762887597

yangsenius commented 2 years ago

for param in model.parameters():
    param.requires_grad = False

Why set requires_grad = False? This freezes all the parameters.
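A quick way to see what the freeze does is to count trainable and frozen parameters before and after the head is replaced. The sketch below is only an illustration: the two-layer `nn.Sequential` is a hypothetical stand-in for the pretrained network (it is not TransPose), and `count_parameters` is a helper name introduced here.

```python
import torch
from torch import nn

def count_parameters(model):
    """Return (trainable, frozen) parameter counts for a model."""
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    frozen = sum(p.numel() for p in model.parameters() if not p.requires_grad)
    return trainable, frozen

# Hypothetical stand-in for the pretrained network (assumption, not TransPose).
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),  # plays the role of the pretrained layers
    nn.Conv2d(8, 17, kernel_size=1),            # plays the role of final_layer
)

# Same loop as in the original post: every existing parameter is frozen.
for param in model.parameters():
    param.requires_grad = False
after_freeze = count_parameters(model)

# A freshly created layer has requires_grad=True by default,
# so only the new head can change during training.
model[1] = nn.Conv2d(8, 18, kernel_size=1)
after_new_head = count_parameters(model)

print("after freeze:  ", after_freeze)
print("after new head:", after_new_head)
```

With this setup the optimizer can still be given the frozen parameters, but they receive no gradients, so only the new head is updated.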

mukeshnarendran7 commented 2 years ago

I was trying to train only the final layer and use the rest of the network as a feature extractor, and this was causing the problem. With the new dataset it is still the same after removing the frozen-layers setting (for param in model.parameters(): param.requires_grad = True).

When I train for many epochs the network's loss goes down, but the predictions are off even on the training set. I have trained for more than 100 epochs and it is still the same:

Epoch:0, loss0.7756728883832693, time taken:14.116s
Epoch:1, loss0.7343646492809057, time taken:13.354s
Epoch:2, loss0.703302716370672, time taken:13.306s
Epoch:3, loss0.6708760429173708, time taken:13.363s
Epoch:4, loss0.6438880995847285, time taken:13.364s
Epoch:5, loss0.6167034558020532, time taken:13.387s
Epoch:6, loss0.5937230880372226, time taken:13.265s
Epoch:7, loss0.569108291529119, time taken:13.304s
Epoch:8, loss0.5474147114437073, time taken:13.370s
Epoch:9, loss0.5261419415473938, time taken:13.291s
Epoch:10, loss0.5088018793612719, time taken:13.319s
Epoch:11, loss0.49403429706580937, time taken:13.259s
Epoch:12, loss0.47912345826625824, time taken:13.361s
Epoch:13, loss0.46565791056491435, time taken:13.328s
Epoch:14, loss0.4500332986935973, time taken:13.257s
Epoch:15, loss0.439005748834461, time taken:13.261s
Epoch:16, loss0.4257299837190658, time taken:13.302s

1. When you use the pre-trained model with the MPII dataset, are you adding a few extra layers, freezing some of the previous layers, and then training?
2. For how many epochs do you fine-tune your model on the MPII dataset?
3. Is there a good reference for the list of parameters I should choose when fine-tuning the model? Even a reference for preparing the pre-trained model would be helpful.
4. Is there anything else to consider while fine-tuning the model?

Thanks

yangsenius commented 2 years ago

1. Nope. I made a very simple modification: I just replaced the final 1x1 Conv(d, 17) layer with a 1x1 Conv(d, 16). No freezing of the previous layers, but a fixed learning rate of 1e-5 for the pretrained layers.
2. 100 epochs.
3. Just like this: https://github.com/yangsenius/TransPose/issues/31#issuecomment-1041048153. I made very simple changes.
4. Make sure that you have correctly loaded the COCO pretrained weights for the pretrained parts (excluding the final layer) AND that the learning rate settings are correct.
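The recipe described above (new 16-channel head, nothing frozen, a small fixed learning rate for the pretrained layers) might be sketched as follows. `TinyPoseNet` is a hypothetical stand-in so the snippet runs without downloading weights; in practice the model would come from `torch.hub.load`. The 1e-4 rate for the new head is taken from the code posted earlier in this thread, not from the author's reply.

```python
import torch
from torch import nn

class TinyPoseNet(nn.Module):
    """Hypothetical stand-in for the pretrained model: a backbone plus a 1x1 conv head."""

    def __init__(self, d=256, num_joints=17):
        super().__init__()
        self.backbone = nn.Conv2d(3, d, kernel_size=3, padding=1)
        self.final_layer = nn.Conv2d(d, num_joints, kernel_size=1)

    def forward(self, x):
        return self.final_layer(self.backbone(x))

model = TinyPoseNet()  # assumption: replaced by the torch.hub model in real use

# Swap the 17-joint COCO head for a 16-joint head; no parameter is frozen.
model.final_layer = nn.Conv2d(256, 16, kernel_size=1)

# Everything except the new head counts as "pretrained".
pretrained_params = [p for name, p in model.named_parameters()
                     if 'final_layer' not in name]

# Two parameter groups, each with its own fixed learning rate.
optimizer = torch.optim.Adam([
    {'params': pretrained_params, 'lr': 1e-5},
    {'params': model.final_layer.parameters(), 'lr': 1e-4},  # head rate: from the earlier post
])

group_lrs = [group['lr'] for group in optimizer.param_groups]
all_trainable = all(p.requires_grad for p in model.parameters())
out = model(torch.zeros(1, 3, 64, 48))  # dummy batch just to check the output channels
```

Printing `group_lrs` and `out.shape` before training is a cheap check that the head was replaced and that both groups reached the optimizer.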

mukeshnarendran7 commented 2 years ago

Hey, thanks for getting back.

Load pretrained weights from url: https://github.com/yangsenius/TransPose/releases/download/Hub/tp_r_256x192_enc4_d256_h1024_mh8.pth

100% 22.9M/22.9M [00:00<00:00, 84.4MB/s]

Successfully loaded model (on cpu) with pretrained weights

Regarding point 4: I believe that if I download the model from torch hub, it should load with the pre-trained COCO weights automatically. Or does something else need to be done?

mukeshnarendran7 commented 2 years ago

Just to share what the outcomes look like when I train on my data. Any idea why this might be?

Some of my target heatmaps look like this, within the frame, but I get these very small values, whereas the minimum of my heatmaps should be zero.

yangsenius commented 2 years ago

Are the positions of the GT keypoints right? From the target heatmap, it seems they are not matched with the input image.
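One way to check this is to rescale the keypoints with the same factors used to resize the image and confirm they still fall inside the frame. The sketch below is a minimal pure-Python illustration. The helper names and the 480x640 example image are hypothetical, and the (64, 48) heatmap size is the one quoted later in this thread for a (256, 192) input.

```python
def scale_keypoints(keypoints, src_hw, dst_hw):
    """Rescale (x, y) keypoints from an image of size src_hw to dst_hw, both given as (H, W)."""
    src_h, src_w = src_hw
    dst_h, dst_w = dst_hw
    sx, sy = dst_w / src_w, dst_h / src_h
    return [(x * sx, y * sy) for x, y in keypoints]

def out_of_frame(keypoints, hw):
    """Indices of keypoints that fall outside an (H, W) frame."""
    h, w = hw
    return [i for i, (x, y) in enumerate(keypoints)
            if not (0 <= x < w and 0 <= y < h)]

# Hypothetical example: a 480x640 (H x W) photo with two annotated joints.
kps = [(320.0, 240.0), (600.0, 60.0)]

on_input = scale_keypoints(kps, (480, 640), (256, 192))       # model input, H=256, W=192
on_heatmap = scale_keypoints(on_input, (256, 192), (64, 48))  # heatmap, H=64, W=48

# Keypoints left in original-image coordinates land outside the resized frame.
unscaled_problems = out_of_frame(kps, (256, 192))
scaled_problems = out_of_frame(on_heatmap, (64, 48))
```

If the image is resized but the annotations are not, the target heatmaps are built around the wrong pixels, which would look like a mismatch between targets and input.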

mukeshnarendran7 commented 2 years ago

Hey, the above images are not related to each other, but in this one I have plotted the keypoints over the image and they match. They match for all of the images.

I tried with another dataset and you can see after training for 100 epochs it's not able to generalize to my data.

I also loaded the model directly with the pre-trained weights and ran it on an image from the internet, and you can see that it is not able to predict.

#Load pre-trained model and fine-tune
model_tp = torch.hub.load('yangsenius/TransPose:main', 
                       'tpr_a4_256x192',
                       pretrained=True)
for name,param in model_tp.named_parameters():
    param.requires_grad = True

model = model_tp
#training params
criterion = torch.nn.MSELoss(reduction="mean")
pretrain_part = [param for name, param in model.named_parameters() if 'final_layer' not in name]
optimizer = torch.optim.Adam([{'params': pretrain_part, 'lr': 1e-5},
                              {'params': model.final_layer.parameters(), 'lr': 1e-4}])

#pre-trained mean/std
mean=[0.485, 0.456, 0.406]
std=[0.229, 0.224, 0.225]

trfm = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize(mean=mean,
                            std=std)])
model.eval()
with torch.no_grad():

    image = cv2.imread("/content/RN.jpg",1)
    image = cv2.resize(image, (256, 192)) #image resize
    temp  = image #for plotting purposes
    image = trfm(image) # normalize, to tensor
    image = image[None].float()
    output = model(image)
    #Get keypoints (used function from yangsenius github repo)
    preds, maxvals = get_max_preds(output.clone().cpu().numpy())

    # im = image.squeeze(0).permute(1,2,0), (64, 48)
    x = preds[0][:,0] *(256/64)
    y = preds[0][:,1] *(192/48)
    plt.figure(figsize=(15, 8))
    plt.imshow(temp)
    plt.scatter(x, y, c="r")
    plt.show()

Predictions:

Am I loading the model wrong? I checked against the PyTorch docs and it seems okay. A demo of how to load it from torch.hub for fine-tuning would be helpful. Thanks.

Here's mine

#Load pre-trained model and fine-tune
model_tp = torch.hub.load('yangsenius/TransPose:main', 
                       'tpr_a4_256x192',
                       pretrained=True)
for name,param in model_tp.named_parameters():
    param.requires_grad = True

model_tp.final_layer =  torch.nn.Sequential(torch.nn.Conv2d(256, 16, kernel_size=1))                                    
model = model_tp.to(device)

#Get model summary
summary(model,
          input_size=(3, 256, 192),
          batch_size=1
          )

For fine-tuning I referred to this tutorial: https://pytorch.org/tutorials/beginner/finetuning_torchvision_models_tutorial.html

mukeshnarendran7 commented 2 years ago

Hey, sorry to spam you so many times, but I have a good update.

#Load pre-trained model and fine-tune
model_tp = torch.hub.load('yangsenius/TransPose:main',
                          'tph_a4_256x192',
                          pretrained=True)
# for name,param in model_tp.named_parameters():
#     param.requires_grad = True

model = model_tp

#pre-trained mean/std
mean=[0.485, 0.456, 0.406]
std=[0.229, 0.224, 0.225]

trfm = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize(mean=mean,
                            std=std)])
model.eval()
with torch.no_grad():

    image = cv2.imread("/content/messi.jpg",1)
    image = cv2.resize(image, (256, 192)) #image resize
    temp  = image #for plotting purposes
    image = trfm(image) # normalize, to tensor
    image = image[None].float()
    output = model(image.to('cpu'))
    #Get keypoints (used function from yangsenius github repo)
    preds, maxvals = get_max_preds(output.clone().cpu().numpy())

    # im = image.squeeze(0).permute(1,2,0), (64, 48)
    x = preds[0][:,0] *(256/64)
    y = preds[0][:,1] *(192/48)
    plt.figure(figsize=(15, 8))
    plt.imshow(temp)
    plt.scatter(x, y, c="r")
    plt.show()

Predictions from the other model you have uploaded:

From my dataset: trained for only 30 epochs, and you can see it also does well on new images.

I am starting to think the weights from the a3 might be off. Can you check from your side? Thanks.

yangsenius commented 2 years ago

It seems that your input images are of size (H=192, W=256), but the pretrained models were trained with size (H=256, W=192). It may be better to make the height of the image larger than the width.
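A likely source of the swapped size is that `cv2.resize` takes its target size as (width, height), the reverse of the (H, W) order used for array shapes. The helpers below are a hypothetical pure-Python sketch of that convention. They do not call OpenCV; they only mirror the argument order.

```python
MODEL_INPUT_HW = (256, 192)  # (H, W) the pretrained models were trained with, per the comment above

def cv2_dsize(hw):
    """Convert an array-style (H, W) size into the (width, height) order cv2.resize expects."""
    height, width = hw
    return (width, height)

def shape_after_resize(dsize):
    """(H, W) of the image that cv2.resize returns for a given dsize=(width, height)."""
    width, height = dsize
    return (height, width)

def check_input_hw(hw, expected=MODEL_INPUT_HW):
    """Raise early if an image's (H, W) does not match what the model was trained on."""
    if tuple(hw) != tuple(expected):
        raise ValueError(f"expected (H, W)={expected}, got {tuple(hw)}")

# The call used earlier in the thread, cv2.resize(image, (256, 192)),
# produces a landscape image with H=192 and W=256.
thread_hw = shape_after_resize((256, 192))

# Passing the size in (width, height) order gives the portrait layout instead.
fixed_hw = shape_after_resize(cv2_dsize(MODEL_INPUT_HW))
check_input_hw(fixed_hw)  # passes silently
```

In the inference snippet this would correspond to `cv2.resize(image, (192, 256))`. Separately, `cv2.imread` returns channels in BGR order, so converting with `cv2.cvtColor(image, cv2.COLOR_BGR2RGB)` before normalizing with RGB-ordered mean and std is probably also worth trying.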

mukeshnarendran7 commented 2 years ago

Hey, I tried that initially, but the model throws an error if I load (256, 192) input images.

yangsenius / TransPose

Fine-tuning model doesn't learn while training #32