for param in model.parameters():
    param.requires_grad = False
Why are you setting requires_grad = False? This will freeze all the parameters.
I was trying to use only the final layer as a feature extractor, and this was causing the problem. It is still the same with the new dataset now, after removing the freezing (for param in model.parameters(): param.requires_grad = True).
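For reference, the usual feature-extractor recipe is to freeze everything and then re-enable gradients only on the head; a minimal sketch, assuming a model that exposes a final_layer head as TransPose does:

import torch

for param in model.parameters():
    param.requires_grad = False            # freeze the whole network

for param in model.final_layer.parameters():
    param.requires_grad = True             # train only the head

# pass only the trainable parameters to the optimizer
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)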
When I train for many epochs the network's loss goes down, but the predictions are off even on the training set. I have trained for more than 100 epochs and it is still the same:

Epoch:0, loss0.7756728883832693, time taken:14.116s
Epoch:1, loss0.7343646492809057, time taken:13.354s
Epoch:2, loss0.703302716370672, time taken:13.306s
Epoch:3, loss0.6708760429173708, time taken:13.363s
Epoch:4, loss0.6438880995847285, time taken:13.364s
Epoch:5, loss0.6167034558020532, time taken:13.387s
Epoch:6, loss0.5937230880372226, time taken:13.265s
Epoch:7, loss0.569108291529119, time taken:13.304s
Epoch:8, loss0.5474147114437073, time taken:13.370s
Epoch:9, loss0.5261419415473938, time taken:13.291s
Epoch:10, loss0.5088018793612719, time taken:13.319s
Epoch:11, loss0.49403429706580937, time taken:13.259s
Epoch:12, loss0.47912345826625824, time taken:13.361s
Epoch:13, loss0.46565791056491435, time taken:13.328s
Epoch:14, loss0.4500332986935973, time taken:13.257s
Epoch:15, loss0.439005748834461, time taken:13.261s
Epoch:16, loss0.4257299837190658, time taken:13.302s
Thanks
Replace the final 1x1 Conv(d, 17) with a 1x1 Conv(d, 16). No freezing for the previous layers, but use a fixed 1e-5 learning rate for the pretrained layers.
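A minimal sketch of this suggestion, assuming the torch.hub entry point used later in the thread and that final_layer is a plain Conv2d as in the TransPose-R code (the 1e-4 head learning rate is just an example):

import torch

# swap the 17-keypoint COCO head for a 16-keypoint one
model = torch.hub.load('yangsenius/TransPose:main',
                       'tpr_a4_256x192', pretrained=True)
d = model.final_layer.in_channels          # head input width (256 for TransPose-R)
model.final_layer = torch.nn.Conv2d(d, 16, kernel_size=1)

# no freezing: small fixed LR for pretrained layers, larger LR for the new head
pretrained = [p for n, p in model.named_parameters() if 'final_layer' not in n]
optimizer = torch.optim.Adam([
    {'params': pretrained, 'lr': 1e-5},
    {'params': model.final_layer.parameters(), 'lr': 1e-4},
])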
Hey, thanks for getting back.
Load pretrained weights from url: https://github.com/yangsenius/TransPose/releases/download/Hub/tp_r_256x192_enc4_d256_h1024_mh8.pth
100% 22.9M/22.9M [00:00<00:00, 84.4MB/s]
Successfully loaded model (on cpu) with pretrained weights
Just to share with you how the model learns when I train on my data, this is what the outcomes look like. Any idea why it might be?
Some of my target heatmaps are like this within the frame, but I get these very small values, whereas the minimum of the heatmaps should be zero.
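For context, target heatmaps for keypoint regression are usually rendered as Gaussians with peak 1 whose far tail is clamped to exactly zero; a generic numpy sketch, not the repo's actual generator:

import numpy as np

def gaussian_target(h, w, cx, cy, sigma=2.0):
    """Heatmap of shape (h, w) with peak 1.0 at (cx, cy), zero far away."""
    ys, xs = np.mgrid[0:h, 0:w]
    g = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
    g[g < 1e-8] = 0.0   # clamp the tail so the minimum is exactly zero
    return g

target = gaussian_target(64, 48, cx=20, cy=30)   # 64x48 matches the model's heatmap size
print(target.min(), target.max())                # 0.0 1.0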
Are the positions of the GT keypoints right? From the target heatmap, it seems they do not match the input image.
Hey, the above images are not related, but in this one you can see that I have plotted the keypoints over the image and they match; they match in all of the images.
I tried with another dataset, and you can see that after training for 100 epochs it is not able to generalize to my data.
I tried loading the model directly with the pre-trained weights to plot on an image from the internet, and you can see that it is not able to predict.
#Load pre-trained model and fine-tune
import cv2
import torch
import matplotlib.pyplot as plt
from torchvision import transforms

model_tp = torch.hub.load('yangsenius/TransPose:main',
                          'tpr_a4_256x192',
                          pretrained=True)
for name, param in model_tp.named_parameters():
    param.requires_grad = True
model = model_tp

#training params
criterion = torch.nn.MSELoss(reduction="mean")
pretrain_part = [param for name, param in model.named_parameters()
                 if 'final_layer' not in name]
optimizer = torch.optim.Adam([{'params': pretrain_part, 'lr': 1e-5},
                              {'params': model.final_layer.parameters(), 'lr': 1e-4}])

#pre-trained mean/std (ImageNet)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]
trfm = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=mean, std=std)])

model.eval()
with torch.no_grad():
    image = cv2.imread("/content/RN.jpg", 1)   # note: OpenCV loads BGR; the stats above are RGB
    image = cv2.resize(image, (256, 192))      # cv2.resize takes (width, height)
    temp = image                               # keep a copy for plotting
    image = trfm(image)                        # normalize, to tensor
    image = image[None].float()                # add batch dimension
    output = model(image)

#Get keypoints (get_max_preds comes from the yangsenius TransPose repo)
preds, maxvals = get_max_preds(output.clone().cpu().numpy())

# rescale from the 64x48 heatmap to the input size
x = preds[0][:, 0] * (256 / 64)
y = preds[0][:, 1] * (192 / 48)

plt.figure(figsize=(15, 8))
plt.imshow(temp)
plt.scatter(x, y, c="r")
plt.show()
Predictions:
Am I loading the model wrong? I referenced the PyTorch docs and it seems okay. If you could write demo code on how to load it from torch.hub() for pretraining, that would be helpful. Thanks
Here's mine
#Load pre-trained model and fine-tune
import torch
from torchsummary import summary   # assuming the torchsummary package

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model_tp = torch.hub.load('yangsenius/TransPose:main',
                          'tpr_a4_256x192',
                          pretrained=True)
for name, param in model_tp.named_parameters():
    param.requires_grad = True

# replace the 17-keypoint COCO head with a 16-keypoint one
model_tp.final_layer = torch.nn.Sequential(torch.nn.Conv2d(256, 16, kernel_size=1))
model = model_tp.to(device)

#Get model summary
summary(model,
        input_size=(3, 256, 192),
        batch_size=1)
For fine-tuning I referred to this tutorial: https://pytorch.org/tutorials/beginner/finetuning_torchvision_models_tutorial.html
Hey, sorry to spam you so many times, but I have a good update.
#Load pre-trained model and fine-tune
import cv2
import torch
import matplotlib.pyplot as plt
from torchvision import transforms

model_tp = torch.hub.load('yangsenius/TransPose:main',
                          'tph_a4_256x192',
                          pretrained=True)
# for name, param in model_tp.named_parameters():
#     param.requires_grad = True
model = model_tp

#pre-trained mean/std (ImageNet)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]
trfm = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=mean, std=std)])

model.eval()
with torch.no_grad():
    image = cv2.imread("/content/messi.jpg", 1)  # note: OpenCV loads BGR
    image = cv2.resize(image, (256, 192))        # cv2.resize takes (width, height)
    temp = image                                 # keep a copy for plotting
    image = trfm(image)                          # normalize, to tensor
    image = image[None].float()                  # add batch dimension
    output = model(image.to('cpu'))

#Get keypoints (get_max_preds comes from the yangsenius TransPose repo)
preds, maxvals = get_max_preds(output.clone().cpu().numpy())

# rescale from the 64x48 heatmap to the input size
x = preds[0][:, 0] * (256 / 64)
y = preds[0][:, 1] * (192 / 48)

plt.figure(figsize=(15, 8))
plt.imshow(temp)
plt.scatter(x, y, c="r")
plt.show()
Predictions from the other model you have uploaded:
From my dataset, trained for only 30 epochs; you can see it also does well on new images.
I am starting to think the weights for the a3 model might be off. Can you check from your side? Thanks
It seems that your input images are (H=192, W=256), but the pretrained models are trained with (H=256, W=192) inputs. It may be better to make the height of the image larger than the width.
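One possible source of the mismatch (a guess, not confirmed in the thread) is the cv2.resize argument order: it takes (width, height), so producing an (H=256, W=192) input requires:

import cv2

image = cv2.imread("/content/RN.jpg", 1)   # path reused from the snippets above
# cv2.resize expects (width, height): this produces shape (256, 192, 3) = (H, W, C)
image = cv2.resize(image, (192, 256))
print(image.shape)                          # (256, 192, 3)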
Hey, I tried that initially, but the model throws an error if I load (256, 192) input images. I am using the parameters found in the main repo, but the fine-tuned model doesn't seem to learn from the new data.
Model loading:
Loss:
Epoch:0, time22.861s, loss0.49620020389556885
Epoch:1, time23.730s, loss0.28467278741300106
Epoch:2, time23.186s, loss0.19562865188345313
Epoch:3, time23.089s, loss0.15849934425204992
Epoch:4, time23.006s, loss0.1431811167858541
Epoch:5, time22.985s, loss0.1366584706120193
Epoch:6, time23.165s, loss0.13364548794925213
Epoch:7, time23.167s, loss0.13212952483445406
Epoch:8, time22.938s, loss0.13107420597225428
Epoch:9, time22.993s, loss0.13025714457035065
Epoch:10, time22.988s, loss0.12950784899294376
Epoch:11, time22.960s, loss0.12873529829084873
Epoch:12, time22.971s, loss0.12812337931245565
Epoch:13, time23.206s, loss0.12742456002160907
Epoch:14, time23.180s, loss0.12706445762887597