yangsenius / TransPose

PyTorch Implementation for "TransPose: Keypoint localization via Transformer", ICCV 2021.
https://github.com/yangsenius/TransPose/releases/download/paper/transpose.pdf
MIT License

Pretrained model loss stuck around a point? How many epochs to train the model? #38

Open mukeshnarendran7 opened 2 years ago

mukeshnarendran7 commented 2 years ago

For how many epochs was the TransPose-H-A4 pretrained model fine-tuned on the MPII dataset to reach the benchmarks in the paper? I am using the following parameters, similar to the paper's, but on a dataset with 10K images:

import torch
model_tp = torch.hub.load('yangsenius/TransPose:main', 'tph_a4_256x192', pretrained=True)
model_tp.final_layer = torch.nn.Sequential(torch.nn.Conv2d(96, 18, kernel_size=1))

Load parameters

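model = model_tp.to(device)
# Pretrained parameters: everything except the newly added final layer
pretrain_part = [param for name, param in model.named_parameters() if 'final_layer' not in name]
# Lower learning rate for the pretrained part, higher for the new final layer
optimizer = torch.optim.Adam([
    {'params': pretrain_part, 'lr': 1e-5},
    {'params': model.final_layer.parameters(), 'lr': 1e-4},
])
criterion = torch.nn.MSELoss(reduction="mean")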

I am trying to fine-tune it, but the loss decreases only slowly and appears to level off. Any suggestions for improving this would be helpful. Thanks.

Training model
Epoch:0, loss2.804723664186895, time taken:539.878s
Epoch:1, loss2.263692114269361, time taken:542.564s
Epoch:2, loss1.8802592728752643, time taken:542.661s
Epoch:3, loss1.5531523590907454, time taken:543.041s
Epoch:4, loss1.3379272652091458, time taken:543.445s
Epoch:5, loss1.1180460024625063, time taken:538.449s
Epoch:6, loss0.9673018065514043, time taken:534.550s
Epoch:7, loss0.8572808737517335, time taken:538.618s
Epoch:8, loss0.7790990431094542, time taken:535.940s
Epoch:9, loss0.7243237162474543, time taken:536.291s
Epoch:10, loss0.6794152171351016, time taken:535.745s
Epoch:11, loss0.6420647234190255, time taken:532.800s
Epoch:12, loss0.6094503253116272, time taken:531.308s
Epoch:13, loss0.5824214839958586, time taken:530.418s
Epoch:14, loss0.5580684408778325, time taken:530.618s
Epoch:15, loss0.538073766452726, time taken:531.255s
Epoch:16, loss0.5198041790281422, time taken:531.875s
Epoch:17, loss0.5046796562382951, time taken:529.682s
Epoch:18, loss0.49001771898474544, time taken:529.585s
Epoch:19, loss0.4768067048571538, time taken:530.031s
Epoch:20, loss0.46674167667515576, time taken:534.574s
Epoch:21, loss0.45518148655537516, time taken:532.242s
Epoch:22, loss0.4449854488193523, time taken:532.336s
Epoch:23, loss0.4369037283177022, time taken:533.899s
Epoch:24, loss0.4278696861874778, time taken:532.454s
Epoch:25, loss0.4207416394201573, time taken:538.248s
Epoch:26, loss0.41212902366532944, time taken:541.508s
Epoch:27, loss0.4052599307906348, time taken:540.419s
Epoch:28, loss0.3998840279818978, time taken:541.615s
Epoch:29, loss0.3926734702545218, time taken:541.612s
Epoch:30, loss0.3866453653026838, time taken:541.235s
Epoch:31, loss0.38077057831105776, time taken:540.944s
Epoch:32, loss0.37572325009386986, time taken:540.582s
Epoch:33, loss0.3709150122012943, time taken:540.616s
Epoch:34, loss0.36646912069409154, time taken:540.807s
Epoch:35, loss0.3614582328009419, time taken:541.298s
Epoch:36, loss0.35673171386588365, time taken:537.836s
Epoch:37, loss0.3524343741883058, time taken:538.538s
Epoch:38, loss0.34845523245166987, time taken:539.272s
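One way to judge whether a log like this has really stalled is to look at the relative drop per epoch rather than the raw loss. The sketch below parses lines in the format shown above and reports the fractional improvement between consecutive epochs. The helper names `parse_losses` and `relative_drops` are hypothetical, and the sample uses only four entries copied from the log.

```python
import re

# Matches entries shaped like the log above:
# "Epoch:0, loss2.804723664186895, time taken:539.878s"
LOG_PATTERN = re.compile(r"Epoch:(\d+),\s*loss([0-9.]+),\s*time taken:([0-9.]+)s")


def parse_losses(log_text):
    """Return a list of (epoch, loss) tuples found in the log text."""
    return [(int(epoch), float(loss)) for epoch, loss, _ in LOG_PATTERN.findall(log_text)]


def relative_drops(losses):
    """Map epoch -> fractional loss drop versus the previous epoch.

    Only consecutive epochs are compared, so gaps in the log are skipped.
    """
    drops = {}
    for (prev_epoch, prev_loss), (epoch, loss) in zip(losses, losses[1:]):
        if epoch == prev_epoch + 1:
            drops[epoch] = (prev_loss - loss) / prev_loss
    return drops


# Four entries copied from the log above (first two and last two epochs).
sample = (
    "Epoch:0, loss2.804723664186895, time taken:539.878s "
    "Epoch:1, loss2.263692114269361, time taken:542.564s "
    "Epoch:37, loss0.3524343741883058, time taken:538.538s "
    "Epoch:38, loss0.34845523245166987, time taken:539.272s"
)

losses = parse_losses(sample)
drops = relative_drops(losses)
for epoch, drop in sorted(drops.items()):
    print(f"epoch {epoch}: loss dropped by {drop:.1%}")
```

On these entries the drop is roughly 19% at epoch 1 and roughly 1% at epoch 38, so the loss is still falling but far more slowly. Whether that counts as converged depends on validation accuracy, which the log does not show.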

gaobo25 commented 2 years ago

I have another problem. When I train, I get the following:

Test: [0/125] Time 0.854 (0.854) Loss 0.0014 (0.0014) Accuracy 0.000 (0.000)
Test: [100/125] Time 0.107 (0.124) Loss 0.0016 (0.0013) Accuracy 0.008 (0.008)

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.000

May I ask what the reason for this is?

adnantariq18 commented 2 years ago

I'm getting -1.000 for all of them. Can anyone tell me the reason?
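On the difference between 0.000 and -1.000: as far as I understand the COCO evaluation code (pycocotools `COCOeval`), the precision and recall arrays start out filled with -1, and the summary averages only the entries greater than -1. If no such entries exist, it reports -1. So -1.000 usually suggests that nothing was evaluated for that setting (for example, no ground-truth annotations matched the image IDs or category), while 0.000 suggests annotations were evaluated but no prediction matched them. Below is a simplified pure-Python sketch of that rule. `summarize` is a hypothetical stand-in, not the library function.

```python
def summarize(values):
    """Average the valid entries; -1 marks 'nothing was evaluated'.

    Simplified illustration of the averaging rule, not pycocotools code.
    """
    valid = [v for v in values if v > -1]
    if not valid:
        return -1.0
    return sum(valid) / len(valid)


# No ground truth was evaluated: every slot keeps its initial -1.
nothing_evaluated = [-1.0, -1.0, -1.0]

# Ground truth was evaluated, but no prediction matched it.
no_matches = [0.0, 0.0, 0.0]

print(f"{summarize(nothing_evaluated):.3f}")
print(f"{summarize(no_matches):.3f}")
```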

electroram commented 1 year ago

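Because you need to provide the person bounding boxes. This model only applies the Transformer for keypoint detection after the person bounding boxes have been marked out.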
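To make the point about bounding boxes concrete: in a top-down pipeline, each person box is typically expanded to the model's input aspect ratio and padded slightly before the crop is resized and passed to the keypoint network. The sketch below shows only that box adjustment. The 192x256 (width x height) size is read off the model name `tph_a4_256x192`. The 1.25 padding factor and the helper name `expand_box` are assumptions for illustration, not taken from this thread.

```python
def expand_box(x, y, w, h, target_w=192, target_h=256, padding=1.25):
    """Expand an (x, y, w, h) person box to the input aspect ratio, then pad it.

    Returns the crop rectangle as (x, y, w, h), centered on the original box.
    The padding factor is an assumed value for illustration.
    """
    aspect = target_w / target_h
    center_x = x + w / 2.0
    center_y = y + h / 2.0

    # Grow the shorter side so that width / height equals the target aspect ratio.
    if w > aspect * h:
        h = w / aspect
    else:
        w = h * aspect

    # Add some context around the person.
    w *= padding
    h *= padding
    return center_x - w / 2.0, center_y - h / 2.0, w, h


# Example: a tall person box detected at (100, 50) with size 80x200.
crop = expand_box(100, 50, 80, 200)
print(crop)  # (46.25, 25.0, 187.5, 250.0)
```

The resulting rectangle can extend past the image border, so a real pipeline would also need to pad or clamp it before cropping and resizing to 192x256.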