open-mmlab / mmpose

OpenMMLab Pose Estimation Toolbox and Benchmark.
https://mmpose.readthedocs.io/en/latest/
Apache License 2.0

Using RLE loss train on custom data #2117

Open annnnt opened 1 year ago

annnnt commented 1 year ago

Hi, I am trying to train a model that generates keypoints for my own upper-body data, based on body-deeppose res50_coco_256x192_rle. During training I get good acc_pose values, but when evaluation runs (after 90 epochs) the resulting metrics are very poor:

2023-03-24 16:27:12,484 - mmpose - INFO - Epoch [90][2300/2588] lr: 1.000e-03, eta: 8:03:08, time: 0.090, data_time: 0.001, memory: 466, reg_loss: -194.6303, acc_pose: 0.9610, loss: -194.6303
2023-03-24 16:27:16,972 - mmpose - INFO - Epoch [90][2350/2588] lr: 1.000e-03, eta: 8:03:03, time: 0.090, data_time: 0.001, memory: 466, reg_loss: -204.8977, acc_pose: 0.9880, loss: -204.8977
2023-03-24 16:27:21,463 - mmpose - INFO - Epoch [90][2400/2588] lr: 1.000e-03, eta: 8:02:58, time: 0.090, data_time: 0.001, memory: 466, reg_loss: -211.2530, acc_pose: 0.9590, loss: -211.2530
2023-03-24 16:27:25,950 - mmpose - INFO - Epoch [90][2450/2588] lr: 1.000e-03, eta: 8:02:53, time: 0.090, data_time: 0.000, memory: 466, reg_loss: -212.4019, acc_pose: 0.9742, loss: -212.4019
2023-03-24 16:27:30,463 - mmpose - INFO - Epoch [90][2500/2588] lr: 1.000e-03, eta: 8:02:48, time: 0.090, data_time: 0.001, memory: 466, reg_loss: -215.1714, acc_pose: 0.9795, loss: -215.1714
2023-03-24 16:27:34,959 - mmpose - INFO - Epoch [90][2550/2588] lr: 1.000e-03, eta: 8:02:44, time: 0.090, data_time: 0.001, memory: 466, reg_loss: -189.7282, acc_pose: 0.9620, loss: -189.7282
2023-03-24 16:27:38,887 - mmpose - INFO - Saving checkpoint at 90 epochs
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 648/648, 21.7 task/s, elapsed: 30s, ETA: 0s
res_file D:/data/mmpose-master/mmpose-master/work_dirs/Temp/result_keypoints.json
Loading and preparing results...
DONE (t=0.07s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type keypoints
DONE (t=0.18s).
Accumulating evaluation results...
DONE (t=0.01s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.002
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.005
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.002
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.002
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.006
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.014
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.005
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.007

What could be the reason for this?
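(Editorial aside: the negative reg_loss values in the log above are not themselves a problem. RLE loss is a negative log-likelihood of a continuous density; when predictions are accurate and the predicted scale is small, the density exceeds 1 and the NLL goes negative. A minimal sketch with a Gaussian density, illustrating the sign behavior only, not MMPose's actual RLE implementation:)

```python
import math

# Negative log-likelihood of a 1-D Gaussian with predicted scale `sigma`.
# For a continuous density, NLL = -log p(x) can be negative whenever
# p(x) > 1, i.e. when the error is small relative to a confident sigma.
def gaussian_nll(error, sigma):
    return 0.5 * math.log(2 * math.pi * sigma ** 2) + error ** 2 / (2 * sigma ** 2)

print(gaussian_nll(0.001, 0.01))  # accurate, confident prediction -> negative
print(gaussian_nll(1.0, 1.0))     # large error -> positive
```

So a steadily more negative reg_loss alongside high acc_pose is consistent with training going well; the near-zero AP points at the evaluation side instead.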

Tau-J commented 1 year ago

Thanks for using MMPose. Since you mention that the acc_pose indicator during training is good, I suggest first visualizing single-image inference results to determine whether this is a bug in the evaluation script. Additionally, could you provide more information about your dataset, such as the size of the train and test sets and the application scenario?
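(Editorial aside: besides visualizing inference, a quick way to rule out an evaluation-script mismatch is to cross-check the dumped result_keypoints.json against the GT annotation file: the predictions must reference the same image ids and carry the same keypoint count as the GT categories. A minimal sketch; the file paths and helper name are hypothetical, and it assumes a COCO-style GT file and a COCO-style results list:)

```python
import json

# Hypothetical sanity check: compare a COCO-style results file against the
# GT annotation file. Returns predictions whose image_id is unknown to the
# GT, and predictions whose keypoint array length does not match the GT
# keypoint definition (3 values per keypoint: x, y, score/visibility).
def check_results(gt_path, res_path):
    with open(gt_path) as f:
        gt = json.load(f)
    with open(res_path) as f:
        res = json.load(f)
    gt_ids = {img['id'] for img in gt['images']}
    n_kpts = len(gt['categories'][0]['keypoints'])
    missing = [r for r in res if r['image_id'] not in gt_ids]
    bad_len = [r for r in res if len(r['keypoints']) != n_kpts * 3]
    return missing, bad_len
```

If either list is non-empty, the evaluator is scoring predictions against the wrong images or the wrong keypoint layout, which would produce near-zero AP regardless of model quality.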

annnnt commented 1 year ago

Thanks for your reply. I have tested the trained model on images, but the results are still poor. The dataset has about 3000 images in total (2400 training and 600 validation; the val set is the same as the test set). The background of my dataset is relatively simple and the images are highly similar. I wonder whether this has an impact on the trained model?
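(Editorial aside: with a custom upper-body keypoint set, a frequent cause of near-zero AP despite good training accuracy is that the COCO-style evaluator scores predictions with OKS, which depends on per-keypoint sigma constants; if the sigmas (and keypoint order/count) still describe the default 17-keypoint COCO body layout, the scores collapse. A minimal sketch of the OKS formula for intuition, not the pycocotools implementation:)

```python
import math

# Hedged sketch of Object Keypoint Similarity (OKS): a Gaussian score per
# visible keypoint, with per-keypoint tolerance `sigmas` scaled by object
# `area`. For a custom skeleton, `sigmas` must match the new keypoint
# count and order, otherwise evaluation is meaningless.
def oks(pred, gt, visible, area, sigmas):
    ks = [(2 * s) ** 2 for s in sigmas]  # per-keypoint variance terms
    terms = [
        math.exp(-((px - gx) ** 2 + (py - gy) ** 2) / (2 * area * k))
        for (px, py), (gx, gy), v, k in zip(pred, gt, visible, ks)
        if v > 0
    ]
    return sum(terms) / len(terms) if terms else 0.0

print(oks([(0, 0)], [(0, 0)], [2], 100.0, [0.05]))   # perfect match -> 1.0
print(oks([(10, 0)], [(0, 0)], [2], 100.0, [0.05]))  # large error -> near 0
```

In MMPose this means defining a dataset_info for the custom dataset with sigmas of the correct length, rather than reusing the COCO body defaults; a simple background and highly similar images mainly risk overfitting, but they would not by themselves drive AP to near zero while visualized predictions look reasonable.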