mit-han-lab / litepose

[CVPR'22] Lite Pose: Efficient Architecture Design for 2D Human Pose Estimation
https://hanlab.mit.edu
MIT License

Problem with model training #40

Open nightingale233 opened 4 months ago

nightingale233 commented 4 months ago

I think the idea behind your paper is very good, and I wanted to try training the model myself. I downloaded the LitePose source code and used the "Normal Training" method to train a MobileNet-based Search-S model on the CrowdPose dataset for 500 epochs. However, I only got 46.6 AP instead of 58.3 AP on the CrowdPose evaluation (as Fig 1 shows).

[Fig 1: CrowdPose evaluation results for the trained Search-S model]

I changed INPUT_SIZE from 256 to 448 and OUTPUT_SIZE from [64,128] to [112,224], as your NOTE said. To make training faster, I also changed IMAGES_PER_GPU from 16 to 32 and GPU_WORKERS from 4 to 16. Those are the only changes. I also trained a MobileNet-based Search-XS model and got 46.4 AP instead of 49.3 AP on the CrowdPose evaluation (as Fig 2 shows).
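The size change keeps the same ratios as the defaults: both output resolutions stay at 1/4 and 1/2 of the input size. A minimal Python sketch of that check follows. `output_sizes_for` is a hypothetical helper, and the strides of 4 and 2 are inferred from the numbers quoted above rather than taken from the LitePose code.

```python
def output_sizes_for(input_size, strides=(4, 2)):
    """Return heatmap sizes for an input size, assuming fixed downsampling strides.

    The strides (4, 2) are an assumption inferred from 256 -> [64, 128].
    """
    sizes = []
    for stride in strides:
        if input_size % stride != 0:
            raise ValueError(f"input size {input_size} is not divisible by {stride}")
        sizes.append(input_size // stride)
    return sizes


default_sizes = output_sizes_for(256)
crowdpose_sizes = output_sizes_for(448)
print(default_sizes)    # [64, 128]
print(crowdpose_sizes)  # [112, 224]
```

Under these assumptions, the edited values are consistent with each other, so the size settings alone are unlikely to explain the AP gap.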

[Fig 2: CrowdPose evaluation results for the trained Search-XS model]

I exported both the official LitePose-Auto-S model and my trained model to ONNX format and compared them in Netron, but the two look exactly the same. My training environment is CUDA 11.8 and torch 2.0.0 on an AutoDL server (one RTX 3090 24GB GPU). Could someone help me fix this problem? Thank you very much!
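An identical graph in Netron shows that the two architectures match, but it says nothing about whether the learned weights do. A minimal sketch of a weight-level comparison follows, assuming the parameters have already been extracted into plain dictionaries that map names to flat lists of floats. `differing_params` and the toy values are hypothetical and not part of LitePose or ONNX.

```python
import math


def differing_params(weights_a, weights_b, tol=1e-6):
    """Return names of parameters that differ or exist in only one model."""
    changed = []
    for name in sorted(set(weights_a) | set(weights_b)):
        a = weights_a.get(name)
        b = weights_b.get(name)
        if a is None or b is None or len(a) != len(b):
            # Missing in one model, or different number of elements.
            changed.append(name)
        elif any(not math.isclose(x, y, abs_tol=tol) for x, y in zip(a, b)):
            # Same size, but at least one value differs beyond the tolerance.
            changed.append(name)
    return changed


# Toy values for illustration only, not real LitePose weights.
official = {"conv1.weight": [0.10, -0.20], "conv1.bias": [0.0]}
retrained = {"conv1.weight": [0.15, -0.20], "conv1.bias": [0.0]}
print(differing_params(official, retrained))  # ['conv1.weight']
```

If a comparison like this reports differences across most layers, the two models share an architecture but were trained differently, which fits a gap that comes from the training procedure rather than the network definition.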

nightingale233 commented 4 months ago

My mistake. I will run the supernet training first and then test again.