robert80203 / HuPR-A-Benchmark-for-Human-Pose-Estimation-Using-Millimeter-Wave-Radar

The official implementation of HuPR: A Benchmark for Human Pose Estimation Using Millimeter Wave Radar

Can you provide me with detailed information on the training time of your model and the amount of memory occupied by the training set after preprocessing? #11

Open chenhanxin123 opened 1 year ago

chenhanxin123 commented 1 year ago

After downloading and preprocessing the dataset, I ended up with nearly 3 TB of data. Is this normal? I then followed the instructions on GitHub and trained a model from scratch on all 276 datasets. With the default sr of 1, one epoch takes up to 30 hours. With sr set to 10, one epoch still takes 3 hours, so training is very slow. I trained for 20 epochs, but the accuracy of the resulting model did not even reach 10%. The paper says the authors used a single Tesla V100; I am using a single A6000. Could you share the details of your training setup? I hope you can take the time out of your busy schedule to help me, and I would greatly appreciate it.

robert80203 commented 1 year ago

In our experiment, it took about a day to converge with -sr 10 (about 10 to 15 epochs). Did you replace the original coco.py and cocoeval.py? You may also visualize the results to check whether the model has really converged.
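As a rough sanity check on the timings quoted in this thread, the arithmetic can be written out as below. This is only a sketch: the numbers are the ones reported in the comments above, not new measurements, and it assumes that -sr acts as a subsampling ratio so that epoch time scales roughly with 1/sr.

```python
# Figures quoted in this thread (not measured here).
HOURS_PER_EPOCH_SR1 = 30.0    # reported by chenhanxin123, single A6000, sr = 1
HOURS_PER_EPOCH_SR10 = 3.0    # reported by chenhanxin123, single A6000, sr = 10
EPOCHS_TO_CONVERGE = (10, 15) # reported by robert80203 for -sr 10


def total_hours(hours_per_epoch, epochs):
    """Wall-clock hours for a given number of epochs."""
    return hours_per_epoch * epochs


# Speed-up from sr = 1 to sr = 10; a value near 10 is consistent with
# the assumption that epoch time scales with 1/sr.
speedup = HOURS_PER_EPOCH_SR1 / HOURS_PER_EPOCH_SR10

# Expected time to reach the epoch range the author reports, at 3 h/epoch.
low, high = (total_hours(HOURS_PER_EPOCH_SR10, e) for e in EPOCHS_TO_CONVERGE)

# Time actually spent by the reporter: 20 epochs at 3 h/epoch.
spent = total_hours(HOURS_PER_EPOCH_SR10, 20)

print(f"speed-up sr=1 -> sr=10: {speedup:.1f}x")
print(f"10 to 15 epochs at sr=10: {low:.0f} to {high:.0f} hours")
print(f"20 epochs at sr=10: {spent:.0f} hours")
```

Under these assumptions, 10 to 15 epochs on the A6000 would take somewhat longer than the day reported for the V100, but the two are of the same order, so training time alone may not explain a roughly tenfold gap in AP.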

chenhanxin123 commented 1 year ago

Yes, I replaced the coco.py and cocoeval.py in my environment with misc/coco.py and misc/cocoeval.py, as required by the instructions on GitHub. Specifically, I replaced the coco.py and cocoeval.py under /home/chenhanxin/anaconda3/envs/p37/lib/python3.7/site-packages/pycocotools/ with misc/coco.py and misc/cocoeval.py, respectively. I noticed that COCO uses 17 keypoints by default, while the author uses 14, so an error is raised if the files are not replaced. I set sr to 10, and each epoch takes three hours. After three days of training, I got the following results: AP: 0.047, AP.5: 0.090, and AP.75: 0.040. That is roughly a tenfold difference from the author's results. I also downloaded the model Best.pth from GitHub, evaluated it on the author's default test set, and obtained AP: 0.649, AP.5: 0.982, and AP.75: 0.782.
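One way to confirm that the replacement took effect in the environment that actually runs training is to compare file hashes. The sketch below assumes the installed package is pycocotools (the COCO API package that ships coco.py and cocoeval.py) and that the script is run from the repository root so that misc/ resolves; the helper names are hypothetical and not part of the HuPR code.

```python
import hashlib
import importlib.util
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def installed_package_dir(package="pycocotools"):
    """Directory of the package as seen by the current interpreter, or None."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(list(spec.submodule_search_locations)[0])


def check_replaced(repo_misc_dir, package_dir, names=("coco.py", "cocoeval.py")):
    """Map each file name to True if the installed copy matches the repo copy."""
    results = {}
    for name in names:
        src = Path(repo_misc_dir) / name
        dst = Path(package_dir) / name
        results[name] = (
            src.is_file() and dst.is_file() and sha256_of(src) == sha256_of(dst)
        )
    return results


if __name__ == "__main__":
    pkg = installed_package_dir()
    if pkg is None:
        print("pycocotools is not installed in this environment")
    else:
        print(pkg, check_replaced("misc", pkg))
```

Running this with the same Python interpreter used for training guards against the case where the files were replaced in one conda environment while training runs in another.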

chenhanxin123 commented 1 year ago

Could you share the training logs and other detailed files for your model? My email is chenhanxin@emails.bjut.edu.cn

XIN499 commented 9 months ago

Hello, I had the same problem while training. The AP after training is about ten times lower than the author's. Have you fixed the problem?

ydhgethub commented 2 weeks ago

@chenhanxin123 Hello, could you share the dataset?