Closed Maqingyang closed 5 years ago
For the hyperparameters you should use the values reported in the paper. Also I recommend using data augmentation.
Hi Nikos,
Thanks for sharing your great work in the paper. I have the same confusion about the --num_epochs & --dataset parameters as @Maqingyang.
You mentioned that
"Training with data only from Human3.6M lasts for 10 epochs, while mixed training with data from Human3.6M and UP3D requires training for 25 epochs, because of the greater image diversity"
Since you didn't explicitly describe a two-stage training strategy, I guess you might be using a different number of epochs for different datasets. That is,
you train for 10 epochs when using only Human3.6M, and for 25 epochs on the mixed Human3.6M + UP-3D data. Then, as the default setting, you train for 50 epochs on the mixed data from Human3.6M, UP-3D, MPII, COCO, and LSP.
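If that reading is correct, the two stages would look roughly like this (a sketch only — the script name `train.py` and the `--dataset` values are my guesses, not confirmed by the repo; the flags are the ones discussed in this thread, with the paper's batch size and learning rate):

```shell
# Stage 1 (guess): Human3.6M only, 10 epochs, paper hyperparameters
python train.py --dataset h36m --num_epochs 10 --batch_size 16 --lr 3e-4

# Stage 2 (guess): mixed Human3.6M + UP-3D, 25 epochs,
# presumably initialized from the stage-1 checkpoint
python train.py --dataset h36m-up3d --num_epochs 25 --batch_size 16 --lr 3e-4
```

Is that close to what you actually did?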
I might be misunderstanding something — it would be great if you could help correct it. A more detailed explanation of your training strategy would be awesome. Thanks! @nkolot
@Maqingyang Did you have any luck figuring out the proper parameter settings? Thanks!
In fact I followed the paper's hyperparameters, but haven't reproduced the scores the paper reported. Maybe the author could provide more details about the training process to help reproduce the scores. If you have reproduced them, please let me know, thanks! @yuxwind
What are the scores you are getting?
h36m-p1:
  MPJPE (NonParam): 139.3199177315659
  Reconstruction Error (NonParam): 59.20808539927326
  MPJPE (Param): 95.11126801614222
  Reconstruction Error (Param): 56.31886701213058

h36m-p2:
  MPJPE (NonParam): 137.5788705966492
  Reconstruction Error (NonParam): 56.405959431544595
  MPJPE (Param): 91.3564615155462
  Reconstruction Error (Param): 53.2527044260017
The README says that, for the training hyperparameters, the default values are the ones used to train the models in the paper. However, I found some differences between the code's default settings and those in the paper:
--batch_size: the default is 2, but the paper uses 16.
--lr: the default is 2.4e-4, but the paper uses 3e-4.
--num_epochs & --dataset: the defaults are 50 epochs with the 'itw' option, which excludes the Human3.6M dataset. However, the paper describes a two-stage training strategy.
--rot_factor, --noise_factor & --scale_factor: these augmentation options are enabled in the default settings, but they are not mentioned in the paper.
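To make the mismatches easier to compare side by side, here is a small summary script (the default and paper values are the ones quoted in this thread; "two-stage" for the paper's dataset/epochs setting is my reading, not an official value):

```python
# Repo defaults vs. paper-reported hyperparameters, as listed in this thread.
defaults = {"batch_size": 2, "lr": 2.4e-4, "num_epochs": 50, "dataset": "itw"}
paper = {
    "batch_size": 16,
    "lr": 3e-4,
    "num_epochs": "10 (H36M) + 25 (H36M+UP-3D)",  # two-stage, per the paper
    "dataset": "two-stage (H36M, then H36M+UP-3D)",
}

for key in defaults:
    print(f"{key:>11s}: default={defaults[key]!r}, paper={paper[key]!r}")
```

Every row differs, which is why it's unclear which set actually produced the published numbers.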
I feel a little confused about which set of hyperparameters to choose. Could you help me better reproduce the results in the paper?