zhezh / adafuse-3d-human-pose

[IJCV-2020] AdaFuse: Adaptive Multiview Fusion for Accurate Human Pose Estimation in the Wild

Results different from Paper / Readme #2

Closed FilipHesse closed 3 years ago

FilipHesse commented 3 years ago

Dear @zhezh,

I tried evaluating your network. For installation and data preparation, I followed the steps in the README exactly.

When running the command

python run/adafuse/adafuse_main.py --cfg experiments/h36m/h36m_4view.yaml --evaluate true

my results are far worse than the README suggests:

['j3d_NoFuse', 'j3d_HeuristicFuse', 'j3d_ScoreFuse', 'j3d_ransac', 'j3d_AdaFuse']
Ep:0[0/2151]    Speed 1.6 samples/s Data 0.171s (0.171s)    Loss 6.20910 (6.20910)  Acc 0.382 (0.382)   Memory 4.56G    MPJPEs 30.2|30.3|30.1|30.2|30.0|
Ep:0[100/2151]  Speed 17.1 samples/s    Data 0.003s (0.023s)    Loss 6.29442 (5.92007)  Acc 0.250 (0.284)   Memory 4.56G    MPJPEs 33.2|31.9|31.6|32.1|31.2|
Ep:0[200/2151]  Speed 16.2 samples/s    Data 0.003s (0.023s)    Loss 3.56413 (5.33599)  Acc 0.235 (0.286)   Memory 4.56G    MPJPEs 34.2|32.8|32.6|33.1|32.3|
Ep:0[300/2151]  Speed 12.2 samples/s    Data 0.013s (0.022s)    Loss 4.44329 (5.18986)  Acc 0.368 (0.283)   Memory 4.56G    MPJPEs 34.1|32.7|32.5|32.8|32.3|
Ep:0[400/2151]  Speed 14.8 samples/s    Data 0.003s (0.021s)    Loss 5.43990 (5.23304)  Acc 0.000 (0.268)   Memory 4.56G    MPJPEs 34.2|32.8|32.9|33.0|32.7|
Ep:0[500/2151]  Speed 11.6 samples/s    Data 0.003s (0.021s)    Loss 3.87065 (5.17308)  Acc 0.324 (0.268)   Memory 4.56G    MPJPEs 34.4|32.8|32.7|33.0|32.5|
Ep:0[600/2151]  Speed 15.8 samples/s    Data 0.003s (0.020s)    Loss 2.69123 (5.19301)  Acc 0.206 (0.274)   Memory 4.56G    MPJPEs 34.5|32.9|32.7|33.1|32.5|
Ep:0[700/2151]  Speed 12.6 samples/s    Data 0.003s (0.020s)    Loss 3.14112 (5.05705)  Acc 0.279 (0.278)   Memory 4.56G    MPJPEs 35.2|33.3|33.0|33.9|32.7|
Ep:0[800/2151]  Speed 14.7 samples/s    Data 0.002s (0.020s)    Loss 5.26931 (4.90297)  Acc 0.206 (0.271)   Memory 4.56G    MPJPEs 36.6|34.3|33.5|34.9|33.2|
Ep:0[900/2151]  Speed 15.3 samples/s    Data 0.003s (0.020s)    Loss 3.54874 (4.88593)  Acc 0.279 (0.271)   Memory 4.56G    MPJPEs 36.5|34.2|33.3|34.8|32.9|
Ep:0[1000/2151] Speed 16.7 samples/s    Data 0.003s (0.020s)    Loss 5.57522 (4.88952)  Acc 0.000 (0.267)   Memory 4.56G    MPJPEs 36.5|34.4|33.5|34.9|33.1|
Ep:0[1100/2151] Speed 15.7 samples/s    Data 0.003s (0.020s)    Loss 4.98989 (4.92151)  Acc 0.176 (0.264)   Memory 4.56G    MPJPEs 36.4|34.4|33.5|34.9|33.1|
Ep:0[1200/2151] Speed 13.6 samples/s    Data 0.003s (0.020s)    Loss 5.18068 (4.93347)  Acc 0.235 (0.262)   Memory 4.56G    MPJPEs 36.1|34.2|33.4|34.8|33.1|
Ep:0[1300/2151] Speed 15.7 samples/s    Data 0.002s (0.020s)    Loss 6.30360 (5.00692)  Acc 0.397 (0.267)   Memory 4.56G    MPJPEs 35.3|33.5|32.8|34.0|32.4|
Ep:0[1400/2151] Speed 14.9 samples/s    Data 0.003s (0.020s)    Loss 5.33229 (5.05066)  Acc 0.235 (0.276)   Memory 4.56G    MPJPEs 34.4|32.7|32.0|33.2|31.6|
Ep:0[1500/2151] Speed 11.6 samples/s    Data 0.003s (0.020s)    Loss 5.23665 (5.07361)  Acc 0.250 (0.282)   Memory 4.56G    MPJPEs 33.6|31.9|31.2|32.4|30.9|
Ep:0[1600/2151] Speed 10.6 samples/s    Data 0.003s (0.020s)    Loss 3.79300 (5.05285)  Acc 0.456 (0.286)   Memory 4.56G    MPJPEs 33.1|31.4|30.7|32.0|30.3|
Ep:0[1700/2151] Speed 14.8 samples/s    Data 0.003s (0.020s)    Loss 3.89500 (5.05508)  Acc 0.588 (0.293)   Memory 4.56G    MPJPEs 32.5|30.8|30.2|31.4|29.8|
Ep:0[1800/2151] Speed 13.1 samples/s    Data 0.003s (0.020s)    Loss 3.32175 (4.98535)  Acc 0.368 (0.300)   Memory 4.56G    MPJPEs 32.3|30.5|29.8|31.2|29.4|
Ep:0[1900/2151] Speed 13.4 samples/s    Data 0.003s (0.020s)    Loss 5.02961 (4.97106)  Acc 0.265 (0.303)   Memory 4.56G    MPJPEs 32.0|30.2|29.5|30.9|29.0|
Ep:0[2000/2151] Speed 14.5 samples/s    Data 0.003s (0.020s)    Loss 5.41441 (4.99512)  Acc 0.265 (0.307)   Memory 4.56G    MPJPEs 31.6|29.8|29.1|30.6|28.6|
Ep:0[2100/2151] Speed 11.5 samples/s    Data 0.003s (0.020s)    Loss 6.87233 (4.99751)  Acc 0.324 (0.310)   Memory 4.56G    MPJPEs 31.2|29.5|28.7|30.2|28.3|
Ep:0[2150/2151] Speed 14.6 samples/s    Data 0.003s (0.020s)    Loss 5.91654 (5.00815)  Acc 0.338 (0.310)   Memory 4.56G    MPJPEs 30.9|29.2|28.5|29.9|28.0|
MPJPE summary: j3d_NoFuse 30.94
MPJPE summary: j3d_HeuristicFuse 29.24
MPJPE summary: j3d_ScoreFuse 28.52
MPJPE summary: j3d_ransac 29.93
MPJPE summary: j3d_AdaFuse 28.05

I was expecting this output:

MPJPE summary: j3d_NoFuse 22.94
MPJPE summary: j3d_HeuristicFuse 21.02
MPJPE summary: j3d_ScoreFuse 20.14
MPJPE summary: j3d_ransac 21.77
MPJPE summary: j3d_AdaFuse 19.54

Any idea what the error could be? My environment is a conda environment created with the commands you provided. Let me know what additional information you need. Thank you!

Best,

Filip Hesse
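For readers comparing the two sets of numbers: MPJPE (mean per-joint position error) is commonly computed as the average Euclidean distance between predicted and ground-truth 3D joints. The snippet below is only a minimal sketch of that definition, not the repository's evaluation code; the function name `mpjpe` and the 17-joint toy arrays are assumptions made for illustration.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error: average Euclidean distance over all joints.

    pred, gt: arrays of shape (..., num_joints, 3) in the same units and,
    importantly, in the same coordinate frame.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    # Per-joint distance along the last (xyz) axis, then mean over everything else.
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

# Toy example (hypothetical data): every joint is off by the vector (3, 4, 0).
gt = np.zeros((17, 3))
pred = gt + np.array([3.0, 4.0, 0.0])
print(mpjpe(pred, gt))  # 5.0
```

Because the metric is a plain distance, it is only meaningful when predictions and ground truth are expressed in the same frame.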

zhezh commented 3 years ago

Hi @FilipHesse, one possible reason is that joints_3d has a different meaning in this work than in chunyu's toolkit. In this work, joints_3d is in the camera frame. I am not sure whether I transformed joints_3d after generating the data.

https://github.com/zhezh/adafuse-3d-human-pose/blob/42be2b63fe265f950d1b608bcedcab28cf0cecd6/lib/dataset/joints_dataset.py#L201

https://github.com/zhezh/adafuse-3d-human-pose/blob/42be2b63fe265f950d1b608bcedcab28cf0cecd6/lib/dataset/multiview_h36m.py#L199

I will check this.

zhezh commented 3 years ago

In chunyu's toolkit https://github.com/CHUNYUWANG/H36M-Toolbox/blob/dce619f4d997ea218a14c33c86a3455c5d118c95/generate_labels.py#L112

It seems joints_3d is in the global frame there. Would you please change joints_3d to joints_3d_camera in the two lines linked in my last comment and try again? Thanks.
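To illustrate the difference between the two frames, here is a rough sketch of a global-to-camera transform. It assumes the convention x_cam = R (x_world - T), with R a 3x3 rotation and T the camera position in the global frame; that convention, the function names, and the toy camera are assumptions for illustration and are not taken from the linked code.

```python
import numpy as np

def world_to_camera(joints_world, R, T):
    """Map (num_joints, 3) global-frame joints into the camera frame.

    Assumed convention: x_cam = R @ (x_world - T).
    """
    joints_world = np.asarray(joints_world, dtype=float)
    # Row-vector form of R @ (x - T) applied to every joint at once.
    return (joints_world - T) @ R.T

def camera_to_world(joints_cam, R, T):
    """Inverse of world_to_camera under the same assumed convention."""
    joints_cam = np.asarray(joints_cam, dtype=float)
    return joints_cam @ R + T

# Hypothetical camera: rotated 90 degrees about z, positioned at (1, 2, 3).
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
T = np.array([1.0, 2.0, 3.0])

joints_world = np.array([[2.0, 2.0, 3.0]])
joints_cam = world_to_camera(joints_world, R, T)
print(joints_cam)  # [[0. 1. 0.]]
```

The same joint has different coordinates in the two frames, so a loader that reads global-frame joints_3d where camera-frame values are expected would work with mismatched coordinates.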

FilipHesse commented 3 years ago

Thank you so much, it works perfectly now!