OreoChocolate / MUREN

The official code for "Relational Context Learning for Human-Object Interaction Detection" (CVPR 2023).
http://cvlab.postech.ac.kr/research/MUREN/

Why Can't I Reproduce These Results? #5

Closed alexw994 closed 10 months ago

alexw994 commented 10 months ago

On V-COCO, I trained using the commands in the repository and only reached 64.1. When I tested with the 'eval' command, the mAP was only 65.9.

```
---------Reporting Role AP (%)------------------
hold-obj:              AP = 58.44 (#pos = 3608)
sit-instr:             AP = 59.50 (#pos = 1916)
ride-instr:            AP = 73.68 (#pos = 556)
look-obj:              AP = 48.34 (#pos = 3347)
hit-instr:             AP = 80.03 (#pos = 349)
hit-obj:               AP = 69.51 (#pos = 349)
eat-obj:               AP = 71.48 (#pos = 521)
eat-instr:             AP = 76.79 (#pos = 521)
jump-instr:            AP = 77.71 (#pos = 635)
lay-instr:             AP = 58.62 (#pos = 387)
talk_on_phone-instr:   AP = 56.47 (#pos = 285)
carry-obj:             AP = 48.91 (#pos = 472)
throw-obj:             AP = 57.31 (#pos = 244)
catch-obj:             AP = 57.66 (#pos = 246)
cut-instr:             AP = 50.66 (#pos = 269)
cut-obj:               AP = 65.60 (#pos = 269)
work_on_computer-instr: AP = 77.11 (#pos = 410)
ski-instr:             AP = 56.09 (#pos = 424)
surf-instr:            AP = 80.34 (#pos = 486)
skateboard-instr:      AP = 88.40 (#pos = 417)
drink-instr:           AP = 59.21 (#pos = 82)
kick-obj:              AP = 79.46 (#pos = 180)
point-instr:           AP = 8.20 (#pos = 31)
read-obj:              AP = 51.02 (#pos = 111)
snowboard-instr:       AP = 80.16 (#pos = 277)
Average Role [scenario_1] AP = 63.63
Average Role [scenario_1] AP = 65.94, omitting the action "point"
```

```
---------Reporting Role AP (%)------------------
hold-obj:              AP = 61.83 (#pos = 3608)
sit-instr:             AP = 62.22 (#pos = 1916)
ride-instr:            AP = 74.57 (#pos = 556)
look-obj:              AP = 53.29 (#pos = 3347)
hit-instr:             AP = 81.17 (#pos = 349)
hit-obj:               AP = 71.86 (#pos = 349)
eat-obj:               AP = 75.43 (#pos = 521)
eat-instr:             AP = 77.01 (#pos = 521)
jump-instr:            AP = 78.17 (#pos = 635)
lay-instr:             AP = 61.32 (#pos = 387)
talk_on_phone-instr:   AP = 58.56 (#pos = 285)
carry-obj:             AP = 50.48 (#pos = 472)
throw-obj:             AP = 59.77 (#pos = 244)
catch-obj:             AP = 62.53 (#pos = 246)
cut-instr:             AP = 51.62 (#pos = 269)
cut-obj:               AP = 67.81 (#pos = 269)
work_on_computer-instr: AP = 78.73 (#pos = 410)
ski-instr:             AP = 61.23 (#pos = 424)
surf-instr:            AP = 80.91 (#pos = 486)
skateboard-instr:      AP = 88.89 (#pos = 417)
drink-instr:           AP = 59.94 (#pos = 82)
kick-obj:              AP = 83.20 (#pos = 180)
point-instr:           AP = 8.24 (#pos = 31)
read-obj:              AP = 56.72 (#pos = 111)
snowboard-instr:       AP = 81.60 (#pos = 277)
Average Role [scenario_2] AP = 65.88
Average Role [scenario_2] AP = 68.29, omitting the action "point"
```

Is my understanding of the metrics incorrect? Thank you very much for the reply.
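For reference, the two averages printed in each report are simply the arithmetic mean of the 25 per-action role APs, with and without the "point" action. A minimal sketch that reproduces the scenario_1 numbers above from the listed values (the dict below just copies the eval output; it does not call the actual V-COCO evaluator):

```python
# Per-action role APs copied from the scenario_1 report above.
role_aps = {
    "hold-obj": 58.44, "sit-instr": 59.50, "ride-instr": 73.68,
    "look-obj": 48.34, "hit-instr": 80.03, "hit-obj": 69.51,
    "eat-obj": 71.48, "eat-instr": 76.79, "jump-instr": 77.71,
    "lay-instr": 58.62, "talk_on_phone-instr": 56.47, "carry-obj": 48.91,
    "throw-obj": 57.31, "catch-obj": 57.66, "cut-instr": 50.66,
    "cut-obj": 65.60, "work_on_computer-instr": 77.11, "ski-instr": 56.09,
    "surf-instr": 80.34, "skateboard-instr": 88.40, "drink-instr": 59.21,
    "kick-obj": 79.46, "point-instr": 8.20, "read-obj": 51.02,
    "snowboard-instr": 80.16,
}

# Mean over all 25 actions.
full = sum(role_aps.values()) / len(role_aps)

# Mean with the "point" action excluded.
no_point = [ap for name, ap in role_aps.items() if not name.startswith("point")]
omit = sum(no_point) / len(no_point)

print(f"Average Role AP = {full:.2f}")                    # 63.63
print(f"Average Role AP = {omit:.2f}, omitting 'point'")  # 65.94
```

The low "point" AP (8.20, with only 31 positives) drags the full mean down by more than two points, which is why the two numbers differ so much.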

OreoChocolate commented 10 months ago

Hi, thank you for your interest in our research. We have found that training the transformer for one-stage HOI detection is unstable. I recommend training the model several times with different hyper-parameters (e.g., learning rate, seed).

alexw994 commented 10 months ago

Thank you for the reply. So my steps are all correct, and, for example, these results:

Average Role [scenario_1] AP = 65.94, omitting the action "point"

and

Average Role [scenario_2] AP = 68.29, omitting the action "point"

correspond correctly to the 68.8 and 71.0 reported in the paper, right?

(screenshot: the paper's V-COCO results table)

OreoChocolate commented 10 months ago

Yes, we report results omitting the action "point", following previous works [STIP, HOTR].