VlSomers / keypoint_promptable_reidentification

[ECCV24] Keypoint Promptable Re-Identification: SOTA ReID method robust to occlusions and multi-person ambiguity

performance issue #7

Open ujin0415 opened 3 weeks ago

ujin0415 commented 3 weeks ago

Hi @VlSomers! I'm very interested in your research, so I ran your default code with the SOLIDER backbone on the Occluded-Duke dataset twice. I saw a strange performance jump between the two runs: the mAP was 67.45% in the first run and 73.47% in the second, which is slightly below the performance reported in the paper. Convergence was also much faster in the second run. The settings were identical in both runs. I'm wondering how this can happen. Are there any settings in the training process that could explain it?

VlSomers commented 3 weeks ago

Hi @ujin0415, thank you for your interest, I would be happy to help you with your issue! I have never experienced what you are describing. There is of course some variance from one run to the next (maybe +/-1%), but never on that scale. Can you make a third run to see if the training is more stable now? Unfortunately, I cannot think of any training setting that would explain your issue. However, as I explained in the repo, the codebase underwent a major refactoring before the public release, especially the configuration system, so there may be some differences between the released configs and the ones I used for the experiments in the paper. If you cannot reproduce the performance, can you share your configs and logs? I will check whether any parameters are misconfigured. Let me know if you solve your issue!
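As a rough sanity check on the numbers in this thread, the gap between the two reported runs can be compared with the run-to-run variance mentioned above. This is only a sketch: it assumes the "+/-1%" is read as a band of one mAP point on either side of a typical run, so that two normal runs differ by at most about two points. The function name and the threshold are illustrative and do not come from the repository.

```python
# mAP values (%) reported in this thread for the two runs.
run_maps = [67.45, 73.47]

# Assumption: "+/-1%" is read as a band of 1 mAP point on either side,
# so two normal runs should differ by at most about 2 points.
EXPECTED_HALF_WIDTH = 1.0


def gap_exceeds_variance(maps, half_width=EXPECTED_HALF_WIDTH):
    """Return (gap, flag): the spread between the best and worst run,
    and whether that spread is larger than the assumed variance band."""
    gap = round(max(maps) - min(maps), 2)
    return gap, gap > 2 * half_width


gap, suspicious = gap_exceeds_variance(run_maps)
print(f"gap = {gap:.2f} points, beyond expected variance: {suspicious}")
```

Under that reading, the spread between the two runs is roughly three times the widest gap that ordinary variance would produce, which is why a third run and the full configs and logs are needed.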

ujin0415 commented 3 weeks ago

Thank you for your kind reply! I tried several more runs, but the instability is still there. Unfortunately, we didn't record the first two runs, so I am attaching our log file for the last run (it only contains 49 epochs :( ). output.txt

Thank you so much!

VlSomers commented 3 weeks ago

Can you share the full logs of an unstable run? Did you also try to run the Swin ImageNet version? Fine-tuning the SOLIDER backbone has always been more difficult than fine-tuning the ImageNet one, so I'm wondering if your instabilities are related to SOLIDER. Unfortunately, I'm very busy right now, so I won't have time to reproduce your experiments until late November. However, I recently made some runs with a private fork of this public repository and everything went smoothly, so this might also be an issue related to your environment.