timmeinhardt / trackformer

Implementation of "TrackFormer: Multi-Object Tracking with Transformers". [Conference on Computer Vision and Pattern Recognition (CVPR), 2022]
https://arxiv.org/abs/2101.02702
Apache License 2.0

Thoughts on poor performance? #103

Open tadeegan opened 1 year ago

tadeegan commented 1 year ago


I hacked up the repo to get it working on CPU, but it performs rather poorly compared to the demo videos on the MOTX datasets. I suppose this is probably because it needs training data? I am also getting lots of detections spawning at (0, 0) in the top-left corner.

One of the significant changes I made was using the PyTorch deformable attention instead of the CUDA one (I will try the CUDA version at some point later). Specifically, this change: https://github.com/timmeinhardt/trackformer/compare/main...tadeegan:trackformer:main#diff-0a7118a9af9ab8f5a592a27995a64ef51a24ef966788e2b6b44dc4f889883131 Do you think that would be problematic?

Would this repo accept PRs to upgrade its dependencies for recent Python and PyTorch versions?

tadeegan commented 1 year ago

https://imgur.com/a/yVBzVKy

timmeinhardt commented 1 year ago

I think you are potentially facing two issues:

  1. Generalization to a different data domain (soccer videos). However, I think our model should handle this better than in the video above, even without finetuning on your data.
  2. The boxes in the top-left corner indicate that something is going wrong. The CPU code for deformable attention is not tested. If you have a GPU, you should try to get it working there.
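
To put a number on the top-left boxes when comparing a CPU run against a GPU run, a small check like the one below might help. This is only a sketch under assumptions: it expects boxes as `(x1, y1, x2, y2)` tuples in pixel coordinates, and the function name `fraction_at_origin` and the tolerance `tol` are hypothetical, not part of the TrackFormer code.

```python
from typing import Iterable, Tuple

# Assumed box format: (x1, y1, x2, y2) in pixels. Adapt if your output differs.
Box = Tuple[float, float, float, float]


def fraction_at_origin(boxes: Iterable[Box], tol: float = 2.0) -> float:
    """Return the fraction of boxes whose top-left corner lies within
    `tol` pixels of the image origin (0, 0).

    Hypothetical helper for debugging only.
    """
    boxes = list(boxes)
    if not boxes:
        return 0.0
    hits = sum(1 for x1, y1, _, _ in boxes if abs(x1) <= tol and abs(y1) <= tol)
    return hits / len(boxes)


if __name__ == "__main__":
    frame_boxes = [
        (0.0, 0.0, 5.0, 5.0),         # starts at the origin
        (100.0, 50.0, 140.0, 120.0),  # ordinary detection
        (1.0, 1.5, 3.0, 3.0),         # within tolerance of the origin
        (30.0, 0.0, 60.0, 40.0),      # touches the top edge only
    ]
    print(fraction_at_origin(frame_boxes))
```

If this fraction is clearly higher on the CPU path than on the GPU path for the same frames, that would point at the untested CPU attention code rather than at the data domain.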