NirAharon / BoT-SORT

BoT-SORT: Robust Associations Multi-Pedestrian Tracking
MIT License
FPS issue #37

Open C-monC opened 1 year ago

C-monC commented 1 year ago

Hi,

Setup: weights yolov7.pt, Faiss GPU, input video resolution 1920 × 1080, GPU 2060 Super.

I am getting extremely low FPS. The actual inference step is very fast, but this line: online_targets = tracker.update(detections, im0) takes 1.2 seconds. Is this expected? My GPU utilization is about 25%, and CPU utilization is also around 20%.
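To narrow down where the time goes, a minimal timing sketch along these lines could help. It only assumes a per-frame loop with a detector call and a tracker.update call; the helper names (timed, report) and the two stand-in functions are hypothetical and exist just to make the example runnable.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per named pipeline stage.
stage_seconds = defaultdict(float)

@contextmanager
def timed(stage):
    """Add the wall-clock time spent inside the block to stage_seconds[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        stage_seconds[stage] += time.perf_counter() - start

def report(num_frames):
    """Return the mean milliseconds per frame for every recorded stage."""
    return {name: 1000.0 * total / num_frames for name, total in stage_seconds.items()}

# Hypothetical stand-ins for the real detector and tracker calls.
def fake_detect(frame):
    time.sleep(0.002)
    return [(0, 0, 10, 10)]

def fake_tracker_update(detections, frame):
    time.sleep(0.005)
    return detections

if __name__ == "__main__":
    frames = range(5)
    for frame in frames:
        with timed("detector"):
            detections = fake_detect(frame)
        with timed("tracker.update"):
            online_targets = fake_tracker_update(detections, frame)
    for name, ms in report(len(frames)).items():
        print(f"{name}: {ms:.1f} ms/frame")
```

Wrapping smaller pieces of the tracker (for example the camera motion compensation step) in the same context manager would show which part dominates the 1.2 seconds.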

mikel-brostrom commented 1 year ago

I have the same issue.

https://github.com/NirAharon/BoT-SORT/issues/39

But way more extreme in my case. BoTSORT is not something I would use for real-time applications.

C-monC commented 1 year ago

Ah, strange. OC_SORT gets nearly 700 FPS (no inference) on CPU, and the implementation looks so similar.

I assume OC_SORT doesn't compensate for camera motion? What are your findings on the most performant real-time tracking algorithm, @mikel-brostrom?

mikel-brostrom commented 1 year ago

Both OCSORT and ByteTrack are absolutely the fastest, achieving around 800 FPS on a mere Intel® Core™ i5-4210U CPU. For the best MOT metrics while staying real-time (20-30 FPS), it is still StrongSORT, at least based on my own MOT16 evaluation results:

Yolov5 StrongSORT

HOTA: StrongSORT                   HOTA      DetA      AssA      DetRe     DetPr     AssRe     AssPr     LocA      RHOTA     HOTA(0)   LocA(0)   HOTALocA(0)
COMBINED                           52.948    51.068    55.46     55.639    75.608    59.992    80.202    81.989    55.447    68.353    76.972    52.613    

CLEAR: StrongSORT                  MOTA      MOTP      MODA      CLR_Re    CLR_Pr    MTR       PTR       MLR       sMOTA     CLR_TP    CLR_FN    CLR_FP    IDSW      MT        PT        ML        Frag          
COMBINED                           60.265    79.402    60.705    67.147    91.246    31.915    47.776    20.309    46.434    74135     36272     7112      486       165       247       105       2164      

Identity: StrongSORT               IDF1      IDR       IDP       IDTP      IDFN      IDFP       
COMBINED                           66.889    58.056    78.893    64098     46309     17149

FPS: 20

Yolov5 BoTSORT (no camera motion compensation; the implemented methods, ECC and ORB, are too computationally expensive for real-time applications)

Used parameters for BoTSORT can be found here:

HOTA: BoTSORT                      HOTA      DetA      AssA      DetRe     DetPr     AssRe     AssPr     LocA      RHOTA     HOTA(0)   LocA(0)   HOTALocA(0)
COMBINED                           52.943    51.653    54.766    56.387    75.51     60.55     76.832    82.008    55.485    68.441    77.037    52.725

CLEAR: BoTSORT                     MOTA      MOTP      MODA      CLR_Re    CLR_Pr    MTR       PTR       MLR       sMOTA     CLR_TP    CLR_FN    CLR_FP    IDSW      MT        PT        ML        Frag          
COMBINED                           60.993    79.48     61.52     68.097    91.192    34.816    47.195    17.988    47.019    75184     35223     7262      582       180       244       93        2111  

Identity: BoTSORT                  IDF1      IDR       IDP       IDTP      IDFN      IDFP       
COMBINED                           66.54     58.114    77.823    64162     46245     18284  

FPS: 3

NirAharon commented 1 year ago

Hi @C-monC and @mikel-brostrom, sorry for the late response; I wasn't available for the last month.

1) BoT-SORT can run in real time. I created a real-life C++ implementation that I, unfortunately, can't publish here. The ORB and ECC methods are indeed slower, so I used the C++ OpenCV VideoStab GMC method, as you can see in the paper. Currently, there is no Python implementation of this module, which is why I write the GMC results to files and read them at runtime. For demos on arbitrary videos (not from the MOTChallenge), I enabled the ECC and ORB options.

2) I just added another CMC method that follows the OpenCV GMC code and is based on sparse optical flow. This CMC method runs in ~16 ms for a 1920x1080 frame on my computer.

3) As I mentioned in the paper, the CMC module should run in parallel with the detector, so the extra 16 ms could be reduced to nearly zero added latency (because the detector usually takes most of the runtime). Therefore, the overall runtime should be governed by the detector time (YOLOv5 in your case) and the ReID module, if it is enabled. I hope I will soon have time to create a multi-threaded implementation in Python that runs the detector and the CMC in parallel.

4) By the way, BoT-SORT (with CMC correctly run in parallel and without ReID) and ByteTrack should have the same runtime. Likewise, BoT-SORT-ReID and StrongSORT should have the same runtime, especially since StrongSORT uses ECC and ReID (if I remember correctly).
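The parallel arrangement described in point 3 could look something like the sketch below. The functions run_detector and run_cmc are hypothetical stand-ins that just sleep for roughly the durations mentioned in this thread; in a real pipeline they would call the detector and the CMC module.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins: replace with the real detector and CMC calls.
def run_detector(frame):
    time.sleep(0.03)   # pretend the detector takes ~30 ms
    return [("person", 0.9)]

def run_cmc(frame):
    time.sleep(0.016)  # pretend CMC takes ~16 ms
    return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # identity 2x3 warp

def process_frame(executor, frame):
    """Run detection and CMC concurrently and return both results."""
    cmc_future = executor.submit(run_cmc, frame)  # starts in a worker thread
    detections = run_detector(frame)              # runs in the calling thread
    warp = cmc_future.result()                    # waits only if CMC is slower
    return detections, warp

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=1) as executor:
        start = time.perf_counter()
        detections, warp = process_frame(executor, frame=None)
        elapsed_ms = 1000 * (time.perf_counter() - start)
    print(f"frame took {elapsed_ms:.0f} ms (sequential would be about 46 ms)")
```

With threads, the overlap only materializes if the heavy work releases Python's global interpreter lock, which native OpenCV and PyTorch calls generally do; otherwise a separate process would be needed.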

In any case, thank you for your interest in this work and I hope I was able to help you.

minyong-cho commented 1 year ago

Hi @NirAharon.

Could you provide the real-time version of BoT-SORT?

Thank you!

viplix3 commented 8 months ago

BoT-SORT is indeed real-time.

It is the CMC that takes up most of the computation time. For reference, you can check this.

With CMC disabled, BoT-SORT achieves 60+ FPS on a Jetson NX on a sequence averaging 226 objects per image.

spoonbobo commented 3 months ago

How do I disable CMC? And how will that affect accuracy?