nwojke / deep_sort

Simple Online Realtime Tracking with a Deep Association Metric
GNU General Public License v3.0

generating GT annotations for deep sort? #236

Closed AbdurahmaanYuusuf closed 3 years ago

AbdurahmaanYuusuf commented 3 years ago

Hello, I am planning to integrate Deep SORT with my custom YOLOv4 detector, trained on a top-view people dataset. Can anyone please tell me how to generate GT annotations for Deep SORT so that I can evaluate MOT metrics on my custom dataset?

DiegoAnas commented 3 years ago

The MOT challenge paper has the information about the format of the labelled data, both for detections (DET) and ground truth (GT); check it out: https://arxiv.org/abs/2003.09003. Here is the info from its tables:

1. Frame number: the frame in which the object is present
2. Identity number: each pedestrian trajectory is identified by a unique ID (−1 for detections)
3. Bounding box left: x coordinate of the top-left corner of the pedestrian bounding box
4. Bounding box top: y coordinate of the top-left corner of the pedestrian bounding box
5. Bounding box width: width in pixels of the pedestrian bounding box
6. Bounding box height: height in pixels of the pedestrian bounding box
7. Confidence score: DET: how confident the detector is that this instance is a pedestrian; GT: a flag marking whether the entry is to be considered (1) or ignored (0)
8. Class: GT: the type of object annotated
9. Visibility: GT: visibility ratio, a number between 0 and 1 that says how much of the object is visible (reduced by occlusion or by image-border cropping)

If you use labelling software like labelImg, then I think you have to find a way to convert your YOLO detections and GTs into this MOT format.
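
As a rough illustration (not something from this repo), a conversion from normalized YOLO-style boxes into MOT ground-truth rows could look like the sketch below; the image size, output file name, and the assumption that you already have a track ID per object are all made up for illustration:

```python
# Rough sketch: turn per-frame YOLO-style labels into MOT gt.txt rows.
# One gt.txt row per object per frame:
# frame, id, bb_left, bb_top, bb_width, bb_height, conf, class, visibility
# e.g. 1,1,794.27,247.59,71.25,174.88,1,1,1.0

def yolo_box_to_mot_row(frame, track_id, cx, cy, w, h, img_w, img_h):
    """Convert one normalized YOLO box (center x/y, width, height) to a MOT GT row."""
    bw = w * img_w
    bh = h * img_h
    left = cx * img_w - bw / 2
    top = cy * img_h - bh / 2
    # conf=1 -> consider this entry, class=1 -> pedestrian, visibility assumed fully visible
    return f"{frame},{track_id},{left:.2f},{top:.2f},{bw:.2f},{bh:.2f},1,1,1.0"

# Example: one person in frame 1 of a 1920x1080 video (illustrative values only)
with open("gt.txt", "w") as f:
    f.write(yolo_box_to_mot_row(1, 1, 0.5, 0.5, 0.10, 0.30, 1920, 1080) + "\n")
```

Note that plain YOLO label files carry no track IDs, so you would need to assign a consistent ID per pedestrian across frames yourself (or use a tool that does it for you).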

annezao commented 3 years ago

Do you know why the det.txt files provided by DPM have confidence scores greater than 1, and even greater than 2?

AbdurahmaanYuusuf commented 3 years ago

cvat.org, which is developed by Intel, supports many annotation formats, including the MOT format.
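
Once you have the annotations exported in MOT format, something like the following sketch, using the separate py-motmetrics package (not part of this repo; the file names are placeholders), should give you the MOT metrics:

```python
# Sketch: score tracker output against MOT-format ground truth with py-motmetrics.
import motmetrics as mm

# gt.txt and hypotheses.txt both use the MOT row format described above
gt = mm.io.loadtxt("gt.txt", fmt="mot15-2D", min_confidence=1)
dt = mm.io.loadtxt("hypotheses.txt", fmt="mot15-2D")

# Match ground truth to hypotheses per frame using IoU distance
acc = mm.utils.compare_to_groundtruth(gt, dt, "iou", distth=0.5)

mh = mm.metrics.create()
summary = mh.compute(acc, metrics=["mota", "motp", "num_switches"], name="custom")
print(mm.io.render_summary(summary, formatters=mh.formatters,
                           namemap=mm.io.motchallenge_metric_names))
```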