levan92 / deep_sort_realtime

A really more real-time adaptation of deep sort
MIT License

Order of 'track.to_ltrb()' values #35

Closed shamindraparui closed 1 year ago

shamindraparui commented 1 year ago

I am using the tracker as follows:


import cv2
from deep_sort_realtime.deepsort_tracker import DeepSort

tracker = DeepSort(max_age=5)

while (capture.isOpened()):
    ret, frame = capture.read()
    if not ret:
        break
    else:
        result = model.detect([frame], verbose=0)[0] # MRCNN detector
        number_of_detections = len(result['rois'])

        # making bbs structure as required by the tracker
        bbs = []
        for i in range(number_of_detections):
            y1, x1, y2, x2 = result['rois'][i]
            scr = result['scores'][i]
            cls = result['class_ids'][i]
            a_tuple = ([x1, y1, x2, y2], scr, cls)
            bbs.append(a_tuple)

        if len(bbs) > 0:
            tracks = tracker.update_tracks(bbs, frame=frame)
            for track in tracks:
                if not track.is_confirmed():
                    continue
                track_id = track.track_id
                ltrb = track.to_ltrb(orig=True)
                x_1, y_1, x_2, y_2 = ltrb
                frame = cv2.rectangle(frame, (int(x_1), int(y_1)), (int(x_2), int(y_2)), (0,0,255), 2)
                frame = cv2.putText(frame, track_id, (int(x_1), int(y_1)), cv2.FONT_HERSHEY_SIMPLEX, font_scale, (0,0,255), thickness)
        resized = cv2.resize(frame, (width - 10, height - 10))
        output.write(resized) # writing video frames to disk
capture.release()
output.release()

By stepping through ltrb = track.to_ltrb() in a debugger, I can see that it returns bbox coordinates. Can you please tell me their order (i.e., are they in x1, y1, x2, y2 order)? And why are the tracked bboxes larger than the original ones?

levan92 commented 1 year ago

Yes, ltrb stands for left top right bottom, aka (x1, y1, x2, y2).

The function you call returns the Kalman filter's state prediction by default. If you want the bbox of the original detection instead, pass orig=True. See https://github.com/levan92/deep_sort_realtime#getting-bounding-box-of-original-detection.
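
For illustration, here is a minimal sketch of reading a track's box for drawing. The helper name box_for_drawing and its prefer_original parameter are hypothetical; only track.to_ltrb(orig=...) comes from the library. As the linked README section describes it, orig=True returns the associated detection's box when the track was matched in the current update and otherwise still falls back to the Kalman prediction.

```python
def box_for_drawing(track, prefer_original=True):
    """Return two integer corner points for cv2.rectangle.

    Hypothetical helper: `track` is assumed to expose to_ltrb(orig=...),
    as deep_sort_realtime tracks do.
    """
    # to_ltrb() yields (left, top, right, bottom), i.e. (x1, y1, x2, y2).
    ltrb = track.to_ltrb(orig=prefer_original)
    x1, y1, x2, y2 = (int(v) for v in ltrb)
    return (x1, y1), (x2, y2)
```

The two points can be passed directly as the pt1 and pt2 arguments of cv2.rectangle.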

shamindraparui commented 1 year ago

Yes, I found the mistake: I was passing the coordinates as (x1, y1, x2, y2) instead of (x1, y1, w, h). Thanks @levan92
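
For anyone hitting the same problem, a sketch of the conversion might look like the following. The function name rois_to_detections is hypothetical; it assumes the detector returns boxes in (y1, x1, y2, x2) order, as in the snippet above, and that update_tracks expects each detection as ([left, top, w, h], confidence, detection_class).

```python
def rois_to_detections(rois, scores, class_ids):
    """Convert (y1, x1, y2, x2) boxes to ([left, top, w, h], score, class) tuples."""
    detections = []
    for (y1, x1, y2, x2), score, class_id in zip(rois, scores, class_ids):
        left, top = float(x1), float(y1)
        # Width and height replace the bottom-right corner.
        width, height = float(x2 - x1), float(y2 - y1)
        detections.append(([left, top, width, height], float(score), int(class_id)))
    return detections
```

With this, the loop that builds bbs could be replaced by bbs = rois_to_detections(result['rois'], result['scores'], result['class_ids']).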