abewley / sort

Simple, online, and realtime tracking of multiple objects in a video sequence.
GNU General Public License v3.0

Change input and output variables from [xmin, ymin, xmax, ymax] to [x, y, w, h] #141

Open marfis89 opened 2 years ago

marfis89 commented 2 years ago

Hi, I am using Alexey's darknet Python wrapper. It would be nice if I could use [x, y, w, h] for both the input and the output bounding boxes, instead of [xmin, ymin, xmax, ymax].

The input side is not that hard:

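```python
import numpy as np

# In the detection loop; assumes x, y is the box center.
dets.append([x, y, w, h, confidence, detectionClassID])


def convert_bbox_to_z(bbox):
    # bbox is already [x, y, w, h], so no corner-to-center conversion is needed.
    x = bbox[0]
    y = bbox[1]
    w = bbox[2]
    h = bbox[3]

    # Original corner-based conversion:
    # w = bbox[2] - bbox[0]
    # h = bbox[3] - bbox[1]
    # x = bbox[0] + w / 2.
    # y = bbox[1] + h / 2.

    s = w * h  # scale is just area
    r = w / float(h)
    return np.array([x, y, s, r]).reshape((4, 1))
```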

But I am not sure about the output / return values (convert_x_to_bbox). Could you give me a little hint?

Best regards, Martin

ehdrndd commented 2 years ago

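In the paper, the state is

$\mathbf{x} = [u, v, s, r, \dot{u}, \dot{v}, \dot{s}]^T$

where u, v are the center x, y coordinates of the box in pixels, s is the box area (w * h), and r is the aspect ratio (w / h). The dotted terms are the corresponding velocities.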
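Given that state, the width and height follow from s and r: since s = w * h and r = w / h, we get w = sqrt(s * r) and h = s / w. Below is a minimal sketch of an [x, y, w, h] output, assuming x, y are the box center. The names `convert_x_to_bbox_xywh` and `xywh_to_xyxy` are hypothetical and not part of the repo.

```python
import numpy as np


def convert_x_to_bbox_xywh(x):
    """Map a state vector [u, v, s, r, ...] to a center-based box [u, v, w, h].

    Hypothetical variant of convert_x_to_bbox; assumes s = w * h and r = w / h.
    """
    x = np.asarray(x, dtype=float).reshape(-1)
    w = np.sqrt(x[2] * x[3])  # s * r = (w * h) * (w / h) = w^2
    h = x[2] / w              # s / w = h
    return np.array([x[0], x[1], w, h]).reshape((1, 4))


def xywh_to_xyxy(box):
    """Center-based [cx, cy, w, h] -> corner-based [xmin, ymin, xmax, ymax]."""
    cx, cy, w, h = box[:4]
    return np.array([cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0])


# Round trip: a 40x20 box centered at (100, 50).
z = np.array([100.0, 50.0, 40.0 * 20.0, 40.0 / 20.0])
box = convert_x_to_bbox_xywh(z)
corners = xywh_to_xyxy(box[0])
```

One caveat: as far as I can tell, the IoU matching step compares detections and predicted boxes in corner format, so changing only the two conversion functions may break the association. Converting with something like `xywh_to_xyxy` before calling the tracker, and converting back on the way out, is probably the less invasive route.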