Zhongdao / Towards-Realtime-MOT

Joint Detection and Embedding for fast multi-object tracking
MIT License

Problem with re-ID features and fuse_motion #218

Closed gouttham closed 3 years ago

gouttham commented 3 years ago
for row, track in enumerate(tracks):
    gating_distance = kf.gating_distance(track.mean, track.covariance, measurements, only_position, metric='maha')
    cost_matrix[row, gating_distance > gating_threshold] = np.inf
    cost_matrix[row] = lambda_ * cost_matrix[row] + (1 - lambda_) * gating_distance

file_name: matching.py method: fuse_motion

Query

  1. track - Previously detected tracker object

  2. measurements - the bounding boxes of all the objects in the current frame.

  3. gating_distance - quantifies the distance between one such previously identified box and all the boxes in the current frame, ex: [0.00027311, 58.948, 207.32]. This operation is done purely by the Kalman filter.

  4. cost_matrix - the confusion_matrix created from the 128-dim embeddings generated by the network: the dot product between the embeddings of the tracks and those of the current bounding-box detections, ex: [[0.01109, 0.41579, 1.10747], [0.42742, 0.03098, 0.84188]] for tracked objects from previous frames (2 objects) vs. objects in the current frame (3 objects).

  5. Here in line 3, instead of finding the min of the cosine distance from the confusion_matrix, they have indexed it using the gating_distance (provided by the Kalman filter), thus rendering the distance metric useless.

  6. Apart from the places indexed by gating_distance, everything will be infinity, ex: cost_matrix = [[0.01109, inf, inf], [inf, 0.03098, inf]]. Since the other values are set to inf, the linear_assignment task will always produce the obvious output.

Am I missing something? It would be great if you could shed some light on this. Thanks.

Zhongdao commented 3 years ago

Re 1-4: Yes, you are right. Re 5, 6: Here we use gating_threshold to filter out impossible matches, typically very far-away observations. In your example the persons are probably far away from each other, so the motion distance to every other detection is larger than the threshold and most matches are treated as impossible. The cost_matrix does not always look like this (with an obvious linear_assignment solution).
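
To make the point concrete, here is a minimal NumPy-only sketch of the gating and fusion step. It is not the code from matching.py: the helper name fuse_motion_sketch is made up, the Kalman filter call is replaced by precomputed gating distances, and lambda_ = 0.98 is an assumed weight. The threshold 9.4877 is the 0.95 chi-square quantile for 4 degrees of freedom, assumed here as the gate for a 4-dimensional measurement. The appearance costs and the first row of gating distances are the numbers quoted in the question; the second row and the whole "near" matrix are invented for illustration.

```python
import numpy as np


def fuse_motion_sketch(cost_matrix, gating_distances, gating_threshold, lambda_):
    """Gate an appearance cost matrix by motion distance, then blend the two.

    Hypothetical helper: the real fuse_motion computes gating_distances
    per track with the Kalman filter; here they are passed in directly.
    """
    cost = np.array(cost_matrix, dtype=float)         # copy, shape (tracks, detections)
    gate = np.asarray(gating_distances, dtype=float)  # same shape as cost

    # Pairs whose motion distance exceeds the gate are ruled out entirely.
    cost[gate > gating_threshold] = np.inf

    # Remaining pairs keep a finite cost: mostly appearance, a little motion.
    # Entries already set to inf stay inf after the weighted sum.
    return lambda_ * cost + (1.0 - lambda_) * gate


# Appearance costs from the question: 2 tracks vs. 3 detections.
appearance = [[0.01109, 0.41579, 1.10747],
              [0.42742, 0.03098, 0.84188]]

GATE = 9.4877   # chi-square 0.95 quantile, 4 degrees of freedom (assumed gate)
LAMBDA = 0.98   # assumed weighting between appearance and motion

# Case 1: people far apart. Row 0 is from the question, row 1 is invented.
far_gating = [[0.00027311, 58.948, 207.32],
              [61.2, 0.0004, 150.0]]
far = fuse_motion_sketch(appearance, far_gating, GATE, LAMBDA)

# Case 2: people close together (invented values, all inside the gate).
near_gating = [[0.5, 3.0, 8.0],
               [2.5, 0.4, 6.0]]
near = fuse_motion_sketch(appearance, near_gating, GATE, LAMBDA)

print(far)
print(near)
```

In the first case only one entry per row survives the gate, so the assignment is trivial, which is what you observed. In the second case no entry is set to inf, and the embedding distance still decides among the candidates that the motion model considers plausible.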

gouttham commented 3 years ago

Thanks for the reply.