tryolabs / norfair

Lightweight Python library for adding real-time multi-object tracking to any detector.
https://tryolabs.github.io/norfair/
BSD 3-Clause "New" or "Revised" License
2.41k stars 247 forks

Does Norfair need image data for deep association (like DeepSORT)? #261

Closed: thehummingbird closed this issue 1 year ago

thehummingbird commented 1 year ago

From the README, I gather that Norfair also uses deep embeddings. But in my use case, I provide only centroids from my YOLO detector, and Norfair does a good job. While my system works, I'm confused about how Norfair works if I don't provide any image data at all. Can you clarify, please?

facundo-lezama commented 1 year ago

Hi @thehummingbird, I'm happy to hear that you're using Norfair!

Norfair was designed to track objects using detections expressed as a series of (x, y) coordinates (centroids, bounding boxes, pose estimation keypoints, etc.). To achieve this, Norfair compares the positions of tracked objects with the new detections appearing in the following frames and attempts to match them. This can be done with a predefined distance_function such as "euclidean" or "iou". In this common scenario, Norfair does not use visual information.
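
To make the position-only idea concrete, here is a minimal standard-library sketch of matching tracked centroids to new detections by Euclidean distance. It is a simplified illustration, not Norfair's actual implementation: the function name `match_centroids`, the greedy closest-pair-first strategy, and the sample coordinates are all assumptions made for this example.

```python
import math


def match_centroids(tracked, detections, distance_threshold):
    """Greedily pair tracked centroids with new detections, closest pairs first.

    tracked: dict mapping a track id to its last known (x, y) position.
    detections: list of (x, y) centroids from the current frame.
    Returns a dict mapping track id to the index of the matched detection.
    """
    # Every (distance, track id, detection index) candidate, closest first.
    candidates = sorted(
        (math.dist(position, detection), track_id, det_idx)
        for track_id, position in tracked.items()
        for det_idx, detection in enumerate(detections)
    )
    matches = {}
    used_detections = set()
    for distance, track_id, det_idx in candidates:
        if distance > distance_threshold:
            break  # all remaining candidates are even farther apart
        if track_id in matches or det_idx in used_detections:
            continue  # each track and each detection is matched at most once
        matches[track_id] = det_idx
        used_detections.add(det_idx)
    return matches


# Hypothetical data: two tracked objects and three detections in the next frame.
tracked = {1: (100.0, 100.0), 2: (300.0, 120.0)}
detections = [(305.0, 118.0), (102.0, 104.0), (600.0, 400.0)]
matches = match_centroids(tracked, detections, distance_threshold=30.0)
print(matches)
```

No pixels are involved at any point: only coordinates are compared. The third detection is farther than the threshold from both tracks, so it stays unmatched and would be a candidate for a new track.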

Besides the predefined distances, the user can also define a custom distance_function, where a more complex comparison can be engineered. In this custom function, the user may leverage other types of data, such as visual embeddings. This may be useful in scenarios with occlusions, where positional information may not be enough.
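
A custom distance along these lines might blend a positional term with an appearance term. The sketch below is hedged throughout: it assumes the function receives a detection and a tracked object, and that these expose `points`, `embedding`, `estimate`, and `last_detection` attributes; the `weight` and `scale` parameters and the `SimpleNamespace` stand-ins are hypothetical and exist only so the example runs without Norfair installed.

```python
import math
from types import SimpleNamespace


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two vectors (1.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def position_and_embedding_distance(detection, tracked_object, weight=0.5, scale=100.0):
    """Blend a normalized positional distance with an embedding distance.

    weight and scale are hypothetical tuning knobs: weight is the share given
    to appearance, scale is the pixel distance that counts as 1.0.
    """
    pairs = list(zip(detection.points, tracked_object.estimate))
    positional = sum(math.dist(p, q) for p, q in pairs) / len(pairs) / scale
    previous = tracked_object.last_detection.embedding
    if detection.embedding is None or previous is None:
        return positional  # fall back to position when no embedding is available
    appearance = cosine_distance(detection.embedding, previous)
    return (1 - weight) * positional + weight * appearance


# Stand-in objects (assumptions) that mimic only the attributes used above.
tracked_object = SimpleNamespace(
    estimate=[(0.0, 0.0)],
    last_detection=SimpleNamespace(embedding=[1.0, 0.0]),
)
similar = SimpleNamespace(points=[(10.0, 0.0)], embedding=[1.0, 0.0])
different = SimpleNamespace(points=[(10.0, 0.0)], embedding=[0.0, 1.0])

same_look = position_and_embedding_distance(similar, tracked_object)
other_look = position_and_embedding_distance(different, tracked_object)
print(same_look, other_look)
```

Both detections sit at the same position, yet the one whose embedding differs from the tracked object's gets a larger distance. That is the extra signal that can help when positional information alone is ambiguous, for instance after an occlusion. A function of this shape would be passed as the tracker's distance_function, with a distance threshold chosen to match the blended scale.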

Lastly, we've also developed an initial version of the ReID algorithm, where an additional step is introduced to handle heavy occlusions. This is also where embeddings may come in handy.

thehummingbird commented 1 year ago

Great. Thank you for the clarification!