Closed thehummingbird closed 1 year ago
Hi @thehummingbird, I'm happy to hear that you're using Norfair!
Norfair was designed to track objects using detections as a series of (x, y) coordinates (centroids, bounding boxes, pose estimation keypoints, etc.). To achieve this, Norfair compares the position of tracked objects to new detections appearing in the following frames and attempts to match them. This can be done by using a distance_function such as "euclidean" or "iou". In this common scenario, visual information is not used in Norfair.
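To make the position-only idea concrete, here is a minimal sketch of matching by Euclidean distance alone. It is not Norfair's actual implementation: the function name match_by_position, the greedy pairing strategy, and the sample coordinates are all assumptions made for illustration. The point is only that centroids and a distance threshold are enough to associate objects across frames, with no image data involved.

```python
import math


def match_by_position(tracked, detections, distance_threshold):
    """Greedily pair tracked objects with new detections by Euclidean distance.

    tracked: dict mapping a track id to its last known (x, y) position.
    detections: list of (x, y) centroids from the current frame.
    Returns a dict mapping track id -> index of the matched detection.
    """
    # Collect every (distance, track, detection) pair that is close enough.
    candidates = []
    for track_id, (tx, ty) in tracked.items():
        for det_index, (dx, dy) in enumerate(detections):
            dist = math.hypot(tx - dx, ty - dy)
            if dist <= distance_threshold:
                candidates.append((dist, track_id, det_index))

    # Accept the closest pairs first; each track and detection is used once.
    candidates.sort()
    matches = {}
    used_detections = set()
    for dist, track_id, det_index in candidates:
        if track_id in matches or det_index in used_detections:
            continue
        matches[track_id] = det_index
        used_detections.add(det_index)
    return matches


# Hypothetical data: two known tracks and three centroids from a detector.
tracked = {1: (100.0, 100.0), 2: (300.0, 120.0)}
detections = [(305.0, 118.0), (103.0, 104.0), (600.0, 400.0)]
matches = match_by_position(tracked, detections, distance_threshold=30.0)
print(matches)
```

The third centroid is farther than the threshold from every track, so it stays unmatched; a tracker would typically treat it as a candidate for a new object.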
Besides the predefined distances, the user can also define a custom distance_function, where a more complex comparison can be engineered. In this custom function, the user may leverage other types of data, such as visual embeddings. This may be useful in scenarios with occlusions, where positional information may not be enough.
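As a rough sketch of what such a custom function could look like, the example below adds an appearance term to a positional distance. The attribute names it reads (points, embedding, estimate, last_detection) follow my understanding of Norfair's Detection and TrackedObject classes and should be checked against the documentation for your version. The appearance_weight parameter, the cosine_distance helper, and the SimpleNamespace stand-ins are hypothetical and exist only so the snippet runs on its own.

```python
import math
from types import SimpleNamespace


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two vectors (1.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def embedding_aware_distance(detection, tracked_object, appearance_weight=100.0):
    """Positional distance plus a weighted appearance penalty.

    appearance_weight is a hypothetical tuning knob that converts the
    cosine distance (0..2) into the same scale as pixel distances.
    """
    # Positional term: mean Euclidean distance between corresponding points.
    pairs = list(zip(detection.points, tracked_object.estimate))
    positional = sum(
        math.hypot(px - ex, py - ey) for (px, py), (ex, ey) in pairs
    ) / len(pairs)

    # Fall back to position only when no embedding is available.
    last = tracked_object.last_detection
    if detection.embedding is None or last is None or last.embedding is None:
        return positional

    appearance = cosine_distance(detection.embedding, last.embedding)
    return positional + appearance_weight * appearance


# Stand-in objects (not real Norfair classes) to exercise the function.
previous = SimpleNamespace(points=[(100.0, 100.0)], embedding=[1.0, 0.0])
track = SimpleNamespace(estimate=[(100.0, 100.0)], last_detection=previous)
same_look = SimpleNamespace(points=[(103.0, 104.0)], embedding=[1.0, 0.0])
different_look = SimpleNamespace(points=[(103.0, 104.0)], embedding=[0.0, 1.0])

d_same = embedding_aware_distance(same_look, track)
d_diff = embedding_aware_distance(different_look, track)
print(d_same, d_diff)
```

Both candidate detections sit at the same position, so only the embedding separates them. That is the situation described above, where position alone may not be enough after an occlusion.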
Lastly, we've also developed an initial version of the ReID algorithm, where an additional step is introduced to handle heavy occlusions. This is also where embeddings may come in handy.
Great. Thank you for the clarification!
From the README, I get that Norfair also uses deep embeddings. But in my use case, I provide only centroids from my YOLO detector, and Norfair does a good job. While my system works, I'm confused about how Norfair works if I don't provide the image data at all. Could you please clarify?