tryolabs / norfair

Lightweight Python library for adding real-time multi-object tracking to any detector.
https://tryolabs.github.io/norfair/
BSD 3-Clause "New" or "Revised" License

Ask a question about object tracking #313

Closed andy940255 closed 3 months ago

andy940255 commented 3 months ago

Hello developers, I am new to the field of object detection. Due to time constraints, please allow me to ask a few questions before fully understanding this project. Thank you.

I want to implement moving-object detection with a moving camera mounted on a vehicle: detect moving objects near the vehicle and issue a warning. With that requirement in mind, I would like to ask about this project's tracking functionality.

Is there a way to know the moving speed and depth (distance from the camera) of an object?

andy940255 commented 3 months ago

I want to do something like this video: https://www.youtube.com/shorts/eksFR_fZXNg

aguscas commented 3 months ago

Hello @andy940255 !

Regarding ways to estimate the distance to the camera (depth): that is not something Norfair should be handling. Instead, your models should return the depth information, and there are several approaches for that. If you are working with a single camera, you might find monocular depth estimation models useful; we have a demo about that. There are also ways of measuring depth with stereo pairs of cameras (the formula in this site returns the distance from the real-life object to the midpoint between the two cameras), which I believe should be more accurate than single-camera models. You can then use that distance to build 3D coordinates that reflect the positions of the objects in the real world (instead of the 2D coordinates of the objects in the image plane), and track those 3D objects with Norfair, as was done in our 3D tracking demo.
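To make the stereo idea above concrete, here is a minimal sketch of the standard rectified-stereo depth formula (Z = f · B / d) and of back-projecting a 2D image point into 3D camera coordinates. The function names and the example numbers are my own illustration, not Norfair or any specific model's API; a real pipeline would get the disparity from a stereo-matching model and the intrinsics from camera calibration.

```python
import numpy as np

def stereo_depth(disparity_px, focal_px, baseline_m):
    """Depth Z from a rectified stereo pair: Z = f * B / d.

    disparity_px: horizontal pixel shift of the object between the two views.
    focal_px:     focal length expressed in pixels.
    baseline_m:   distance between the two camera centers, in meters.
    """
    return focal_px * baseline_m / disparity_px

def backproject(u, v, depth, focal_px, cx, cy):
    """Turn a 2D image point (u, v) plus its depth into 3D camera coordinates,
    using the pinhole model with principal point (cx, cy)."""
    x = (u - cx) * depth / focal_px
    y = (v - cy) * depth / focal_px
    return np.array([x, y, depth])

# A point seen 20 px apart by two cameras 0.12 m apart, with f = 800 px,
# sits 4.8 m away.
z = stereo_depth(disparity_px=20, focal_px=800, baseline_m=0.12)
point_3d = backproject(u=400, v=300, depth=z, focal_px=800, cx=320, cy=240)
```

Points like `point_3d` are the 3D coordinates one would then feed to the tracker, as in the 3D tracking demo.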

Now, regarding velocity estimation: Norfair can actually return the velocity vector of the objects with the `TrackedObject.estimate_velocity` method, but that velocity is relative to the camera (i.e. `relative_velocity = absolute_velocity - camera_velocity`), so the `camera_velocity` would have to be added back to get the actual velocity. If you don't have that information (the `camera_velocity`), you could try estimating it by computing the relative velocity of many arbitrary points: most pixels in a video correspond to things with no intrinsic movement (buildings, trees, parked cars, ...), so their relative velocity should be the opposite of the `camera_velocity` (since their `absolute_velocity` is zero).
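The reasoning above can be sketched in a few lines. This is a hypothetical helper of my own (not a Norfair API): taking the median of the relative velocities of many sampled points is robust to the minority of points that genuinely move, and its negation approximates the camera velocity, which is then added back to recover absolute velocities.

```python
import numpy as np

def estimate_camera_velocity(relative_velocities):
    """Estimate the camera velocity from the relative velocities of many
    arbitrary tracked points. Most points belong to static background
    (buildings, trees, parked cars), so the median relative velocity is
    roughly -camera_velocity; the median ignores the few true movers."""
    v = np.asarray(relative_velocities, dtype=float)
    return -np.median(v, axis=0)

def absolute_velocity(relative_velocity, camera_velocity):
    """absolute_velocity = relative_velocity + camera_velocity."""
    return np.asarray(relative_velocity, dtype=float) + camera_velocity

# Background points all drift by about (-3, 0) px/frame, so the camera is
# moving at roughly (+3, 0); the last point is a real mover (an outlier).
rel = [(-3.1, 0.0), (-2.9, 0.1), (-3.0, -0.1), (10.0, 5.0)]
cam = estimate_camera_velocity(rel)
mover_abs = absolute_velocity(rel[-1], cam)
```

In practice `rel` would come from the `estimate_velocity` of many tracked points (or from optical flow), and objects whose `mover_abs` is far from zero are the ones worth warning about.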

We have done similar things in 2D tracking to compensate for the movement of the camera in our camera_motion demo, by using the optical flow of arbitrarily sampled pixels in the image. It might be interesting to have this feature for 3D tracking as well, so we might work on it in the future.
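As a rough illustration of that compensation idea (my own simplified sketch, not the camera_motion demo's implementation, which uses real optical flow and Norfair's motion estimation): if you take the median displacement of arbitrarily sampled background pixels as the global image shift caused by the camera, subtracting it maps detections into an approximately static reference frame. This assumes a pure-translation camera motion; the actual demo handles more general transformations.

```python
import numpy as np

def compensate_camera_motion(points, flows):
    """Subtract the camera-induced image shift from detection positions.

    points: (N, 2) positions of detections in the current frame.
    flows:  (M, 2) optical-flow displacements of arbitrarily sampled
            background pixels between the previous and current frame.
    The median flow approximates the global shift caused by the camera,
    since most sampled pixels belong to static background.
    """
    camera_shift = np.median(np.asarray(flows, dtype=float), axis=0)
    return np.asarray(points, dtype=float) - camera_shift

# The whole scene shifted ~5 px to the right, so a detection at (105, 40)
# maps back to roughly (100, 40) in the stabilized reference frame.
stabilized = compensate_camera_motion(
    [(105.0, 40.0)],
    [(5.0, 0.1), (4.9, 0.0), (5.1, -0.1)],
)
```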

andy940255 commented 3 months ago

Hello @aguscas:

Thank you very much for your prompt response and for the introduction and information you provided, which have been very helpful to me.

Currently, I plan to use a single-lens setup, focusing on 2D imagery. I will proceed according to the directions you've given, and I will provide feedback or address any issues that arise during the process.

Once again, thank you for your assistance. I will close this issue now. Thank you.