GeekAlexis / FastMOT

High-performance multiple object tracking based on YOLO, Deep SORT, and KLT 🚀

camera motion estimation failed #59

Closed aditdoshi333 closed 3 years ago

aditdoshi333 commented 3 years ago

Hello, I am using a stereo setup for object detection and running FastMOT on a single stream at a time. When I run the right camera stream, I can see the bounding boxes and everything works fine.

But when I use the left stream, I get LOGGER.warning('Camera motion estimation failed'). The warning comes from here: https://github.com/GeekAlexis/FastMOT/blob/2851316a2d8c87923bd9a2e58c55e24676558288/fastmot/flow.py#L130

I tried reducing inlier_thresh to 1, but the result is the same. The backgrounds in the right and left streams are almost identical.

Thank you

GeekAlexis commented 3 years ago

Hi, two things you can try:

- Lower `bg_feat_thresh` so that more background feature points are detected.
- Increase `bg_feat_scale_factor` so the frame is downscaled less before feature detection.
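For reference, a minimal sketch of adjusting these two parameters, assuming they live under the flow section of the JSON config (the config path, section layout, and values here are illustrative assumptions, not the repo's documented defaults):

```python
# Illustrative tweak of the flow parameters named above; the path
# 'cfg/mot.json' and the section layout are assumptions about this version.
import json

with open('cfg/mot.json') as f:
    config = json.load(f)

flow_cfg = config['mot']['flow']               # assumed location of the flow section
flow_cfg['bg_feat_thresh'] = 5                 # lower detector threshold -> more keypoints
flow_cfg['bg_feat_scale_factor'] = [0.5, 0.5]  # less downscaling -> more detail preserved

with open('cfg/mot.json', 'w') as f:
    json.dump(config, f, indent=4)
```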

aditdoshi333 commented 3 years ago

Hmm, it is not working. I tried reducing bg_feat_thresh to 1 and setting bg_feat_scale_factor to (1, 1).

What kind of background is expected?

GeekAlexis commented 3 years ago

The background needs to have an abundance of corner points. Make sure your camera is in focus and the lighting is good. If the background is blurred out or covered by a large object, no feature points can be detected. As I said in #52, if you are shooting a video of yourself up close, it is not going to work. FYI, that error doesn't normally appear on regular videos; it is usually the camera setup or image noise that causes the issue.
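As a quick sanity check (not part of FastMOT), you can count how many corners a frame actually yields. The FAST detector mirrors what the background feature step appears to use, but treat the threshold and filename as example assumptions:

```python
# Standalone diagnostic: does the left camera's background offer enough
# corner points for motion estimation? Filename and threshold are examples.
import cv2

frame = cv2.imread('left_frame.png')                 # one frame from the left stream
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

fast = cv2.FastFeatureDetector_create(threshold=10)  # roughly what bg_feat_thresh controls
keypoints = fast.detect(gray, None)
print(f'{len(keypoints)} FAST keypoints detected')   # a very low count explains the warning
```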

aditdoshi333 commented 3 years ago

Okay, thanks for the help. I will try with a different background.

Tetsujinfr commented 3 years ago

Based on my tests, noise and poor lighting are definitely a challenge here (even with good focus). I am not familiar with this implementation, but is it Deep SORT that is so sensitive to noise? I ask because YOLOv4 is quite robust in that respect, I think.

GeekAlexis commented 3 years ago

@Tetsujinfr Deep SORT in its original form is robust to noise. I'm talking about feature point detection for optical flow, which is used for camera motion compensation.
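To make the distinction concrete, here is a minimal sketch of that compensation step, assuming OpenCV's pyramidal Lucas-Kanade plus a RANSAC affine fit; it illustrates the idea described above, not FastMOT's exact code:

```python
# Sketch of camera motion compensation: track background keypoints between
# frames, fit a partial affine transform, and warp previous track positions.
import cv2
import numpy as np

def compensate(prev_gray, cur_gray, bg_pts, track_pts, inlier_thresh=5):
    # bg_pts: Nx1x2 float32 background keypoints from the previous frame
    cur_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray, bg_pts, None)
    good_prev = bg_pts[status.ravel() == 1]
    good_cur = cur_pts[status.ravel() == 1]

    H, inlier_mask = cv2.estimateAffinePartial2D(good_prev, good_cur, method=cv2.RANSAC)
    if H is None or np.count_nonzero(inlier_mask) < inlier_thresh:
        return None  # too few features/inliers -> 'Camera motion estimation failed'

    # warp previous track positions (Mx1x2 float32) into the current frame
    return cv2.transform(track_pts, H)
```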

Tetsujinfr commented 3 years ago

OK, got you. I do observe quite a few ID instabilities even when the camera is fixed, but I guess it is a challenging topic. I just hoped Deep SORT would do a better job at predicting target positions and keeping track of them.

GeekAlexis commented 3 years ago

There will be ID switches regardless of camera movement. The performance of Deep SORT also depends heavily on the quality of the detector and the feature extractor.
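To see why the feature extractor matters so much: Deep SORT's appearance association is essentially a cosine distance between track and detection embeddings solved as an assignment problem, so a weak ReID model translates directly into ID switches. A toy sketch of that matching step (the gate value is illustrative):

```python
# Toy appearance association in the spirit of Deep SORT: embeddings are
# assumed L2-normalized; pairs above the cosine-distance gate are rejected.
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(track_embs, det_embs, max_cos_dist=0.2):
    cost = 1.0 - track_embs @ det_embs.T        # rows: tracks, cols: detections
    rows, cols = linear_sum_assignment(cost)    # Hungarian algorithm
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_cos_dist]
```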

Tetsujinfr commented 3 years ago

In this repo you are using YOLOv4, right? Which network resolution is used? Is it the regular net or the tiny version?

GeekAlexis commented 3 years ago

The regular YOLOv4 at 512x512 resolution, but the repo allows you to use your own. I can't guarantee the network was trained perfectly; it serves as a demo.

Tetsujinfr commented 3 years ago

I assume you use the same TRT-generated weights from the tensorrt_demos repo by @jkjung-avt, right?

Can you point me to where in the codebase the appearance descriptor is extracted before being fed to Deep SORT?

Btw, your repo is great; I am probably just too demanding. I also have depth info available, so I would love to use it to improve the tracking further, but that is on me, of course.

GeekAlexis commented 3 years ago

Yes to your first question.

https://github.com/GeekAlexis/FastMOT/blob/2851316a2d8c87923bd9a2e58c55e24676558288/fastmot/mot.py#L103-L105

I would imagine that with depth, you can use a 3D Kalman filter to keep track of the z coordinate of the bounding box center. Tracking in 3D space is going to reduce ID switches by a lot.
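A minimal sketch of that idea with filterpy, assuming a constant-velocity model over the box center (x, y) plus stereo depth z; all noise values are placeholders to tune:

```python
# Constant-velocity 3D Kalman filter over the bounding box center.
# State: [x, y, z, vx, vy, vz]; measurement: (x, y) from detection, z from stereo.
import numpy as np
from filterpy.kalman import KalmanFilter

def make_cv3d_filter(dt=1.0):
    kf = KalmanFilter(dim_x=6, dim_z=3)
    kf.F = np.eye(6)
    kf.F[:3, 3:] = dt * np.eye(3)                    # position += velocity * dt
    kf.H = np.hstack([np.eye(3), np.zeros((3, 3))])  # we observe position only
    kf.R *= 1.0                                      # measurement noise (placeholder)
    kf.Q *= 0.01                                     # process noise (placeholder)
    return kf

kf = make_cv3d_filter()
kf.x[:3] = np.array([[320.0], [240.0], [5.0]])       # init: pixel center + 5 m depth
kf.predict()
kf.update(np.array([322.0, 241.0, 5.1]))             # next detection fused with depth
```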

Please open another issue if you have further questions.

kumurule commented 3 years ago

> I assume you use the same TRT-generated weights from the tensorrt_demos repo by @jkjung-avt, right?
>
> Can you point me to where in the codebase the appearance descriptor is extracted before being fed to Deep SORT?
>
> Btw, your repo is great; I am probably just too demanding. I also have depth info available, so I would love to use it to improve the tracking further, but that is on me, of course.

Were you able to include the depth information in tracking?