mikel-brostrom / boxmot

BoxMOT: pluggable SOTA tracking modules for segmentation, object detection and pose estimation models
GNU Affero General Public License v3.0

How to fix jitter when reading an https YouTube stream #579

Closed HerneSong closed 2 years ago

HerneSong commented 2 years ago

Question

Hello, I have some issues tracking a live stream from a YouTube URL. The video is 1920x1080 at 30 fps. I ran python track.py --source 'myURL' --save-vid with yolov5x.pt detection weights and osnet_x0_25_msmt17.pt StrongSORT weights.

During inference, YOLO takes 0.022 s to 0.023 s per frame and StrongSORT takes 0.030 s to 0.066 s, depending on the density of objects in the frame, I guess. The total processing time per frame is therefore about 0.05 s to 0.09 s, so tracking runs at roughly 10 to 20 fps. The saved video is extremely laggy: frames freeze and then move again, instead of playing continuously frame by frame like the original stream. The saved video's fps is correct (30 fps), but the frames freeze for seconds and update late.
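As a rough check of the numbers above, the achievable tracking rate can be estimated from the per-frame stage times. This is only a sketch using the timings quoted in this comment; the function name is illustrative and assumes the detector and tracker run one after the other on each frame.

```python
def fps_from_stage_times(*stage_seconds: float) -> float:
    """Frames per second when the stages run sequentially on each frame."""
    return 1.0 / sum(stage_seconds)


# Timings quoted above: YOLO 0.022-0.023 s, StrongSORT 0.030-0.066 s.
best_case = fps_from_stage_times(0.022, 0.030)   # fastest detector + tracker pass
worst_case = fps_from_stage_times(0.023, 0.066)  # slowest detector + tracker pass
stream_budget = 1.0 / 30                         # seconds available per frame at 30 fps

print(f"best {best_case:.1f} fps, worst {worst_case:.1f} fps, "
      f"budget {stream_budget * 1000:.1f} ms per frame")
```

Both cases exceed the per-frame budget of a 30 fps stream, which is consistent with the 10 to 20 fps estimate.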

https://user-images.githubusercontent.com/94005149/198721083-4bae20a2-8518-4767-8f11-8d27eb345d0b.mp4

I am running the program on an AWS EC2 server. Here are my settings: [image]

Here is my GPU usage while running the program: [image]

And CPU usage: [image]

My question is: what is the bottleneck when tracking this video stream? I am guessing my CPU is overwhelmed by StrongSORT. How do I fix this?

An additional question: is there a way to stream the tracking video to a URL instead of a local window, so that I could display the tracking stream on a different machine?

Thanks, Mikel, for your wonderful work. Any help would be appreciated!

mikel-brostrom commented 2 years ago

> Thanks, Mikel, for your wonderful work.

:smile:

Have you tried to change the tracking method used?

python track.py --tracking-method ocsort --source 'myURL' --save-vid

OCSORT is a pure CPU tracking method, not as good as StrongSORT but probably good enough for your needs. Could you report back with the generated video?

HerneSong commented 2 years ago

Thanks, Mikel! I just tried the new tracking method as you suggested. The time used by the tracker has indeed decreased significantly, so I believe it should now keep up with the stream fps. However, both the video shown in the window and the saved video are still laggy. The frame seems to update only every couple of seconds.

Here is the new output video. https://user-images.githubusercontent.com/94005149/198907820-88b9f485-660e-4e69-92af-cf5b697de083.mp4

Maybe there is some issue with the video writer?

Again thanks for your suggestion.

mikel-brostrom commented 2 years ago

My hypothesis, then, is that the jitter is most likely due to network limitations and occurs when frame packets are dropped. When a set of frame packets is dropped, the stream stops and then restarts when the next packet is received. I suspect this because the issue does not occur with an .mp4 or any other local video source.

Check out this suggested solution from Stack Overflow (https://stackoverflow.com/questions/55099413/python-opencv-streaming-from-camera-multithreading-timestamps/55131226):

> Using threading to handle I/O heavy operations (such as reading frames from a webcam) is a classic programming model. Since accessing the webcam/camera using cv2.VideoCapture().read() is a blocking operation, our main program is stalled until the frame is read from the camera device and returned to our script. Essentially the idea is to spawn another thread to handle grabbing the frames in parallel instead of relying on a single thread (our 'main' thread) to grab the frames in sequential order. This will allow frames to be continuously read from the I/O thread, while our root thread processes the current frame. Once the root thread finishes processing its frame, it simply needs to grab the current frame from the I/O thread without having to wait for blocking I/O operations.
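The threading idea described in that answer could be sketched roughly as follows. This is not the code from the linked answer; the class name `LatestFrameReader` and its methods are hypothetical, and it assumes the capture object exposes a `read()` method returning `(ok, frame)`, as `cv2.VideoCapture` does.

```python
import threading
import time


class LatestFrameReader:
    """Keep only the most recent frame from a blocking capture object.

    `cap` is anything with a `read()` method returning `(ok, frame)`,
    for example a cv2.VideoCapture. Name and API are illustrative.
    """

    def __init__(self, cap):
        self._cap = cap
        self._lock = threading.Lock()
        self._frame = None
        self._running = True
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        # Blocking reads happen here, off the main thread.
        while self._running:
            ok, frame = self._cap.read()
            if not ok:
                time.sleep(0.01)  # back off briefly if the source stalls
                continue
            with self._lock:
                self._frame = frame  # overwrite: older frames are dropped

    def latest(self):
        """Return the newest frame seen so far, or None if none has arrived."""
        with self._lock:
            return self._frame

    def stop(self):
        self._running = False
        self._thread.join(timeout=1.0)
```

The main loop would then call `reader.latest()` whenever it is ready for the next frame, so a slow detector or tracker skips stale frames instead of falling further behind the stream.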

mikel-brostrom commented 2 years ago

Jeffrey Jex's answer here is good @HerneSong : https://stackoverflow.com/questions/43032163/how-to-read-youtube-live-stream-using-opencv-python

HerneSong commented 2 years ago

> Jeffrey Jex's answer here is good @HerneSong : https://stackoverflow.com/questions/43032163/how-to-read-youtube-live-stream-using-opencv-python

Hello, I just tried this. From my understanding, this method saves a temporary clip from the stream before detecting and tracking. Essentially, it is the same as running track.py on a local file. But I want to run tracking continuously on a server, for tasks like counting people and cars. Is there a better solution for this?

I also ran a speed test on my machine. The speed should be fast enough to load a stream.

Thanks.

mikel-brostrom commented 2 years ago

I tried to get a stable stream myself yesterday using pafy and cv2, but the results were really poor, with a lot of jitter, like the ones you posted. The answer at least saved a stream with stable FPS and no jitter, which makes me think that pafy is not the way to go. Follow my stackoverflow question regarding this here

HerneSong commented 2 years ago

Really appreciate your efforts, Mikel. Yes, as you said, pafy might not be the right way to go. However, I just found that one way to solve this is to make a small change to the original yolov5 LoadStreams class, in the update method:

```python
def update(self, i, cap, stream):
    # Read stream i frames in daemon thread
    n, f, read = 0, self.frames[i], 1  # frame number, frame array, inference every 'read' frame
    while cap.isOpened() and n < f:
        n += 1
        # _, self.imgs[index] = cap.read()
        cap.grab()
        if n % read == 0:
            success, im = cap.retrieve()
            if success:
                self.imgs[i] = im
                time.sleep(1 / 30)  # 1/fps of your input stream; this is the line I added
            else:
                LOGGER.warning('WARNING: Video stream unresponsive, please check your IP camera connection.')
                self.imgs[i] = np.zeros_like(self.imgs[i])
                cap.open(stream)  # re-open stream if signal was lost
        time.sleep(0.0)  # wait time
```

After sleeping for one frame interval (1/fps seconds) on every read, the jitter problem suddenly disappeared.

Now the biggest issue for me is that my CPU is the bottleneck: it cannot process frames as fast as the stream fps. Also, if your tracking fps (10 to 15 fps) is lower than the output fps (for instance, 30 fps), the saved video plays back about 2x faster.

I believe the only way to solve this is either to sacrifice tracking performance by using a more naive tracking method like OCSORT, or to upgrade my CPU, because cv2.VideoWriter() won't dynamically change the fps to keep the same timestamps as the original stream, right?
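The speed-up in the saved video follows from simple arithmetic: frames that were picked up about 1/tracking-fps seconds apart are played back 1/writer-fps seconds apart. A minimal sketch, with an illustrative function name:

```python
def playback_speedup(writer_fps: float, processed_fps: float) -> float:
    """How much faster than real time a saved video plays.

    Frames are written at `writer_fps` but were taken from the stream about
    `1 / processed_fps` seconds apart, so playback compresses time by this ratio.
    """
    return writer_fps / processed_fps


print(playback_speedup(30, 15))  # 2.0
print(playback_speedup(30, 10))  # 3.0
```

Under this assumption, writing the file with an fps close to the rate at which frames are actually processed should bring the ratio back to about 1.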

HerneSong commented 2 years ago

Sorry about the format. The line I added is highlighted in the editor. image

mikel-brostrom commented 2 years ago

> After sleeping for one frame interval (1/fps seconds) on every read, the jitter problem suddenly disappeared.

Sometimes the simplest solutions are the best solutions :smile:

> cv2.VideoWriter() won't dynamically change the fps to keep the same timestamps as the original stream, right?

Nope, you would have to pass the loop time to the update function above

> I believe the only way to solve this is either to sacrifice tracking performance by using a more naive tracking method like OCSORT, or to upgrade my CPU

You could also skip every 2nd frame and set time.sleep(1/15)

mikel-brostrom commented 2 years ago

Then, in that same function I guess you could change

n, f, read = 0, self.frames[i], 1

to the following (or whatever value matches the maximum FPS that your CPU can handle):

inference_every_nth_frame = 2
n, f, read = 0, self.frames[i], inference_every_nth_frame

and then change

time.sleep(1/30)

to

time.sleep(1/(30/inference_every_nth_frame))

Let me know if this works for you!
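Put together, the frame skipping and pacing suggested above could look roughly like the generator below. It is a standalone sketch, not the yolov5 or boxmot code: `paced_frames` and its parameters are hypothetical, and it only assumes a capture object with `grab()` and `retrieve()` methods like `cv2.VideoCapture`.

```python
import time


def paced_frames(cap, stream_fps=30.0, every_nth=2, max_frames=None, sleep=time.sleep):
    """Yield every `every_nth` frame from `cap`, pausing to match the stream rate.

    `cap` needs `grab()` and `retrieve()` like cv2.VideoCapture. The function
    name and parameters are illustrative, not part of yolov5 or boxmot.
    """
    n = 0
    while max_frames is None or n < max_frames:
        n += 1
        if not cap.grab():  # advance the stream without decoding
            break
        if n % every_nth == 0:
            ok, frame = cap.retrieve()  # decode only the frames we keep
            if ok:
                yield frame
                # One kept frame spans `every_nth` stream frames in time.
                sleep(every_nth / stream_fps)
```

With `stream_fps=30` and `every_nth=2`, the pause is 2/30 s, which equals the `time.sleep(1/(30/inference_every_nth_frame))` value above, so the tracker sees about 15 frames per second.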

HerneSong commented 2 years ago

Yes. Now I have a stable stream with tracking 😀! And with a proper fps reduction ratio that fits my CPU, I can save correct tracking videos. Good to see that we can make it work.

mikel-brostrom commented 2 years ago

Could you provide the code changes you made @HerneSong so that this can be added to the repo? :smile: Feel free to submit a PR

ccfarah commented 2 years ago

Thanks @HerneSong and @mikel-brostrom for persevering. I was just about to report the same issue, then I found this. :)