OpenFTC / EasyOpenCV

Finally, a straightforward and easy way to use OpenCV on an FTC robot!

What is the bottleneck resulting in FPS Limitations when streaming and is there a solution? #46

Closed frankm24 closed 1 year ago

frankm24 commented 2 years ago

Hi! I'd like to start off by saying I've loved using EasyOpenCV in FTC and I appreciate the work that you did to make the experience of creating computer vision pipelines on our robots so convenient and easy. This post is really more of a question, but I did not know of any better way to contact the developer of EOCV, so I am posting this as an issue.

I ran some benchmarks on one of my team's REV Control Hubs using a pipeline that simply returns the reference to the input frame, and these are the results:

Camera: Logitech C270

- 1280x720: 7.5 FPS (Theoretical max. 55)
- 640x360: 30 FPS (Theoretical max. 200)
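For reference, a pass-through pipeline of the kind described above is just a few lines in EOCV (a sketch using the standard `OpenCvPipeline` API; not my exact benchmark code):

```java
// Minimal pass-through pipeline: no per-frame processing, so the measured
// FPS reflects only capture and transport overhead.
class PassthroughPipeline extends OpenCvPipeline {
    @Override
    public Mat processFrame(Mat input) {
        return input; // return the input frame untouched
    }
}
```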

I was at first confused about this, but then looked through the EOCV examples and read this quote:

"Keep in mind that the SDK's UVC driver (what OpenCvWebcam uses under the hood) only supports streaming from the webcam in the uncompressed YUV image format. This means that the maximum resolution you can stream at and still get up to 30FPS is 480p (640x480). Streaming at e.g. 720p will limit you to up to 10FPS and so on and so forth."

Based on this quote and some further research, I believe that streaming in the uncompressed YUV format creates a non-CPU bottleneck that limits the pipeline's FPS. My knowledge of this subject is limited and my intuition about the Control Hub's processing power is not great, but I suspect a USB bandwidth limitation. I would appreciate it if you could confirm this or offer an alternative explanation.

If the camera stream is limited by communication bandwidth, I was wondering if you knew of any way to get the frames in a compressed format that would not cause such a bottleneck. I have been unable to find anything that would work on the Control Hub so far but I will keep searching.

I understand that finding a solution to this problem may be far too difficult and not worth the time investment, considering that in FTC cameras are typically used for tasks where high FPS and resolution are unnecessary (e.g., determining the randomized game scenario during the init phase from the location of an object on the field).

I just didn't want to accept that this is the best that can be done in terms of performance because I wanted to experiment more with computer vision in the context of FTC and beyond using our robot hardware. I wanted to see if I could do something more than just a simple pipeline that does color filtering and averaging over the binary activations in submats of the filtered image. I find vision-based perception to be one of the most interesting aspects of robotics. Thank you for your time.

Frank Bad News Bots #7584

Windwoes commented 2 years ago

Hi - glad to hear you've found EOCV useful :)

Yes, the limitation with webcam FPS is the 480mbps USB 2.0 bus combined with uncompressed streaming.
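The arithmetic behind that ceiling can be sketched quickly (assuming the 2-bytes-per-pixel YUY2 format the UVC driver streams in; the nominal 480 Mbps figure ignores protocol overhead, so real usable throughput is considerably lower, which matches the observed numbers):

```java
// Rough USB 2.0 bandwidth ceiling for uncompressed YUY2 webcam streaming.
// Assumes 2 bytes/pixel (YUY2) and the nominal 480 Mbps bus rate; actual
// isochronous throughput is lower due to protocol overhead.
public class UsbBandwidth {
    static double maxFps(int width, int height) {
        double bitsPerFrame = width * height * 2 * 8.0; // YUY2: 2 bytes/pixel
        double busBitsPerSecond = 480e6;                // USB 2.0 nominal rate
        return busBitsPerSecond / bitsPerFrame;
    }

    public static void main(String[] args) {
        System.out.printf("1280x720: %.1f FPS raw-bus ceiling%n", maxFps(1280, 720)); // ~32.6
        System.out.printf(" 640x480: %.1f FPS raw-bus ceiling%n", maxFps(640, 480));  // ~97.7
        System.out.printf(" 640x360: %.1f FPS raw-bus ceiling%n", maxFps(640, 360));  // ~130.2
    }
}
```

Even the idealized ceiling at 720p is only ~32 FPS; once USB overhead and the driver's isochronous budget are factored in, single-digit FPS at 720p is unsurprising.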

Internal phone cameras do not have that limitation, since they are connected through a high-speed internal interconnect. You can do 1080p30 on those without issue.

I experimented with adding MJPEG compressed-format streaming support to the UVC driver, but then the bottleneck shifted from bandwidth to CPU and the FPS wasn't really all that much better. I may have been compiling libjpeg-turbo incorrectly, without NEON SIMD enabled, though... I'm not completely sure. There may also be a hardware-accelerated way to do the decode.
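A quick estimate shows why MJPEG moves the bottleneck off the bus (the 15:1 compression ratio here is a hypothetical but typical figure for motion JPEG; actual ratios vary with scene content and quality settings):

```java
// Rough MJPEG bandwidth estimate, assuming a hypothetical 15:1 JPEG
// compression ratio relative to the raw YUY2 frame size.
public class MjpegEstimate {
    static double mbpsNeeded(int w, int h, int fps, double compressionRatio) {
        double rawBitsPerFrame = w * h * 2 * 8.0; // YUY2 baseline: 2 bytes/pixel
        return rawBitsPerFrame / compressionRatio * fps / 1e6;
    }

    public static void main(String[] args) {
        // 720p30 MJPEG at 15:1 needs ~29.5 Mbps -- far under USB 2.0's 480 Mbps,
        // which is why the limit shifts from the bus to JPEG decode on the CPU.
        System.out.printf("%.1f Mbps%n", mbpsNeeded(1280, 720, 30, 15.0));
    }
}
```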

One workaround I can think of would be to see if there are any USB 3.0 cameras you could use. Then you'd have 5 Gbps (IIRC) to work with, which should allow streaming uncompressed high-resolution video at full frame rate.

frankm24 commented 2 years ago

Thank you for the reply! According to this document, the Control Hub uses the Rockchip RK3328 SoC. The diagram in this article shows that the chip has a hardware video encoder and decoder. The decoder supports 4K at 60 FPS, since the chip was designed for TV systems, so in theory it should handle this workload well, perhaps even multiple low-res streams simultaneously. I am unsure how to actually take advantage of this hardware, though. I will keep researching and update if I make further progress.

NoahAndrews commented 2 years ago

The Features list in that document doesn't mention MJPEG support, though, so I don't think the built-in video decoders will help you there.

However, it appears that some older Logitech webcams do support sending the data as h.264, which the video decoder does support (though of course you'd have to figure out how to use it). https://www.logitech.com/en-us/video-collaboration/resources/think-tank/articles/article-logitech-and-h264-encoding.html

Windwoes commented 2 years ago

IIRC the older C920s support H.264 but the new ones don't. Getting hardware H.264 decoding to work would probably be an adventure to say the least 😂 But if you feel so inclined, ExtractedRC has full source code of the SDK, including the C++ UVC driver, so there should be nothing stopping you from experimenting to your heart's content.

NoahAndrews commented 2 years ago

> IIRC the older C920s support H.264 but the new ones don't. Getting hardware H.264 decoding to work would probably be an adventure to say the least 😂 But if you feel so inclined, ExtractedRC has full source code of the SDK, including the C++ UVC driver, so there should be nothing stopping you from experimenting to your heart's content.

Yeah, that's what that Logitech blog post said.

Some cursory research brought up rockchip's MPP library, which looks to be the intended way to use the decoder: https://github.com/rockchip-linux/mpp

The readme links to C++ example code for Linux and Java example code for Android.

NoahAndrews commented 2 years ago

MPP also looks to have MJPEG support, which would be much more broadly useful. However, that might just be for other supported chips (or maybe the RK3328 really does have an MJPEG decoder).

NoahAndrews commented 2 years ago

Here's a mailing list post referring to the RK3328 as having support for decoding (M)JPEG video: https://lwn.net/Articles/776082/

Windwoes commented 1 year ago

As of v1.7.0, MJPEG streaming for webcams is now supported - see readme entry. Note that it will require the upcoming FTC SDK v8.2

The current version of libjpeg-turbo, when properly ensured to be compiled with SIMD acceleration, is fast enough that it warrants inclusion.
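Opting in looks roughly like this (a sketch based on the readme entry; the `OpenCvWebcam.StreamFormat` parameter is the new part, the rest is standard EOCV camera setup):

```java
// Sketch: requesting MJPEG streaming (EasyOpenCV v1.7.0+, FTC SDK v8.2+).
OpenCvWebcam webcam = OpenCvCameraFactory.getInstance().createWebcam(
        hardwareMap.get(WebcamName.class, "Webcam 1"));

webcam.openCameraDeviceAsync(new OpenCvCamera.AsyncCameraOpenListener() {
    @Override
    public void onOpened() {
        // The extra StreamFormat parameter selects compressed MJPEG transport,
        // lifting the USB 2.0 bandwidth cap at the cost of CPU-side JPEG decode.
        webcam.startStreaming(1280, 720, OpenCvCameraRotation.UPRIGHT,
                OpenCvWebcam.StreamFormat.MJPEG);
    }

    @Override
    public void onError(int errorCode) { /* handle open failure */ }
});
```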