home-assistant / architecture

Repo to discuss Home Assistant architecture

Adding a get raw image function to camera component #417

Closed: rafale77 closed this issue 1 year ago

rafale77 commented 4 years ago

Context

Integrating Home Assistant with various image processing components led me to discover very high CPU load, which grows with the number of cameras and with the image processing frequency that HA needs.

I dug into the code and discovered that the frames coming out of the camera component are first encoded to JPEG and then decoded back into raw bytes or a numpy array for processing. The most common image handling tools in the AI/deep learning/machine learning field are the PIL/Pillow and OpenCV libraries. I understand that the purpose of the get image function is to display the frame on the frontend, but that generally happens only every 10 s. For image processing of multiple camera streams, and for an integration usable for triggering automations, the frame processing interval needs to be under 1 s. This puts a significant load on the system even with GPU assistance.
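For illustration, here is a minimal sketch of that round trip, assuming OpenCV is used on both the camera and the image processing side (this is illustrative only, not actual Home Assistant code):

```python
# Minimal sketch of the JPEG round trip described above; illustrative only,
# not actual Home Assistant code.
import cv2
import numpy as np


def current_pipeline(frame: np.ndarray) -> np.ndarray:
    # Camera side: encode the raw frame to JPEG bytes for the image API.
    ok, jpeg = cv2.imencode(".jpg", frame)
    if not ok:
        raise RuntimeError("JPEG encoding failed")
    # Image processing side: decode the JPEG back into a numpy array
    # before the detector can even look at it.
    return cv2.imdecode(jpeg, cv2.IMREAD_COLOR)


def proposed_pipeline(frame: np.ndarray) -> np.ndarray:
    # Hand the raw frame straight to the detector; no encode/decode at all.
    return frame
```

Running `current_pipeline` on every frame of every stream is pure overhead when the consumer only wants the numpy array back.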

See the thread I posted on the forum:

https://community.home-assistant.io/t/image-processing-efficiency/216157

Besides, the common use of FFmpeg appears to be very inefficient, since FFmpeg re-encodes the frames before handing them over, which roughly doubles CPU utilization.

https://community.home-assistant.io/t/re-object-detection-for-video-surveillance/213769

Related

Proposal

In this first PR: https://github.com/home-assistant/core/pull/38774 I am only proposing to add support for a get raw image function, to be implemented in the individual camera integration entities. The function would pass the image frame to the image processing entities as a numpy array, without conversion to JPEG. This is already the native format in which both OpenCV and PIL read images, and it lets the deep learning or machine learning framework integrations process frames without any conversion; see the sketch below.
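A rough sketch of what such a method could look like on a camera entity (the method names, signatures, and the `encode_jpeg` helper here are illustrative and do not necessarily match the linked PR; the `Camera` base class API has also changed across Home Assistant versions):

```python
# Illustrative sketch only; names do not necessarily match the linked PR.
from __future__ import annotations

import numpy as np

from homeassistant.components.camera import Camera


class RawFrameCamera(Camera):
    """Camera entity that can also hand out frames as numpy arrays."""

    def __init__(self, capture):
        super().__init__()
        self._capture = capture  # hypothetical wrapper around the stream

    async def async_camera_image(self):
        # Existing path: JPEG bytes for the frontend (Lovelace).
        frame = await self.async_get_raw_image()
        return None if frame is None else encode_jpeg(frame)  # hypothetical helper

    async def async_get_raw_image(self) -> np.ndarray | None:
        # Proposed path: the latest decoded frame as a numpy array,
        # skipping the JPEG encode/decode round trip entirely.
        return await self.hass.async_add_executor_job(self._capture.read_frame)
```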

As an example, I have already rewritten an OpenCV camera component to manage a stream and support the 10 s refresh display in Lovelace, direct real-time streaming when the camera view is open, and 24/7 image processing for facial and object recognition.
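For reference, a standalone sketch of that OpenCV-based approach (not the actual rewritten component) could look roughly like this:

```python
# Standalone sketch of the OpenCV-based stream handling described above;
# not the actual rewritten component.
import threading

import cv2


class StreamGrabber:
    """Continuously read a stream with OpenCV and keep only the latest frame."""

    def __init__(self, source: str):
        self._cap = cv2.VideoCapture(source)  # e.g. an RTSP URL
        self._lock = threading.Lock()
        self._frame = None
        threading.Thread(target=self._reader, daemon=True).start()

    def _reader(self):
        while True:
            ok, frame = self._cap.read()
            if ok:
                with self._lock:
                    self._frame = frame

    def raw_frame(self):
        # For image processing: the latest frame as a numpy array, no JPEG.
        with self._lock:
            return None if self._frame is None else self._frame.copy()

    def jpeg_frame(self):
        # For the frontend: encode on demand (e.g. every 10 s), not per frame.
        frame = self.raw_frame()
        if frame is None:
            return None
        ok, buf = cv2.imencode(".jpg", frame)
        return buf.tobytes() if ok else None
```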

Consequences

I have chosen to add a function rather than replace one, so as not to create a breaking change and to let owners of the various image processing and camera integrations adopt it at their convenience. Switching from the FFmpeg binary to OpenCV for extracting the frames to be processed cut my CPU utilization by 50%. Eliminating the JPEG encoding and decoding of every processed frame cut my CPU load by another 50%, which compounds to a 75% reduction per camera stream (0.5 × 0.5 = 0.25 of the original load). This let me run image processing on 7 camera streams at 10 fps while staying under 50% CPU load, whereas 3 cameras loaded the CPU to 120% before the change.

balloob commented 4 years ago

async_get_image already returns an image class with content type and content. I think that we can be a little smarter here.

We could add new keyword args to the async_get_image method, e.g. allow_numpy, and have the source be able to return a numpy array. That way other raw formats can be added too.
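A rough sketch of that keyword-argument idea (the `allow_numpy` parameter, the helper, and the dispatch shown here are illustrative, not an agreed-upon API):

```python
# Rough sketch of the keyword-argument idea; allow_numpy and the helper
# below are illustrative, not an agreed-upon API.
async def async_get_image(hass, entity_id, allow_numpy=False):
    """Fetch an image from a camera entity.

    With allow_numpy=True, a camera that can supply raw frames may return a
    numpy array; callers that do not opt in keep getting JPEG content as today.
    """
    camera = get_camera_from_entity_id(hass, entity_id)  # hypothetical helper
    if allow_numpy and hasattr(camera, "async_get_raw_image"):
        return await camera.async_get_raw_image()
    # Simplified: the real helper wraps the bytes in an Image object
    # (content type + content), as mentioned above.
    return await camera.async_camera_image()
```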

rafale77 commented 4 years ago

That could work too; this PR was my proposed workaround. The issue I ran into when I tried it is that sending raw frames through the same method would prevent Lovelace from working properly, because the frontend calls that method expecting a JPEG without knowing what is actually coming back. If Lovelace and the image processing component could pass the desired format to the method, which would then call different functions in the camera component, that would work as well.

balloob commented 4 years ago

That's why we would add new allow_numpy etc. keyword args: to allow non-standard image formats only for callers that opt in.

frenck commented 1 year ago

This architecture issue is old, stale, and possibly obsolete; things have changed a lot over the years. Additionally, we have been moving architectural topics like this one to Discussions.

For that reason, I'm going to close this issue.

../Frenck