In turn, this implies the need to ship nuget packages with different assemblies for each CPU architecture.
I guess there is another way: the same way it's managed for the FT devices, having a check of the OS (and possibly the architecture as well) to branch on the right import: https://github.com/dotnet/iot/blob/1cae134bbd2958bb1fa7ac608169c33afdba3533/src/devices/Ft4222/FtFunction.cs#L22
@Ellerbach
I guess there is another way: the same way it's managed for the FT devices, having a check of the OS (and possibly the architecture as well) to branch on the right import:
I don't think so, because they do not provide C exports but just C++ classes. This means that every time they ship (in the OS), the mangled names may change and break our side.
The ideal solution is for them to ship a C wrapper so that we can bind to it with P/Invokes.
They are definitely interested in any solution that may ease adoption from the widest number of languages/platforms, so there is space for that, even if it may take some time (let's discuss this).
To clarify a bit my previous statement about nuget packages: the alternate solution is for us to provide a C wrapper in order to avoid binding to the C++ mangled names.
This would move the "instability" inside the wrapper, which would have to be recompiled and shipped every time the libcamera layer is recompiled with a different version of the C++ compiler (and thus a changed ABI).
Oh, got it now. For me, the way to go is to have a nice extern "C" wrapper. Otherwise, it won't be manageable at all. And it's the reason extern "C" was invented anyway :-D
Yes, indeed. Now the problem is understanding the time and effort required to do that.
They will most probably do it, but it's still to be decided. On our side, we have to understand whether it's worth doing something to fill the gap.
I will ask the C++ folks if they have suggestions (I see many projects leveraging node-gyp to let the C++ compilation happen at install time).
Since this thread is about libcamera, I will briefly write down the test I did on the compatibility layer provided by libcamera.
I had a long and pleasant meeting with https://github.com/kbingham, who is part of the libcamera team, and he gave me a lot of suggestions about the emulation layer. BTW, they are looking for feedback on this, as it is not fully/deeply tested.
As a side note, Kieran wanted to underline that libcamera is a project that is not specifically tied to the Raspberry Pi Foundation. They instead provide a suite of apps/tools to make it easier for anyone to replace the legacy raspivid and raspistill. As you may read on the libcamera website, their project is wider, covers a broad range of Linux distributions, and also includes Google's Android. Take this note as my understanding and always refer to their website for official statements, please.
The libcamera compatibility layer
This layer has nothing to do with the "legacy support" provided in raspi-config. It is provided by the libcamera stack to support applications that still use the V4L2 driver directly. In other words, they implemented in user space the APIs that are normally found inside the V4L2 driver. When the application is diverted to use their layer, libcamera provides the required behavior as if it were a V4L2 driver.
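To make this concrete, here is a hedged sketch (not from the thread) of the kind of raw V4L2 call that the layer re-implements in user space; the `VIDIOC_QUERYCAP` constant is hand-computed from the Linux `_IOR` macro, and `/dev/video0` is an assumption:

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

[StructLayout(LayoutKind.Sequential)]
internal struct V4l2Capability
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 16)] public byte[] Driver;
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 32)] public byte[] Card;
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 32)] public byte[] BusInfo;
    public uint Version;
    public uint Capabilities;
    public uint DeviceCaps;
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 3)] public uint[] Reserved;
}

internal static class V4l2QueryCapDemo
{
    private const int O_RDWR = 2;
    // _IOR('V', 0, struct v4l2_capability): read direction, 104-byte payload.
    private const uint VIDIOC_QUERYCAP = 0x80685600;

    [DllImport("libc", SetLastError = true)]
    private static extern int open(string pathname, int flags);

    [DllImport("libc", SetLastError = true)]
    private static extern int close(int fd);

    [DllImport("libc", SetLastError = true)]
    private static extern int ioctl(int fd, uint request, ref V4l2Capability argp);

    public static void Run()
    {
        int fd = open("/dev/video0", O_RDWR);
        if (fd < 0)
            throw new InvalidOperationException($"open failed, errno {Marshal.GetLastWin32Error()}");

        var cap = new V4l2Capability
        {
            Driver = new byte[16], Card = new byte[32],
            BusInfo = new byte[32], Reserved = new uint[3]
        };

        // With the real driver, this ioctl is answered by the kernel; with the
        // compatibility layer, the same call is answered by libcamera code.
        if (ioctl(fd, VIDIOC_QUERYCAP, ref cap) < 0)
            Console.WriteLine($"ioctl failed, errno {Marshal.GetLastWin32Error()}");
        else
            Console.WriteLine($"Driver: {Encoding.ASCII.GetString(cap.Driver).TrimEnd('\0')}");
        close(fd);
    }
}
```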
This of course means that using the layer does not provide access to the full feature set of libcamera; therefore, it cannot be a long-term solution.
There are two ways to leverage the compatibility layer.
LD_PRELOAD
This system consists in intercepting the calls to the library that provides the V4L2 entry points and redirecting them to the compatibility layer.
It works by running the application under the libcamerify tool (currently shipped in the latest Raspbian OS):
$ libcamerify myApplication
Since the dotnet/iot library uses P/Invokes straight to the driver, this system cannot work (presumably because P/Invoke resolves symbols against a specific library handle via dlopen/dlsym, which bypasses LD_PRELOAD interposition).
Binding directly to v4l2-compat.so
The current VideoDevice binding uses a few P/Invokes to the Libc library. The exact same functions also exist in the v4l2-compat.so library provided by libcamera (also shipped in the OS).
By replacing the library name from Libc to v4l2-compat.so, our binding sample started working.
I also had to add more error checking to verify which call fails.
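A minimal sketch of what that swap looks like (the absolute path is an assumption; on a 32-bit Raspberry Pi OS the library typically sits under /usr/lib/arm-linux-gnueabihf/):

```csharp
using System;
using System.Runtime.InteropServices;

internal static class InteropVideo
{
    // Before: the binding calls straight into the kernel driver through libc.
    [DllImport("libc", EntryPoint = "ioctl", SetLastError = true)]
    internal static extern int IoctlLibc(int fd, uint request, IntPtr argp);

    // After: same signature, but resolved against libcamera's emulation
    // library instead (path is an assumption for a 32-bit Raspberry Pi OS).
    [DllImport("/usr/lib/arm-linux-gnueabihf/v4l2-compat.so",
               EntryPoint = "ioctl", SetLastError = true)]
    internal static extern int IoctlCompat(int fd, uint request, IntPtr argp);

    // SetLastError + GetLastWin32Error expose errno, which is the extra error
    // checking mentioned above (e.g. 25 = ENOTTY "Inappropriate ioctl for
    // device", 22 = EINVAL "Invalid argument").
    internal static void ThrowIfFailed(int result, string call)
    {
        if (result < 0)
            throw new InvalidOperationException(
                $"{call} failed with errno {Marshal.GetLastWin32Error()}");
    }
}
```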
I have not finished testing the scenario, therefore it is still too early to celebrate. There are a few calls to ioctl that are failing, and I am not sure whether that depends on my specific camera model or on a problem in the compatibility layer.
The compatibility layer can emit a very verbose log to understand what is happening. This can be turned on by setting an environment variable (the pattern is category:level, where * matches all categories and 0 is the most verbose, DEBUG):
export LIBCAMERA_LOG_LEVELS=*:0
I documented how to make it work right now with the minimum amount of changes, just in case someone is willing to test it immediately (if anyone has problems doing it, just let me know). If this compatibility layer works correctly, the work to be done in the current VideoDevice includes detecting whether libcamera or V4L2 are in place.
@raffaeler had a little test of this last night, but didn't get a lot of time as both my Linux dev box and Raspberry Pi had decided to break since the last time I debugged the project.
That said, I was able to follow your suggested changes and make calls to Capture an image, which the libcamera logs suggested completed successfully; however, all the images were mangled again.
I compared some of the returned images to one taken by libcamera-still with the same settings:
Top left is the one from libcamera-still, and the other three are all from the updated code. All three of the captured images seem to follow a pattern in what is returned, and are all exactly the same size, which I suspect is because the frame still uses the buffer size.
I also had a brief test with the continuous capture, and discovered that it does still throw errors and causes the code to crash on most attempts, with the following error:
ERROR Camera camera.cpp:549 Camera in Running state trying release() requiring state between Available and Configured
Will have more of a dig this evening now my dev setup is working again.
Hi @CodedBeard, I created a branch in my fork that you can find here: https://github.com/raffaeler/iot/tree/rafvideodevice/src/devices/Common/Interop
I tested it on both buster (legacy stack) and bullseye (libcamera stack). The current code prints some diagnostics on the Console (which I will remove before creating the PR), but you can use them to compare the results.
Using the official code on buster, I see two identical images (in other words, it works) because it uses the old stack. Using the same code on bullseye (libcamera) I see:
From the console output in my branch, I believe the compatibility layer from libcamera fails to set the requested image format. This may explain the output images:
- PixelFormat.JPEG has not been set
- PixelFormat.YUV420 has not been set
If you discover anything else, please let me know. I already asked the libcamera developer I am in touch with for clarifications/confirmations.
Jumping in here: libcamera cannot provide a JPEG directly from the RPi. It has to be encoded separately.
If you can attach some of the captured images, it would be helpful to see how they look in their corrupted form.
Hi @kbingham,
libcamera can not provide a JPEG directly from RPi. It has to be encoded separately.
on Buster the RPi can successfully encode the jpg. Do you mean that this is a limitation of libcamera?
If you can attach some of the captured images, it would be helpful to see how they look in their corrupted form
Earlier today I sent you a link to the pictures and a short explanation via email.
It's not a limitation of libcamera; we can provide a jpeg image if the hardware provides it, but RPi's new drivers don't expose a JPEG output.
But from your perspective, yes: the legacy stack could provide a jpeg directly, and the non-legacy stack does not provide a jpeg on RPi.
@kbingham Interesting, I will look for the docs on this in order to link them from our readme.
Did you have time to look at the pictures and the text logs I sent you?
I see the pictures; it could be an issue with getting YUV420 and YVU420 mixed up?
Your test app also seems to reference / mix YV12 ... which I am wary might reference NV12?
It could also be a bug in the translation in the v4l2-adaptation layer too, of course.
I don't think so: GetPixelFormatResolution is called just to show the resolutions, and the format PixelFormat.YUYV is not "set" or used later.
The second image should come with PixelFormat.YUV420 (which works on buster) but has a different format (see the failures in the text files).
If you notice, there is another important difference between buster and bullseye when querying the resolutions. Not sure if this is desired. It may be concerning for applications that rely on the previous output.
buster:
[32x32]->[3280x2464], Step [2,2]
bullseye:
[160x120]->[160x120], Step [0,0] [240x160]->[240x160], Step [0,0] [320x240]->[320x240], Step [0,0] [400x240]->[400x240], Step [0,0] [480x320]->[480x320], Step [0,0] [640x360]->[640x360], Step [0,0] [640x480]->[640x480], Step [0,0] [720x480]->[720x480], Step [0,0] [768x480]->[768x480], Step [0,0] [854x480]->[854x480], Step [0,0] [720x576]->[720x576], Step [0,0] [800x600]->[800x600], Step [0,0] [960x540]->[960x540], Step [0,0] [1024x576]->[1024x576], Step [0,0] [960x640]->[960x640], Step [0,0] [1024x600]->[1024x600], Step [0,0] [1024x768]->[1024x768], Step [0,0] [1280x720]->[1280x720], Step [0,0] [1152x864]->[1152x864], Step [0,0] [1280x800]->[1280x800], Step [0,0] [1360x768]->[1360x768], Step [0,0] [1366x768]->[1366x768], Step [0,0] [1440x900]->[1440x900], Step [0,0] [1280x1024]->[1280x1024], Step [0,0] [1536x864]->[1536x864], Step [0,0] [1280x1080]->[1280x1080], Step [0,0] [1600x900]->[1600x900], Step [0,0] [1400x1050]->[1400x1050], Step [0,0] [1680x1050]->[1680x1050], Step [0,0] [1600x1200]->[1600x1200], Step [0,0] [1920x1080]->[1920x1080], Step [0,0] [2048x1080]->[2048x1080], Step [0,0] [1920x1200]->[1920x1200], Step [0,0] [2160x1080]->[2160x1080], Step [0,0] [2048x1152]->[2048x1152], Step [0,0] [2560x1080]->[2560x1080], Step [0,0] [2048x1536]->[2048x1536], Step [0,0] [2560x1440]->[2560x1440], Step [0,0] [2560x1600]->[2560x1600], Step [0,0] [2960x1440]->[2960x1440], Step [0,0] [2560x2048]->[2560x2048], Step [0,0] [3200x1800]->[3200x1800], Step [0,0] [3200x2048]->[3200x2048], Step [0,0] [3200x2400]->[3200x2400], Step [0,0]
@raffaeler So I was able to get a few good images using your branch. I was only able to make it work using YUV420 and then passing it through the Yv12ToRgb and RgbToBitmap methods, which is a shame as they are horribly slow. Any other combination of PixelFormat resulted in an unusable image, even if I also passed them through the conversion as well.
I believe there is a problem with getting/setting formats using the compat mode of libcamera.
Yep I noticed the ioctl failures you mentioned. So far I've run into the following ones:
- GetVideoDeviceValue: ioctl failed (-1) Error: 25: Inappropriate ioctl for device
- CaptureContinuous: ioctl failed (-1) Error: 16: Device or resource busy
  ERROR Camera camera.cpp:549 Camera in Running state trying configure() requiring state between Acquired and Configured
- ApplyFrameBuffers: ioctl failed (-1) Error: 22: Invalid argument
I did briefly have the continuous capture 'working', but it was so slow I was measuring it in frames per minute rather than per second. After the first one, it would always put the Pi camera in an unusable state until I restarted the device.
Yep I noticed the ioctl failures you mentioned.
Hopefully, as soon as the failures get fixed, we should be able to use all the hardware formats as before (I mean, excluding the ones that were emulated by the previous stack).
I did briefly have the continuous capture 'working', but it was so slow I was measuring it in frames per minute rather than per second. After the first one, it would always put the Pi camera in an unusable state until I restarted the device.
That's very sad, I didn't test the streaming yet.
I am just trying to make this compatibility layer work, so that we can plan with more time a brand new binding that accesses the native libcamera API. This won't be easy/fast because of the issues I wrote at the beginning of this thread (the C++ ABI).
I am just trying to make this compatibility layer work, so that we can plan with more time a brand new binding that accesses the native libcamera API. This won't be easy/fast because of the issues I wrote at the beginning of this thread (the C++ ABI).
Completely understand, sorry I can't be more helpful on that end as working with the IoT project has been pretty much my only exposure to interop code, so still getting my head around how it all works.
Completely understand, sorry I can't be more helpful on that end as working with the IoT project has been pretty much my only exposure to interop code, so still getting my head around how it all works.
No worries! Testing is a huge help on this repository. The number of variables and possible use-cases is definitely high, and bugs are difficult to prevent, especially when hardware devices are involved.
Thank you for your help
Yep I noticed the ioctl failures you mentioned. So far I've run into the following ones:
- GetVideoDeviceValue: ioctl failed (-1) Error: 25: Inappropriate ioctl for device
Which ioctl was called?
- CaptureContinuous: ioctl failed (-1) Error: 16: Device or resource busy
ERROR Camera camera.cpp:549 Camera in Running state trying configure() requiring state between Acquired and Configured
This means that something was trying to reconfigure the camera without calling stop / streamOff. That's simply not allowed, not even in V4L2 I would expect? - So I'm not sure where the fault for this bug lies ...
- ApplyFrameBuffers: ioctl failed (-1) Error: 22: Invalid argument
I need to know which ioctl this was too - can the debug prints be extended to say more?
@raffaeler So I was able to get a few good images using your branch. I was only able to make it work using YUV420 and then passing it through the Yv12ToRgb and RgbToBitmap methods, which is a shame as they are horribly slow. Any other combination of PixelFormat resulted in an unusable image, even if I also passed them through the conversion as well.
Absolutely - and the Pi can output many formats, so I would believe we can do this in hardware without software conversion - so we 'just' need to identify where this configuration issue is occurring.
You can't pass arbitrary formats through a single conversion routine - each pixel format is a unique pattern which must be handled correctly.
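As an illustration of why the 4:2:0 variants discussed here cannot be read with one routine, a minimal sketch (helper names are mine; even width/height and no row padding are assumed) of where the chroma bytes for a given pixel actually live:

```csharp
// Buffer layouts being confused in this thread:
//   YUV420/I420 (planar):  [ Y: w*h ][ U: w*h/4 ][ V: w*h/4 ]
//   YV12: same as I420 but with the U and V planes swapped
//   NV12 (semi-planar):    [ Y: w*h ][ UV interleaved: w*h/2 ]
internal static class ChromaLayout
{
    public static (int U, int V) OffsetsI420(int w, int h, int x, int y)
    {
        int ySize = w * h;
        int cSize = ySize / 4;
        int c = (y / 2) * (w / 2) + (x / 2); // one chroma sample per 2x2 block
        return (ySize + c, ySize + cSize + c);
    }

    public static (int U, int V) OffsetsNV12(int w, int h, int x, int y)
    {
        int ySize = w * h;
        int c = (y / 2) * w + (x / 2) * 2;   // interleaved U,V pairs
        return (ySize + c, ySize + c + 1);
    }
}
```

Reading an NV12 buffer with I420 offsets (or vice versa) yields exactly the kind of structured corruption shown in the attached images.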
@kbingham I don't know how many of the formats exposed by the V4L2 legacy driver are hardware or software. Given that there are only three chipsets supported by RPi, it should not be that hard to identify them, but you probably know better than us.
we 'just' need to identify where this configuration issue is occurring.
Can't you start from my logs (the ones I sent you via email)? They look like pretty much the same problems.
You can't pass arbitrary formats through a single conversion routine - each pixel format is a unique pattern which must be handled correctly.
The sample enumerates the formats as a starting point. If I see a supported format in the capabilities, it should work as expected. I agree that the sample blindly uses the formats without matching them against the list (it assumes they are supported, which is not true because of the software emulation).
It is also true that the compatibility mode cannot blindly replace the legacy support, since the behavior is quite different.
With regards to the perf, you'd know better than anyone why it is so poor.
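A hedged sketch of the check the sample skips; GetSupportedPixelFormats is assumed here to be the binding's capability query (adjust to the actual VideoDevice API):

```csharp
using System;
using System.Linq;
using Iot.Device.Media;

VideoConnectionSettings settings = new(0, (3200, 2400), PixelFormat.YUV420);
using VideoDevice device = VideoDevice.Create(settings);

// Only proceed when the driver (or the compatibility layer) actually
// advertises the requested format; otherwise fail loudly instead of
// capturing frames in whatever default the stack fell back to.
if (!device.GetSupportedPixelFormats().Contains(settings.PixelFormat))
{
    throw new NotSupportedException(
        $"{settings.PixelFormat} is not advertised by the device; " +
        "pick one of the enumerated formats instead.");
}
```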
Yep I noticed the ioctl failures you mentioned. So far I've run into the following ones:
- GetVideoDeviceValue: ioctl failed (-1) Error: 25: Inappropriate ioctl for device
Which ioctl was called?
The first bunch of ioctl errors occurs when loading the video settings; all of these calls to GetVideoDeviceValue throw the same error.
- CaptureContinuous: ioctl failed (-1) Error: 16: Device or resource busy
ERROR Camera camera.cpp:549 Camera in Running state trying configure() requiring state between Acquired and Configured
This means that something was trying to reconfigure the camera without calling stop / streamOff. That's simply not allowed, not even in V4L2 I would expect? - So I'm not sure where the fault for this bug lies ...
I think this might simply be the device being overloaded. CaptureContinuous does call streamOff, but I think the Pi may have been overloaded by the calls to Yv12ToRgb and RgbToBitmap, as each frame was taking over 100ms. I wrote a quick benchmark to see just how slow those methods are, and even on Windows parsing a byte[] in memory was unusably slow for streaming:
| Method | Mean | Error | StdDev | Ratio | RatioSD | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| SystemDrawing | 367.8 ms | 6.77 ms | 5.65 ms | 1.00 | 0.00 | 3000.0000 | 3000.0000 | 3000.0000 | 598,495,344 B |
- ApplyFrameBuffers: ioctl failed (-1) Error: 22: Invalid argument
I need to know which ioctl this was too - can the debug prints be extended to say more?
I didn't see where this one occurred as the device promptly crashed after it, but I think it was trying to unmap the memory.
I've looked through and found the following two failure points, which show that there are unimplemented features in the v4l2 adaptation layer:
- InteropVideodev2.V4l2Request.VIDIOC_QUERYCTRL: reports an unsupported ioctl, so this should be extended / added to the v4l2 adaptation layer I believe.
- InteropVideodev2.V4l2Request.VIDIOC_G_CTRL: also reports as an unsupported ioctl, so needs extending to add the functionality.
If you have further faults, please reference the line of code that fails as directly as possible to make it easy to identify which VIDIOC or such is the point of failure.
When you look through the logs at how it gets configured - look for lines like:
[0:03:29.199915937] [1444] DEBUG V4L2Compat v4l2_camera.cpp:128 Validated configuration is: 2560x1920-NV12
This states that the format is configured to capture as NV12 (you might be calling that YV12, I'm not sure), but that's a semi-planar format.
https://bugs.libcamera.org/show_bug.cgi?id=133 added to highlight that this needs development.
InteropVideodev2.V4l2Request.VIDIOC_QUERYCTRL: reports an unsupported ioctl, so this should be extended / added to the v4l2 adaptation layer I believe.
InteropVideodev2.V4l2Request.VIDIOC_G_CTRL: also reports as an unsupported ioctl, so needs extending to add the functionality.
Yes, because it is used to retrieve the available resolutions. If possible, it should return the exact same list as the real V4L2 driver, because otherwise applications may behave differently (the replacement with the compatibility layer would not be as transparent as everyone would expect).
If you have further faults, please reference the line of code that fails as directly as possible to make it easy to identify which VIDIOC or such is the point of failure.
Will do.
This states that the format is configured to capture as NV12 [...]
This is probably because it fails to set the required format and falls back to the default defined in V4L2.
@kbingham what about the performance issues? Did you expect such poor performance? If yes, we have to forget using it for video purposes and keep it just for still images, which would be very sad.
(Just for your reference, I have an app that streams video over websockets, splits the h264 and, with the second stream, provides face recognition through OpenCV. The streaming + splitting portion is entirely written in C# and only the face recognition is provided by OpenCV. The perf with .NET is more than excellent.)
I don't know how you're measuring performance, or what you're facing. But you should certainly avoid any format conversions in software, and let the RPi use its hardware ISP... But it seems we might have to fix the format configuration for that.
Since @CodedBeard already started measuring, I would ask him to make this other test, if he finds some time (thank you). Hint: it should be sufficient to use one of the video formats returned from the capabilities.
Well rather than "one of the others" I think we need to determine what is required. Are you only encoding to jpeg? What's the bigger picture? What formats does your encoder accept?
Well rather than "one of the others" I think we need to determine what is required. Are you only encoding to jpeg? What's the bigger picture? What formats does your encoder accept?
This repo is about allowing others to use the widest possible number of devices without having to care about the interoperability, protocols, etc. We call these "bindings" and their goal is to expose the widest possible number of available capabilities so that anyone can fully leverage the power of the device in their own app, just using C#, F#, VB.NET or any other .NET language.
The sample we are using to make tests is just a sample. I can't really say what the community members are using. If the capabilities API returns a list of formats, I expect to be able to use all of them.
I don't expect this to be a problem for the compatibility layer. I expect a different API, but libcamera should allow doing pretty much the same thing: query the capabilities and use one of them to retrieve the data in that format. If this is true, it should also be possible to obtain the same functionality from the compatibility layer. Am I missing something?
I've been testing with the included MJPEG sample server. With V4L2, the image is captured as Jpeg directly and then written to the response stream. Running the sample as-is with the compat layer results in the previously mentioned errors about Jpeg being unsupported:
[90:09:10.544994677] [351202] WARN V4L2 v4l2_pixelformat.cpp:287 Unsupported V4L2 pixel format JPEG
[90:09:10.545516413] [351202] WARN Formats formats.cpp:928 Unsupported pixel format 0x0x00000000
[90:09:10.545542801] [351202] WARN Formats formats.cpp:928 Unsupported pixel format 0x0x00000000
[90:09:10.545634652] [351202] WARN Formats formats.cpp:928 Unsupported pixel format 0x0x00000000
[90:09:10.546052778] [351202] INFO Camera camera.cpp:1029 configuring streams: (0) 3200x2400-NV12
[90:09:10.546583625] [351206] INFO RPI raspberrypi.cpp:760 Sensor: /base/soc/i2c0mux/i2c@1/imx219@10 - Selected sensor format: 3280x2464-SBGGR10_1X10 - Selected unicam format: 3280x2464-pBAA
[90:09:10.554889564] [351202] INFO Camera camera.cpp:1029 configuring streams: (0) 3200x2400-NV12
[90:09:10.555517799] [351206] INFO RPI raspberrypi.cpp:760 Sensor: /base/soc/i2c0mux/i2c@1/imx219@10 - Selected sensor format: 3280x2464-SBGGR10_1X10 - Selected unicam format: 3280x2464-pBAA
If I change the pixel format to one that is listed as supported, the errors go away, but the device never seems to actually change the format, which I guess is expected given the ioctl errors:
settings = new VideoConnectionSettings(0, (3200,2400), PixelFormat.YUV420);
[89:32:42.965781712] [348713] INFO Camera camera.cpp:1029 configuring streams: (0) 3200x2400-YUV420
[89:32:42.966186263] [348728] INFO RPI raspberrypi.cpp:760 Sensor: /base/soc/i2c0mux/i2c@1/imx219@10 - Selected sensor format: 3280x2464-SBGGR10_1X10 - Selected unicam format: 3280x2464-pBAA
[89:32:42.974469605] [348713] INFO Camera camera.cpp:1029 configuring streams: (0) 3200x2400-YUV420
[89:32:42.974867730] [348728] INFO RPI raspberrypi.cpp:760 Sensor: /base/soc/i2c0mux/i2c@1/imx219@10 - Selected sensor format: 3280x2464-SBGGR10_1X10 - Selected unicam format: 3280x2464-pBAA
settings = new VideoConnectionSettings(0, (3200,2400), PixelFormat.NV12);
[89:52:34.607950666] [350005] INFO Camera camera.cpp:1029 configuring streams: (0) 3200x2400-NV12
[89:52:34.608385829] [350010] INFO RPI raspberrypi.cpp:760 Sensor: /base/soc/i2c0mux/i2c@1/imx219@10 - Selected sensor format: 3280x2464-SBGGR10_1X10 - Selected unicam format: 3280x2464-pBAA
[89:52:34.616848016] [350005] INFO Camera camera.cpp:1029 configuring streams: (0) 3200x2400-NV12
[89:52:34.617260697] [350010] INFO RPI raspberrypi.cpp:760 Sensor: /base/soc/i2c0mux/i2c@1/imx219@10 - Selected sensor format: 3280x2464-SBGGR10_1X10 - Selected unicam format: 3280x2464-pBAA
settings = new VideoConnectionSettings(0, (3200,2400), PixelFormat.RGB24);
[89:54:57.582318165] [350259] INFO Camera camera.cpp:1029 configuring streams: (0) 3200x2400-BGR888
[89:54:57.582911733] [350266] INFO RPI raspberrypi.cpp:760 Sensor: /base/soc/i2c0mux/i2c@1/imx219@10 - Selected sensor format: 3280x2464-SBGGR10_1X10 - Selected unicam format: 3280x2464-pBAA
[89:54:57.591319251] [350259] INFO Camera camera.cpp:1029 configuring streams: (0) 3200x2400-BGR888
[89:54:57.591931357] [350266] INFO RPI raspberrypi.cpp:760 Sensor: /base/soc/i2c0mux/i2c@1/imx219@10 - Selected sensor format: 3280x2464-SBGGR10_1X10 - Selected unicam format: 3280x2464-pBAA
As the sample expects to return Jpegs, I then had to update the response writing like so:
using var stream = new MemoryStream(args.ImageBuffer);
Color[] colors = VideoDevice.Yv12ToRgb(stream, (3200,2400));
Bitmap bitmap = VideoDevice.RgbToBitmap((3200,2400), colors);
using var returnStream = new MemoryStream();
bitmap.Save(returnStream, System.Drawing.Imaging.ImageFormat.Jpeg);
await HttpContext.Response.BodyWriter.WriteAsync(CreateHeader(args.ImageBuffer.Length));
await HttpContext.Response.BodyWriter.WriteAsync(returnStream.ToArray());
await HttpContext.Response.BodyWriter.WriteAsync(footerBytes);
Using the YUV420 format, this 'works' but, as expected due to the format conversion, it is really slow.
The benchmark I pasted earlier only included the Yv12ToRgb call. If I include the full conversion to Jpeg, it's even worse:
[GlobalSetup]
public void Setup()
{
bytes = System.IO.File.ReadAllBytes("C:\\temp\\yuv420_direct_output_2022-06-16-19-50-23.jpg");
}
[Benchmark(Baseline = true)]
public void ConvertToJpeg_SystemDrawing()
{
VideoConnectionSettings settings = new(0, (3200, 2400), PixelFormat.YUV420);
using var stream = new MemoryStream(bytes);
var colours = VideoDevice.Yv12ToRgb(stream, settings.CaptureSize);
var bitmap = VideoDevice.RgbToBitmap(settings.CaptureSize, colours);
using var result = new MemoryStream();
bitmap.Save(result, System.Drawing.Imaging.ImageFormat.Jpeg);
}
| Method | Mean | Error | StdDev | Ratio | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|
| ConvertToJpeg_SystemDrawing | 4.320 s | 0.0720 s | 0.0674 s | 1.00 | 3000.0000 | 3000.0000 | 3000.0000 | 573 MB |
Changing the initial image format and then using the corresponding converter makes little difference; the average conversion time for all of them is over 4 seconds, so clearly not usable for live video. I think it also explains why my device crashed, as it almost certainly ran out of memory if each frame needs over 500MB for conversion.
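For reference, a minimal sketch (assuming planar I420 input and even dimensions; this is not the repo's method) of a conversion that writes into a preallocated RGB24 buffer, avoiding the per-frame Color[] and Bitmap allocations measured above:

```csharp
using System;

internal static class YuvConvert
{
    // Writes RGB24 into a caller-owned buffer of size width * height * 3.
    public static void Yuv420ToRgb24(ReadOnlySpan<byte> src, Span<byte> dst, int width, int height)
    {
        int ySize = width * height;
        int cSize = ySize / 4;
        ReadOnlySpan<byte> yPlane = src.Slice(0, ySize);
        ReadOnlySpan<byte> uPlane = src.Slice(ySize, cSize);
        ReadOnlySpan<byte> vPlane = src.Slice(ySize + cSize, cSize);

        for (int row = 0; row < height; row++)
        {
            int cRow = (row / 2) * (width / 2);
            for (int col = 0; col < width; col++)
            {
                int y = yPlane[row * width + col];
                int u = uPlane[cRow + col / 2] - 128;
                int v = vPlane[cRow + col / 2] - 128;

                // BT.601 coefficients in 16.16 fixed point (1.402, 0.344, 0.714, 1.772).
                int r = y + ((91881 * v) >> 16);
                int g = y - ((22554 * u + 46802 * v) >> 16);
                int b = y + ((116130 * u) >> 16);

                int o = (row * width + col) * 3;
                dst[o] = (byte)Math.Clamp(r, 0, 255);
                dst[o + 1] = (byte)Math.Clamp(g, 0, 255);
                dst[o + 2] = (byte)Math.Clamp(b, 0, 255);
            }
        }
    }
}
```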
I've been testing with the included MJPEG sample server. With V4L2, the image is captured as Jpeg directly and then written to the response stream. Running the sample as-is with the compat layer results in the previously mentioned errors about Jpeg being unsupported:
As far as I understand it, that is correct - there is no JPEG support from the Camera directly on the RPi in this implementation (as in from Raspberry Pi's ISP implementation).
Did the supported formats report that JPEG was supported? That would be a bug if it was reported as supported but isn't.
[90:09:10.544994677] [351202] WARN V4L2 v4l2_pixelformat.cpp:287 Unsupported V4L2 pixel format JPEG
[90:09:10.545516413] [351202] WARN Formats formats.cpp:928 Unsupported pixel format 0x0x00000000
[90:09:10.545542801] [351202] WARN Formats formats.cpp:928 Unsupported pixel format 0x0x00000000
[90:09:10.545634652] [351202] WARN Formats formats.cpp:928 Unsupported pixel format 0x0x00000000
[90:09:10.546052778] [351202] INFO Camera camera.cpp:1029 configuring streams: (0) 3200x2400-NV12
[90:09:10.546583625] [351206] INFO RPI raspberrypi.cpp:760 Sensor: /base/soc/i2c0mux/i2c@1/imx219@10 - Selected sensor format: 3280x2464-SBGGR10_1X10 - Selected unicam format: 3280x2464-pBAA
[90:09:10.554889564] [351202] INFO Camera camera.cpp:1029 configuring streams: (0) 3200x2400-NV12
[90:09:10.555517799] [351206] INFO RPI raspberrypi.cpp:760 Sensor: /base/soc/i2c0mux/i2c@1/imx219@10 - Selected sensor format: 3280x2464-SBGGR10_1X10 - Selected unicam format: 3280x2464-pBAA
If I change the pixel format to one that is listed as supported, the errors go away but the device never seems to actually change the format, which i guess is expected from the ioctl errors:
I'm a little confused - the text below reads as if the formats /are/ being set successfully to the formats you have selected when they are from the list of supported formats?
settings = new VideoConnectionSettings(0, (3200,2400), PixelFormat.YUV420);
[89:32:42.965781712] [348713] INFO Camera camera.cpp:1029 configuring streams: (0) 3200x2400-YUV420
[89:32:42.966186263] [348728] INFO RPI raspberrypi.cpp:760 Sensor: /base/soc/i2c0mux/i2c@1/imx219@10 - Selected sensor format: 3280x2464-SBGGR10_1X10 - Selected unicam format: 3280x2464-pBAA
[89:32:42.974469605] [348713] INFO Camera camera.cpp:1029 configuring streams: (0) 3200x2400-YUV420
[89:32:42.974867730] [348728] INFO RPI raspberrypi.cpp:760 Sensor: /base/soc/i2c0mux/i2c@1/imx219@10 - Selected sensor format: 3280x2464-SBGGR10_1X10 - Selected unicam format: 3280x2464-pBAA
settings = new VideoConnectionSettings(0, (3200,2400), PixelFormat.NV12);
[89:52:34.607950666] [350005] INFO Camera camera.cpp:1029 configuring streams: (0) 3200x2400-NV12
[89:52:34.608385829] [350010] INFO RPI raspberrypi.cpp:760 Sensor: /base/soc/i2c0mux/i2c@1/imx219@10 - Selected sensor format: 3280x2464-SBGGR10_1X10 - Selected unicam format: 3280x2464-pBAA
[89:52:34.616848016] [350005] INFO Camera camera.cpp:1029 configuring streams: (0) 3200x2400-NV12
[89:52:34.617260697] [350010] INFO RPI raspberrypi.cpp:760 Sensor: /base/soc/i2c0mux/i2c@1/imx219@10 - Selected sensor format: 3280x2464-SBGGR10_1X10 - Selected unicam format: 3280x2464-pBAA
settings = new VideoConnectionSettings(0, (3200,2400), PixelFormat.RGB24);
[89:54:57.582318165] [350259] INFO Camera camera.cpp:1029 configuring streams: (0) 3200x2400-BGR888
[89:54:57.582911733] [350266] INFO RPI raspberrypi.cpp:760 Sensor: /base/soc/i2c0mux/i2c@1/imx219@10 - Selected sensor format: 3280x2464-SBGGR10_1X10 - Selected unicam format: 3280x2464-pBAA
[89:54:57.591319251] [350259] INFO Camera camera.cpp:1029 configuring streams: (0) 3200x2400-BGR888
[89:54:57.591931357] [350266] INFO RPI raspberrypi.cpp:760 Sensor: /base/soc/i2c0mux/i2c@1/imx219@10 - Selected sensor format: 3280x2464-SBGGR10_1X10 - Selected unicam format: 3280x2464-pBAA
As the sample expects to return Jpegs, I then had to update the response writing like so:
If the sample is expecting JPEG then it certainly needs to be encoded to a JPEG. What format is most efficient for your JPEG encoder to consume?
using var stream = new MemoryStream(args.ImageBuffer);
Color[] colors = VideoDevice.Yv12ToRgb(stream, (3200,2400));
Bitmap bitmap = VideoDevice.RgbToBitmap((3200,2400), colors);
using var returnStream = new MemoryStream();
bitmap.Save(returnStream, System.Drawing.Imaging.ImageFormat.Jpeg);
await HttpContext.Response.BodyWriter.WriteAsync(CreateHeader(args.ImageBuffer.Length));
await HttpContext.Response.BodyWriter.WriteAsync(returnStream.ToArray());
await HttpContext.Response.BodyWriter.WriteAsync(footerBytes);
Using the YUV420 format, this 'works' but, as expected due to the format conversion, it is really slow.
Ok - so YV12 is Pixelformat.YUV420.
Can your JPEG encoder support that directly without converting to RGB first? JPEG is a YUV based image compression - so you're converting YUV to RGB, so that the encoder will then convert it back to a YUV form, before or as part of compressing. All in software I expect...
The benchmark I pasted earlier only included the Yv12ToRgb call. If I include the full conversion to Jpeg, it's even worse:
[GlobalSetup]
public void Setup()
{
    bytes = System.IO.File.ReadAllBytes("C:\\temp\\yuv420_direct_output_2022-06-16-19-50-23.jpg");
}
[Benchmark(Baseline = true)]
public void ConvertToJpeg_SystemDrawing()
{
    VideoConnectionSettings settings = new(0, (3200, 2400), PixelFormat.YUV420);
    using var stream = new MemoryStream(bytes);
    var colours = VideoDevice.Yv12ToRgb(stream, settings.CaptureSize);
    var bitmap = VideoDevice.RgbToBitmap(settings.CaptureSize, colours);
    using var result = new MemoryStream();
    bitmap.Save(result, System.Drawing.Imaging.ImageFormat.Jpeg);
}
| Method | Mean | Error | StdDev | Ratio | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|
| ConvertToJpeg_SystemDrawing | 4.320 s | 0.0720 s | 0.0674 s | 1.00 | 3000.0000 | 3000.0000 | 3000.0000 | 573 MB |
Changing the initial image format and then using the corresponding converter makes little difference; the average conversion time for all of them is over 4 seconds, so clearly not usable for live video. I think it also explains why my device crashed, as it almost certainly ran out of memory if each frame needs over 500MB for conversion.
500MB for converting a single frame sounds incredibly excessive indeed. Are you really intending to do video at 3200x2400?
Have you tried a smaller image size? 3200x2400 is really big for video - even more so / especially if you're software encoding it.
Well rather than "one of the others" I think we need to determine what is required. Are you only encoding to jpeg? What's the bigger picture? What formats does your encoder accept?
This repo is about allowing others to use the widest possible number of devices without having to care about the interoperability, protocols, etc. We call these "bindings" and their goal is to expose the widest possible number of available capabilities so that anyone can fully leverage the power of the device in their own app, just using C#, F#, VB.NET or any other .NET language.
What happens if you have a camera that can only output YUV420? Do you have other components to handle encoding?
The sample we are using to make tests is just a sample. I can't really say what the community members are using. If the capabilities API returns a list of formats, I expect to be able to use all of them.
As I understand it, the compatibility layer is not returning formats that it can't support? (Or did I miss something there?)
I don't expect this to be a problem for the compatibility layer. I expect a different API, but libcamera should allow doing pretty much the same thing: query the capabilities and use one of them to retrieve the data in that format. If this is true, it should also be possible to obtain the same functionality from the compatibility layer. Am I missing something?
Yes, libcamera should only report pixel formats that are supported by the stream, and then of course the v4l2 adaptation layer should only report those formats. If you've seen a mismatch - please help me highlight it.
How about these projects? V4L2.NET
How about these projects? V4L2.NET
@devdotnetorg They cannot work anymore (with the default settings) from the bullseye release on, because the Raspbian OS now ships with the libcamera stack turned on by default.
Any project previously using V4L2 ioctls is affected by this change unless you reconfigure the old stack (on the RPi, using raspi-config you can turn the legacy stack back on, together with raspivid, raspistill and friends).
@kbingham The issue you are seeing with the JPEG exists because the V4L2 Raspberry PI driver does this internally. I don't know who the sample author was, but I confirm that using a format that does not come from the capabilities exposed by the driver is a bug.
What happens if you have a camera that can only output YUV420. Do you have other components to handle encoding ?
Yes, the second image does that. YUV420 is supported and used to acquire the still image, but when the image is saved to JPEG the colors are wrong. I guess it is because the compatibility layer does not obey the request to set YUV420.
As I understand it - the compatibility layer is not returning formats that it can't support ? (Or did I miss something there?)
Yes, I see the request for getting that list failing.
Yes, libcamera should only report pixel formats that are supported by the stream, and then of course the v4l2-adapatation layer should only report those formats. If you've seen a mis-match - please help me highlight it.
Agree, but this opens up the problem of H.264, which is available on the RPi hardware and not supported by libcamera. This is a huge problem that makes streaming impractical. You should be talking with the Raspberry Pi Foundation to understand how they want to expose this capability when the libcamera stack is on. Frankly, I don't see how else it could be done, as transitioning the stream to user mode and back to kernel mode for encoding to H.264 would be a performance loss.
libcamera is not an encoding library. Where should we draw the line? Should it support MPEG2, MPEG4? H265? Should the "camera" support encoding-specific controls like bitrate? Gstreamer already supports all that, and is a full framework for such things.
H264 is supported on RPi through a dedicated encoder API which is supported by the V4L2 community.
RPi provides example code on how to use it, and sample applications to do so with their libcamera-apps.
There is no performance penalty from copying frames from user-mode to kernel mode for encoding as they use zero-copy DMA buf handles.
As I understand it - the compatibility layer is not returning formats that it can't support ? (Or did I miss something there?)
Yes, I see the request for getting that list failing.
Could you highlight where in your code this is failing, please? The previous failures I traced were for controls - not formats.
During the last triage meeting we consolidated the issues related to our VideoDevice binding into this one.
I updated my comment to expose the short/medium/large strategies to make this binding work.
With regards to the compatibility layer, we have to wait for the bug fixes from the libcamera team before taking a decision (which also depends on the performance).
@raffaeler Please note, I'm waiting on a response from you in https://github.com/dotnet/iot/issues/1875#issuecomment-1162895669 before I continue investigation into the issues for bug fixes, or a more confirmed bug report at bugs.libcamera.org.
Hi @kbingham, in the logs I sent you via email there are the line numbers pointing at the following source code lines:
The implementation is the one I revised in my own branch: https://github.com/raffaeler/iot/blob/rafvideodevice/src/devices/Media/VideoDevice/Devices/UnixVideoDevice.cs
The error message is: Error 25: Inappropriate ioctl for device. Please note that these calls are done at preparation time, therefore I can't know if there are others that will fail once these two get fixed.
In addition to that, in the other file there are the debug logs traced at the same time as the failed run. With regards to the performance, we will have to evaluate it as well, as soon as the compatibility layer starts acquiring stills/videos with the correct format.
A note for other readers that are following this thread:
@kbingham please let me know if you need further info at this time. I believe it is time to organize a meeting with the other triage members to understand the feasibility of the long-term option.
HTH
Yes, those are for controls - not formats. You stated there were failures setting and configuring formats. Those are different from the controls that you have shown at line 253 and line 260.
Is there any way I can easily reproduce this on an RPi? (I know nothing about dotnet.)
Yes, those are for controls - not formats.
Yes, but @CodedBeard ran other tests that I didn't have time to do. If I remember well, he also verified that changing the formats didn't work. I am also not sure whether the controls failure may affect the following ioctls.
Is there any way I can easily reproduce this on an RPi (I know nothing about dotnet)
Basically you would need to:
The command line to cross compile the sample on Linux for the Raspberry PI is:
cd CLONED_DIR/src/devices/Media/VideoDevice/samples
dotnet publish -r linux-arm --self-contained True -c Release -p:PublishSingleFile=true -p:GenerateRuntimeConfigurationFiles=true
This command line generates a single file containing the .NET runtime and all the other dependencies. You just copy it to the RPi and it should run.
The .NET runtime/SDK/etc. can be found here: https://dotnet.microsoft.com/en-us/download/dotnet/6.0 In the list you will find all the OS/platform/bitness combinations supported. If you go with the self-contained option, you should not need to install anything on the RPi.
I hope not to forget anything. In case, ping me.
My new binding was merged with https://github.com/dotnet/iot/pull/2150 and will soon be packaged in version 3.1, as soon as the milestone is reached: https://github.com/dotnet/iot/milestone/7
I invite all the people in this thread to try it and tell us whether we can obsolete the VideoDevice binding or not. I am not sure whether it is possible to create a "fake" VideoDevice class that redirects all the calls to the new binding. At first glance, it doesn't fully fit.
I am closing this issue, but we can re-open it (or a new one) whenever the libcamera ABI becomes stable, so that we can evaluate the possibility of creating a wrapper using PInvoke.
It's been a while since the Raspicam camera stack went legacy. This means the V4L2 drivers are not in place by default in the most recent versions of the Raspbian Operating System. In order to make our VideoDevice binding work, the legacy support must be enabled using raspi-config on the Raspberry PI. The complete overview of the camera support can be found here.
The new stack is called libcamera and is turned on by default in the OS. More info about libcamera:
The Raspberry Pi Foundation also published a new set of applications/tools replacing the legacy raspistill/raspivid. The source code can be found at the following link: https://github.com/raspberrypi/libcamera-apps/tree/main/apps
The libcamera documentation also talks about a compatibility layer, but it is unclear whether and how it can be activated to make applications using V4L2 work without turning on the legacy Raspicam model. The most interesting links on this subject are here:
In any case, it looks like libcamera is the future, and it may be worth building a new binding to fully exploit the potential of the modern cameras connected to the Raspberry PI and similar devices.
Unfortunately, according to the official libcamera account, they still do not provide a native C language API (but only C++), which makes it far more difficult to create the interoperability code for a .NET binding. This would force us to ship the binding with a certain amount of native code wrapping the official C++ interface. In turn, this implies the need to ship nuget packages with different assemblies for each CPU architecture.
I created this thread for a few reasons:
- to warn users that the VideoDevice binding is not working (unless you manually turn on the legacy support)
- to discuss a possible new libcamera binding (no promises)