Since 6.6.31 Kernel: FramebufferAllocator allocates too much space

2konrad commented 1 month ago

FrameBufferAllocator::allocate allocates too much space.

The buffers size is refecting the width of the stream default eg 1980 but not the configured width eg 1280.

This behavior started after this commit in the 6.6.31 Kernel

Also this command is failing on my imx708 cam -c 1 --stream colorspace=YUV,pixelformat=BGR888,height=640,width=960 --capture=5 -Fcam.ppm Can't allocate buffers

naushir commented 1 month ago

Adding @davidplowman

Thank you for reporting this. It could indeed be something in the pipeline handler not checking things correctly. A few questions to try and narrow things down:

1) Can you provide the exact command line and/or script used to reproduce this? 2) How are you checking the allocated buffer size? 3) Does his happen in both rpicam-apps and cam?

Thanks again for your help!

2konrad commented 1 month ago

Hello Naushir, it is happening only when FrameBufferAllocator::allocate is used. Since this concept is not used in rpicam-apps, i could not observe the behavior there. Only cam is using is.

The Command line is cam -c 1 --stream colorspace=YUV,pixelformat=BGR888,height=640,width=960 --capture=5 -Fcam.ppm

There are two problems: First the fmt->bytesperline_align contains 64, where i would expect it to have 8 on a 64bit machine. second the max() statement does not allow the value to get smaller when there are multimple calls to this function.

I changed the code to see whats going on:

                        v4l2_err(&dev->v4l2_dev,
             "%s: 1: %u, 2: %u, 3: %u, 4: %u, 5: %u", 
            __func__,
            f->fmt.pix.bytesperline ,
            get_bytesperline(f->fmt.pix.width, fmt),
            f->fmt.pix.bytesperline,
            fmt->bytesperline_align,
            ALIGN(f->fmt.pix.bytesperline,fmt->bytesperline_align)
             );
        f->fmt.pix.bytesperline = max(get_bytesperline(f->fmt.pix.width, fmt),
                          ALIGN(f->fmt.pix.bytesperline,
                            fmt->bytesperline_align));
        f->fmt.pix.bytesperline = get_bytesperline(f->fmt.pix.width, fmt);

Output from my program (motionplus) - when i request the width = 1280

[   13.954855] bcm2835-isp bcm2835-isp: bcm2835_isp_node_try_fmt: 1: 0, 2: 1920, 3: 0, 4: 64
[   13.954887] bcm2835-isp bcm2835-isp: bcm2835_isp_node_try_fmt: 1: 0, 2: 1920, 3: 0, 4: 64
[   13.954987] bcm2835-isp bcm2835-isp: bcm2835_isp_node_try_fmt: 1: 1920, 2: 1280, 3: 1920, 4: 64
[   13.955001] bcm2835-isp bcm2835-isp: bcm2835_isp_node_try_fmt: 1: 1920, 2: 1280, 3: 1920, 4: 64
[   13.955160] bcm2835-isp bcm2835-isp: bcm2835_isp_node_try_fmt: 1: 1280, 2: 1280, 3: 1280, 4: 64
[   13.955177] bcm2835-isp bcm2835-isp: bcm2835_isp_node_try_fmt: 1: 1280, 2: 1280, 3: 1280, 4: 64
[   13.955287] bcm2835-isp bcm2835-isp: bcm2835_isp_node_try_fmt: 1: 1280, 2: 1280, 3: 1280, 4: 64
[   13.955305] bcm2835-isp bcm2835-isp: bcm2835_isp_node_try_fmt: 1: 1280, 2: 1280, 3: 1280, 4: 64
[   13.955369] bcm2835-isp bcm2835-isp: bcm2835_isp_node_try_fmt: 1: 1280, 2: 1280, 3: 1280, 4: 64
[   13.955381] bcm2835-isp bcm2835-isp: bcm2835_isp_node_try_fmt: 1: 1280, 2: 1280, 3: 1280, 4: 64
[   13.955630] bcm2835-isp bcm2835-isp: bcm2835_isp_node_try_fmt: 1: 2880, 2: 2880, 3: 2880, 4: 32
[   13.957563] bcm2835-isp bcm2835-isp: bcm2835_isp_node_try_fmt: 1: 1280, 2: 1280, 3: 1280, 4: 64
[   13.957733] bcm2835-isp bcm2835-isp: bcm2835_isp_node_try_fmt: 1: 1280, 2: 1280, 3: 1280, 4: 64

Output from cam statement as above

[   48.844386] bcm2835-isp bcm2835-isp: bcm2835_isp_node_try_fmt: 1: 0, 2: 512, 3: 0, 4: 64
[  400.411733] bcm2835-isp bcm2835-isp: bcm2835_isp_node_try_fmt: 1: 0, 2: 3200, 3: 0, 4: 64
[  400.411841] bcm2835-isp bcm2835-isp: bcm2835_isp_node_try_fmt: 1: 3200, 2: 2880, 3: 3200, 4: 32
[  400.411935] bcm2835-isp bcm2835-isp: bcm2835_isp_node_try_fmt: 1: 2880, 2: 2880, 3: 2880, 4: 32
[  400.412120] bcm2835-isp bcm2835-isp: bcm2835_isp_node_try_fmt: 1: 1920, 2: 1920, 3: 1920, 4: 32
[  400.412339] bcm2835-isp bcm2835-isp: bcm2835_isp_node_try_fmt: 1: 2880, 2: 2880, 3: 2880, 4: 32

in the last line of the code I undo the new assignment. you can see from the variables, that it would have gone wrong otherwise.

naushir commented 1 month ago

So to summarise what you see as the problem (in the cam) app -

[  400.411841] bcm2835-isp bcm2835-isp: bcm2835_isp_node_try_fmt: 1: 3200, 2: 2880, 3: 3200, 4: 32

request an RGB24 buffer with a 960 pixel width
the kernel driver correctly calculates the stride as 2880 (960 * 3 aligned to 32 bytes)
libcamera has requested a larger stride of 3200, which overrides what the driver calculates, and gives you a larger buffer than needed.

Is this summary correct? Sorry, I just want to make sure I completely understand what's happening.

davidplowman commented 1 month ago

I wonder if it's worth trying a Python script to reproduce this. It's still not quite clear to me if the configuration is wrong at the end or not! May something like this?

from picamera2 import Picamera2
from picamera2.allocators import LibcameraAllocator

picam2 = Picamera2()
picam2.allocator = LibcameraAllocator(picam2.camera)
main = {'size': (960, 640), 'format': 'RGB888'}
config = picam2.create_preview_configuration(main)
picam2.configure(config)
print(picam2.camera_configuration()['main'])

I believe that should be going through that allocator (though worth checking). Does that have the right stride at the end?

2konrad commented 1 month ago

i will try the Python script.

Naushir, i think you described correctly. In the meantime i checked the code again, and understood the stride calculation. Having 64 in my case (using YUV420) is not wrong. So the stride thing is not wrong. It is more about the max () statement.

2konrad commented 1 month ago

python with max statement disabled: {'format': 'RGB888', 'size': (960, 640), 'stride': 2880, 'framesize': 1843200} python with max() enabled {'format': 'RGB888', 'size': (960, 640), 'stride': 2880, 'framesize': 1843200}

so both correct

but still

pi@4:~ $ cam -c 1 --stream colorspace=YUV,pixelformat=BGR888,height=640,width=960 --capture=5 -Fcam.ppm
[0:11:29.723195959] [2124] ERROR IPAModule ipa_module.cpp:172 Symbol ipaModuleInfo not found
[0:11:29.723370594] [2124] ERROR IPAModule ipa_module.cpp:292 v4l2-compat.so: IPA module has no valid info
[0:11:29.723498525] [2124]  INFO Camera camera_manager.cpp:284 libcamera v0.2.0+120-eb00c13d
[0:11:29.848811088] [2127]  WARN RPiSdn sdn.cpp:40 Using legacy SDN tuning - please consider moving SDN inside rpi.denoise
[0:11:29.850914286] [2127]  INFO RPI vc4.cpp:446 Registered camera /base/soc/i2c0mux/i2c@1/imx708@1a to Unicam device /dev/media4 and ISP device /dev/media0
Camera configuration adjusted
Using camera /base/soc/i2c0mux/i2c@1/imx708@1a as cam0
[0:11:29.851556251] [2124]  INFO Camera camera.cpp:1183 configuring streams: (0) 960x640-BGR888
[0:11:29.851907503] [2127]  INFO RPI vc4.cpp:621 Sensor: /base/soc/i2c0mux/i2c@1/imx708@1a - Selected sensor format: 1536x864-SBGGR10_1X10 - Selected unicam format: 1536x864-pBAA
[0:11:29.854190577] [2127] ERROR V4L2 v4l2_videodevice.cpp:1241 /dev/video14[20:cap]: Unable to request 4 buffers: Invalid argument
[0:11:29.854451623] [2124] ERROR Allocator framebuffer_allocator.cpp:97 Stream is not part of /base/soc/i2c0mux/i2c@1/imx708@1a active configuration
Can't allocate buffers
Failed to start camera session

Comment: the first two errors ERROR IPAModule ipa_module.cpp:172were always there - no problem

kbingham commented 1 month ago

pi@4:~ $ cam -c 1 --stream colorspace=YUV,pixelformat=BGR888,height=640,width=960 --capture=5 -Fcam.ppm
[0:11:29.723195959] [2124] ERROR IPAModule ipa_module.cpp:172 Symbol ipaModuleInfo not found

...

Comment: the first two errors ERROR IPAModule ipa_module.cpp:172were always there - no problem

Bah, is that still happening - I thought I supplied a fix to silence that a long time ago. But - I can confirm - this error can be ignored.

6by9 commented 1 month ago

The max is correct.

As per the V4L2 docs at https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/pixfmt-v4l2.html

Both applications and drivers can set this field to request padding bytes at the end of each line. Drivers however may ignore the value requested by the application, returning width times bytes per pixel or a larger value required by the hardware. That implies applications can just set this field to zero to get a reasonable default.

Prior to https://github.com/raspberrypi/linux/commit/55faf2eb5190410941aaabdc5bbd2c74c27f546e the driver ignored the value, but that has limitations in itself. As above, not ignoring the value is perfectly acceptable from the spec perspective.

The question is why the VIDIOC_S_FMT is being called with bytesperline being set to a larger number than is desired.

24bpp formats are more tricky than many as 3bytes per pixel doesn't divide nicely for aligning to powers of 2.

Also bytesperline_align being 64 is correct. It is nothing to do with the CPU being 32 or 64 bit, but is whether the hardware block has restrictions on the addresses used to start a line.

2konrad commented 1 month ago

Thank you for the clarification.

The question is why the VIDIOC_S_FMT is being called with bytesperline being set to a larger number than is desired.

Indeed. It seems, that the submitted bytesperline value is some leftover value from the previous call to the same function. Unfortunately i am very new to libcamera, so not of much help.

naushir commented 1 month ago

I could be wrong, but I don't think the issue is related to wrong buffer strides. The difference in stride values calculated in the driver and libcamera is inconsequential as that's purely a try fmt call. During the set_fmt the values are correct. This matches the python script output above.

However, I have replicated the error in cam about failure to queue buffers. That's something to definitely look at.

naushir commented 1 month ago

cam -c 1 --stream colorspace=YUV,pixelformat=BGR888,height=640,width=960 --capture=5 -Fcam.ppm

seems to fail for me, but

cam -c 1 --stream height=640,width=960 --capture=5 -Fcam.ppm

seems fine.

2konrad commented 1 month ago

cam without format options uses a 4 byte format XRGB8888

cam -c 1 --stream height=640,width=960 --capture=5 -Fcam.ppm

produces error: "Only BGR888 output pixel format is supported (XRGB8888 requested)"

cam -c 1 --stream height=640,width=960 --capture=5 -F

works and saves a 4-byte pixel format. However

cam -c 1 --stream height=240,width=320 --capture=5 -F

is producing files that are way too large.

In fact what happens is: The VIDIOC_S_FMT is called first with bytesperline=0and returned 3200 bytesperline, independent of the requested width. So the first call seems to always assume a width of 800 pixel - maybe a standard setting.

Thereafter VIDIOC_S_FMT is called again with bytesperline=3200, and the driver calculates 1280 (=320*4) but returns 3200. So for width with corresponding bytesperline<3200 the bytesperline are wrong (too high)

The problem seems to be that the bytesperline from first call are used for subsequent calls, instead just setting bytesperline=0 before each call to VIDIOC_S_FMT from libcamera.

naushir commented 1 month ago

I think there may be 2 things going wrong here:

1) cam not setting/clearing stride value The cam app requests a default scream config from the pipeline handler. This comes with a stride of 3200 set in the StreamConfiguration structure. It then updates the width/height/pixelformat in the StreamConfiguration to 960x480 RGB888, but leaves the stride set to 3200. Since 3200 > 2880 (minimum stride required for the format), the driver uses the user requested stride, and over allocates. Perhaps cam should unconditionally set stride to 0 in StreamConfiguration so that the driver can decide what is the best choice?

2) Something in MMAL is upset with a stride of 3200 The cam selected stride of 3200 ought to work, since it is > 2880 minium stride, but the MMAL component seems to not like this:

60696.982354] bcm2835-isp bcm2835-isp: bcm2835_isp_node_s_fmt: Set format for node output[0]
[60696.982371] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Setting pix format for type 2, wxh: 1920x1080, fmt: 41414270, size 2592000
[60696.982378] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Calculated bpl as 2400, size 2592000
[60696.982547] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Set format for type 2, wxh: 1920x1080, fmt: 41414270, size 2592000
[60696.982755] bcm2835-isp bcm2835-isp: bcm2835_isp_node_s_fmt: Set format for node capture[0]
[60696.982761] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Setting pix format for type 1, wxh: 960x640, fmt: 33424752, size 2048000
[60696.982766] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Calculated bpl as 3200, size 2048000
[60696.982879] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Current buffer size of 2048000 < min buf size 2088960 - driver mismatch to MMAL
[60696.982884] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Set format for type 1, wxh: 960x640, fmt: 33424752, size 2048000
[60696.983034] bcm2835-isp bcm2835-isp: bcm2835_isp_node_s_fmt: Set format for node capture[1]
[60696.983040] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Setting pix format for type 1, wxh: 480x320, fmt: 32315559, size 245760
[60696.983044] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Calculated bpl as 512, size 245760
[60696.983150] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Set format for type 1, wxh: 480x320, fmt: 32315559, size 245760
[60696.983175] bcm2835-isp bcm2835-isp: bcm2835_isp_node_s_fmt: Set format for node stats[2]
[60696.983179] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Setting meta format for fmt: 41545342, size 10824
[60696.983182] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Calculated bpl as 0, size 10824
[60696.983284] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Set format for type 13, wxh: 0x0, fmt: 41545342, size 10824
[60696.985681] bcm2835-isp bcm2835-isp: setup_mmal_port: setup capture[0]
[60696.985868] bcm2835-isp bcm2835-isp: buffer size mismatch sizeimage 2048000 < min size 2088960

MMAL wants a stride of 3264 (2088960 / 640 == 3264) for some reason and complains that the driver is set to 3200. I'm not sure how/why it wants 3240 as the stride - I'll have to check in the firmware what it is doing. @6by9 any thoughts? From the ISP driver code:

    setup_mmal_port_format(node, node->port);
    ret = vchiq_mmal_port_set_format(dev->mmal_instance, node->port);
    if (ret) {
        v4l2_err(&dev->v4l2_dev,
             "%s: Failed vchiq_mmal_port_set_format on port, ret %d\n",
             __func__, ret);
        ret = -EINVAL;
    }

    if (q_data->sizeimage < node->port->minimum_buffer.size) {
        v4l2_err(&dev->v4l2_dev,
             "%s: Current buffer size of %u < min buf size %u - driver mismatch to MMAL\n",
             __func__,
             q_data->sizeimage,
             node->port->minimum_buffer.size);
    }

2konrad commented 1 month ago

Thanks naushir,

I was able to solve this issue for me in my application by setting stride = 0 before camera configuration.

2konrad commented 1 month ago

just saw that the rpicam-app fixed the issue with that commit

6by9 commented 1 month ago

2. Something in MMAL is upset with a stride of 3200
   The cam selected stride of 3200 ought to work, since it is > 2880 minium stride, but the MMAL component seems to not like this:

60696.982354] bcm2835-isp bcm2835-isp: bcm2835_isp_node_s_fmt: Set format for node output[0]
[60696.982371] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Setting pix format for type 2, wxh: 1920x1080, fmt: 41414270, size 2592000
[60696.982378] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Calculated bpl as 2400, size 2592000
[60696.982547] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Set format for type 2, wxh: 1920x1080, fmt: 41414270, size 2592000
[60696.982755] bcm2835-isp bcm2835-isp: bcm2835_isp_node_s_fmt: Set format for node capture[0]
[60696.982761] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Setting pix format for type 1, wxh: 960x640, fmt: 33424752, size 2048000
[60696.982766] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Calculated bpl as 3200, size 2048000
[60696.982879] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Current buffer size of 2048000 < min buf size 2088960 - driver mismatch to MMAL
[60696.982884] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Set format for type 1, wxh: 960x640, fmt: 33424752, size 2048000
[60696.983034] bcm2835-isp bcm2835-isp: bcm2835_isp_node_s_fmt: Set format for node capture[1]
[60696.983040] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Setting pix format for type 1, wxh: 480x320, fmt: 32315559, size 245760
[60696.983044] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Calculated bpl as 512, size 245760
[60696.983150] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Set format for type 1, wxh: 480x320, fmt: 32315559, size 245760
[60696.983175] bcm2835-isp bcm2835-isp: bcm2835_isp_node_s_fmt: Set format for node stats[2]
[60696.983179] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Setting meta format for fmt: 41545342, size 10824
[60696.983182] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Calculated bpl as 0, size 10824
[60696.983284] bcm2835-isp bcm2835-isp: populate_qdata_fmt: Set format for type 13, wxh: 0x0, fmt: 41545342, size 10824
[60696.985681] bcm2835-isp bcm2835-isp: setup_mmal_port: setup capture[0]
[60696.985868] bcm2835-isp bcm2835-isp: buffer size mismatch sizeimage 2048000 < min size 2088960

MMAL wants a stride of 3264 (2088960 / 640 == 3264) for some reason and complains that the driver is set to 3200. I'm not sure how/why it wants 3240 as the stride - I'll have to check in the firmware what it is doing. @6by9 any thoughts? From the ISP driver code:

As I said, 24bpp formats are a pain due to multiples of 3 vs multiples of powers of 2 and using bitmasking.

vc_image wants minimum ALIGN_UP(width, 16) * 3, ie multiples of 48 bytes.
ISP hw wants a multiple of 32 bytes.
MMAL for RGB888/BGR888 should be working on an alignment of 32 pixels (ie 96 bytes)

3200 is not a multiple of 96, so is rejected by MMAL. The V4L2 shim appears to have defined only an alignment to 32 bytes for these formats, so it will incorrectly compute the stride. As it's currently a bitmask, it's not possible to simply tweak the alignment value, and needs an alternate computation path.

naushir commented 1 month ago

I was able to solve this issue for me in my application by setting stride = 0 before camera configuration.

It might be worth posting the patch to cam on the libcamera mailing list.

naushir commented 1 month ago

I was able to solve this issue for me in my application by setting stride = 0 before camera configuration.

It might be worth posting the patch to cam on the libcamera mailing list.

On second thoughts, leave it to me, I've got a patch ready.

naushir commented 1 month ago

Patch has been posed, so I'll resolve this issue now.

raspberrypi / libcamera

Since 6.6.31 Kernel: FramebufferAllocator allocates too much space #138