lit-robotics / libcamera-rs

Experimental Rust bindings for libcamera
Apache License 2.0

Video capture yuv420 to rgb #21

Open thedodd opened 1 year ago

thedodd commented 1 year ago

Hey folks, stoked to see this project here. I have a use case for which I would like to start using this project, instead of directly using libcamera-raw/libcamera-vid.

In essence, I would like to capture video off of my picamera 2 and convert that data to rgb format. Out of the box, libcamera-vid supports only a few formats, and yuv420 seems to be the base format coming out of the ISP.

Do folks have a recommended way of handling this using this library? Perhaps the mjpeg video example which is already in the repo is the best place to start, and:

  • instead of mjpeg, use yuv420,
  • when a frame is received, manually convert from yuv420 to rgb,
  • emit converted rgb output from there.

Thoughts?

chemicstry commented 1 year ago

Hi,

If I remember correctly, the Raspberry Pi ISP should support RGB output. You can use the list_cameras example from this repository; it will print all supported formats. I think the official libcamera-apps also have an option to print supported formats.

If RGB is supported, then you can just set the format to RGB instead of MJPEG in the video example and it should work. I haven't done much testing on formats with multiple data planes (MJPEG has just one, while RGB has 3), though, so report if you encounter any issues.
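A minimal sketch of such a check (it only asks for RGB888 on a viewfinder stream and prints whether validate() keeps or adjusts it; the calls mirror the capture examples later in this thread, so adapt as needed):

use drm_fourcc::DrmFourcc;
use libcamera::{
    camera::CameraConfigurationStatus, camera_manager::CameraManager, pixel_format::PixelFormat,
    stream::StreamRole,
};

const PIXEL_FORMAT_RGB888: PixelFormat = PixelFormat::new(DrmFourcc::Rgb888 as u32, 0);

fn main() {
    let mgr = CameraManager::new().unwrap();
    let cameras = mgr.cameras();
    let cam = cameras.get(0).expect("No cameras found");
    let mut cam = cam.acquire().expect("Unable to acquire camera");

    // Ask for RGB888 on a viewfinder stream and let validate() decide.
    let mut cfgs = cam.generate_configuration(&[StreamRole::ViewFinder]).unwrap();
    cfgs.get_mut(0).unwrap().set_pixel_format(PIXEL_FORMAT_RGB888);

    // If the pipeline cannot produce RGB888, the status will be Adjusted and
    // the printed configuration will show what it fell back to.
    match cfgs.validate() {
        CameraConfigurationStatus::Valid => println!("RGB888 accepted: {:#?}", cfgs),
        CameraConfigurationStatus::Adjusted => println!("Adjusted to: {:#?}", cfgs),
        CameraConfigurationStatus::Invalid => println!("Configuration invalid"),
    }
}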

If RGB is not supported, then yes, I think you can just capture YUV420 and convert it yourself. This will be slower than with the ISP, but it should still be much faster than encoding/compressing.
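A naive CPU conversion looks roughly like this (a minimal sketch, assuming planar yuv420 with no row padding and full-range BT.601 coefficients; the actual stride and YUV matrix may differ, and a SIMD or library-based converter will be much faster):

// Sketch only: convert planar YUV420 (Y at full resolution, U and V at half
// resolution in both dimensions) to packed RGB on the CPU. Assumes no row
// padding and full-range BT.601; adjust for the real stride and colour range.
fn yuv420_to_rgb(y: &[u8], u: &[u8], v: &[u8], width: usize, height: usize) -> Vec<u8> {
    let mut rgb = vec![0u8; width * height * 3];
    for row in 0..height {
        for col in 0..width {
            let yy = y[row * width + col] as f32;
            let uu = u[(row / 2) * (width / 2) + col / 2] as f32 - 128.0;
            let vv = v[(row / 2) * (width / 2) + col / 2] as f32 - 128.0;
            let i = (row * width + col) * 3;
            rgb[i] = (yy + 1.402 * vv).clamp(0.0, 255.0) as u8;
            rgb[i + 1] = (yy - 0.344 * uu - 0.714 * vv).clamp(0.0, 255.0) as u8;
            rgb[i + 2] = (yy + 1.772 * uu).clamp(0.0, 255.0) as u8;
        }
    }
    rgb
}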

frederikstroem commented 8 months ago

Hey folks, stoked to see this project here. I have a use case for which I would like to start using this project, instead of directly using libcamera-raw/libcamera-vid.

In essence, I would like to capture video off of my picamera 2 and convert that data to rgb format. Out of the box, libcamera-vid supports only a few formats, and yuv420 seems to be the base format coming out of the ISP.

Do folks have a recommended way of handling this using this library? Perhaps the mjpeg video example which is already in the repo is the best place to start, and:

  • instead of mjpeg, use yuv420,
  • when a frame is received, manually convert from yuv420 to rgb,
  • emit converted rgb output from there.

Thoughts?

I am using the Raspberry Pi Camera Module 2 (8MP) [IMX219] on a Raspberry Pi 4 with 8GB RAM. I was unable to get MJPEG working with the camera and found RGB888 to be the best alternative for my use case. However, I did find the following formats available for this camera:

To get the camera working with RGB888, I modified the jpeg_capture.rs example to utilize RGB888 instead of MJPEG.

use drm_fourcc::DrmFourcc;

use image::RgbImage;
use image::ColorType;
use image::ImageBuffer;

use std::time::Duration;

use libcamera::{
    camera::CameraConfigurationStatus,
    camera_manager::CameraManager,
    framebuffer::AsFrameBuffer,
    framebuffer_allocator::{FrameBuffer, FrameBufferAllocator},
    framebuffer_map::MemoryMappedFrameBuffer,
    pixel_format::PixelFormat,
    properties,
    stream::StreamRole,
    geometry::Size,
};

const PIXEL_FORMAT_RGB888: PixelFormat = PixelFormat::new(DrmFourcc::Rgb888 as u32, 0);

fn main() {
    let filename = "rgb888_capture.png";

    // Set size.
    let size = Size { width: 800, height: 600 };

    // Initialize the camera manager and select the first camera.
    let mgr = CameraManager::new().unwrap();
    let cameras = mgr.cameras();
    let cam = cameras.get(0).expect("No cameras found");
    println!(
        "Using camera: {}",
        *cam.properties().get::<properties::Model>().unwrap()
    );

    // Acquire the camera.
    let mut cam = cam.acquire().expect("Unable to acquire camera");

    // This will generate default configuration for each specified role.
    let mut cfgs = cam.generate_configuration(&[StreamRole::ViewFinder]).unwrap();

    // Set the desired pixel format and size for the configuration.
    cfgs.get_mut(0).unwrap().set_pixel_format(PIXEL_FORMAT_RGB888);
    cfgs.get_mut(0).unwrap().set_size(size);

    // Print the generated configuration.
    println!("Generated config: {:#?}", cfgs);

    // Validate the generated configuration.
    match cfgs.validate() {
        CameraConfigurationStatus::Valid => println!("Camera configuration valid!"),
        CameraConfigurationStatus::Adjusted => println!("Camera configuration was adjusted: {:#?}", cfgs),
        CameraConfigurationStatus::Invalid => panic!("Error validating camera configuration"),
    }
    cam.configure(&mut cfgs).expect("Unable to configure camera");

    // Allocate frame buffers for the stream.
    let mut alloc = FrameBufferAllocator::new(&cam);
    let stream = cfgs.get(0).expect("Failed to get stream configuration").stream().expect("Failed to get stream");
    let buffers = alloc.alloc(&stream).expect("Failed to allocate buffers");
    println!("Allocated {} buffers", buffers.len());

    // Convert FrameBuffer to MemoryMappedFrameBuffer, which allows reading &[u8]
    let buffers = buffers
        .into_iter()
        .map(|buf| MemoryMappedFrameBuffer::new(buf).expect("Failed to map framebuffer"))
        .collect::<Vec<_>>();

    // Create capture requests and attach buffers.
    let mut reqs = buffers.into_iter().map(|buf| {
        let mut req = cam.create_request(None).expect("Failed to create request");
        req.add_buffer(&stream, buf).expect("Failed to add buffer to request");
        req
    }).collect::<Vec<_>>();

    // Completed capture requests are returned as a callback.
    let (tx, rx) = std::sync::mpsc::channel();
    cam.on_request_completed(move |req| {
        tx.send(req).expect("Failed to send completed request");
    });

    // Start the camera and queue a single capture request.
    cam.start(None).expect("Failed to start camera");
    cam.queue_request(reqs.pop().expect("Failed to pop request")).expect("Failed to queue request");

    println!("Waiting for camera request execution");
    let req = rx.recv_timeout(Duration::from_secs(2)).expect("Camera request failed");

    println!("Camera request {:?} completed!", req);
    println!("Metadata: {:#?}", req.metadata());

    // Retrieve and process the framebuffer data from the completed request.
    let framebuffer: &MemoryMappedFrameBuffer<FrameBuffer> = req.buffer(&stream).expect("Failed to get framebuffer");
    println!("FrameBuffer metadata: {:#?}", framebuffer.metadata());

    // RGB data is interleaved as a single plane.
    let planes = framebuffer.data();
    let rgb_plane = match planes.get(0) {
        Some(plane) => plane,
        None => {
            eprintln!("RGB data is not available");
            return;
        }
    };

    // Create an ImageBuffer from the raw RGB data
    let img: RgbImage = ImageBuffer::from_raw(size.width, size.height, rgb_plane.to_vec()).expect("Unable to form the image buffer");

    // Save the buffer to a PNG file.
    image::save_buffer(&filename, &img, size.width, size.height, ColorType::Rgb8).unwrap();

    println!("PNG file saved to {}", &filename);

    // Everything is cleaned up automatically by Drop implementations.
}

Make sure to include drm-fourcc, image, and libcamera (the latter added manually via Git; for the Raspberry Pi, see issue #33) in the Cargo.toml file.

[package]
name = "app"
version = "0.1.0"
edition = "2021"

[dependencies]
drm-fourcc = "2.2"
image = "0.24.7"
libcamera = { git = "https://github.com/lit-robotics/libcamera-rs.git" }

What threw me off at first was that the RGB data is interleaved as a single plane, rather than being split into three separate planes for each color channel, as @chemicstry hinted. For more insight, see here (archive).
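Concretely, with the interleaved layout a pixel lives at byte offset y * stride + x * 3 (a hypothetical helper, just to illustrate the layout; the channel order within those three bytes depends on the fourcc):

// Illustrative helper (not part of the example above): index one pixel in an
// interleaved, single-plane RGB888 buffer. Each pixel occupies 3 consecutive
// bytes; rows may be padded out to a stride wider than width * 3.
fn pixel_at(rgb_plane: &[u8], stride: usize, x: usize, y: usize) -> [u8; 3] {
    let i = y * stride + x * 3;
    [rgb_plane[i], rgb_plane[i + 1], rgb_plane[i + 2]]
}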

P.S. @thedodd, I'm aware that you were referring to video, but I haven't reached that part of my project yet. Nevertheless, I thought sharing my findings might be helpful, as they could also apply to the video implementation.

Skgland commented 7 months ago

A note regarding rgb_plane.to_vec(): this won't work properly if cfgs.get(0).unwrap().get_stride() > size.width * 3. In that case, stride - (size.width * 3) bytes need to be skipped at the end of every stride chunk.

E.g.

let stride = cfgs.get(0).unwrap().get_stride();
// Keep only the first width * 3 bytes of each stride-sized row, dropping the padding.
let image_data: Vec<u8> = rgb_plane
    .chunks(stride as usize)
    .flat_map(|row| &row[..(size.width * 3) as usize])
    .copied()
    .collect();
let img: RgbImage = ImageBuffer::from_vec(size.width, size.height, image_data).expect("Unable to form the image buffer");

alexisgaziello commented 4 months ago

Hi, I am using the Camera Module 2 on a Raspberry Pi and converting the NV12 format via dcv-color-primitives. NV12 is a YUV format, so this might help.

Code example:

use dcv_color_primitives as dcp;
use dcp::{convert_image, ColorSpace, ImageFormat, PixelFormat};
use image::{ColorType, ImageEncoder};

pub(crate) fn nv12_to_rgb(planes: &Vec<&[u8]>, width: u32, height: u32) -> Result<Vec<u8>, &'static str> {
    let src_format = ImageFormat {
        pixel_format: PixelFormat::Nv12,
        color_space: ColorSpace::Bt601,
        num_planes: 2, // NV12 has two planes: Y (luminance) and UV (chrominance)
    };

    let dst_format = ImageFormat {
        pixel_format: PixelFormat::Rgb,
        color_space: ColorSpace::Rgb,
        num_planes: 1,
    };

    let mut dst_data = vec![0u8; (3 * width * height) as usize]; // RGB output

    convert_image(
        width,
        height,
        &src_format,
        None,
        &[planes[0], planes[1]], // Y and UV planes
        &dst_format,
        None,
        &mut [&mut dst_data],
    ).map_err(|_| "Failed to convert NV12 to RGB")?;

    Ok(dst_data)
}

You can use it by just passing the buffer data, taking the example from the repo:

    // For NV12, we expect two data planes: Y (luminance) and UV (chrominance)
    let planes = framebuffer.data();

    // Convert NV12 to RGB
    let rgb_data = nv12_to_rgb(&planes, width, height).unwrap();

It's my first time writing Rust, so take this code with a grain of salt.