intel / libxcam

libXCam is a project for extended camera features (not limited to cameras), focused on image quality improvement and video analysis. Many features are supported for image pre-processing, image post-processing and smart analysis. The library makes the GPU/CPU/ISP work together to improve image quality, and OpenCL is used to improve performance on different platforms.

How do I process live camera inputs to generate surround view? #799


mikqs commented 2 years ago

The test-surround-view sample only works with file inputs. How can I do stitching on a live stream (e.g. from an appsink)? If I send the GStreamer stream to a file, the file grows very fast and I cannot handle its size. I am not sure how to read the inputs in a live fashion.

Thank you

brmarkus commented 2 years ago

Can you describe your environment and requirements a bit more? Which camera are you talking about: do you get the stream via IP/Ethernet, USB, or MIPI? Do you have to use a specific programming language? Do you prefer to write an application or to use a GStreamer pipeline?

If your camera(s) are supported by e.g. "v4l2", you could use GStreamer: get the stream(s) via the "v4l2src" plugin, optionally do preprocessing (scaling, format conversion, etc.) and feed the result into libxcam. At some level of abstraction your application is agnostic to where the input is coming from (a local file, a network video stream) and just works with a representation of a single frame (a memory area, a 2D array of pixels, a pointer, a libva handle, a prime fd), handed over directly from a decoder or file reader, or taken from a queue/buffer.
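
A minimal sketch of this source-agnostic idea, assuming an OpenCV build with GStreamer support (cv2.CAP_GSTREAMER): both pipelines end in an appsink, so the consuming code sees plain pixel arrays whether the frames come from a camera or a file. Device paths, file names, and caps are placeholders.

import cv2

def open_source(pipeline):
    """Open any GStreamer pipeline that terminates in an appsink."""
    return cv2.VideoCapture(pipeline, cv2.CAP_GSTREAMER)

camera = open_source(
    "v4l2src device=/dev/video0 ! "
    "video/x-raw, width=1920, height=1080 ! "
    "videoconvert ! video/x-raw, format=BGR ! appsink"
)
replay = open_source(
    "filesrc location=input0.mp4 ! decodebin ! "
    "videoconvert ! video/x-raw, format=BGR ! appsink"
)

ok, frame = camera.read()  # frame is a plain 2-D pixel array, source-agnostic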

mikqs commented 2 years ago

> Can you describe your environment and requirements a bit more? Which camera are you talking about: do you get the stream via IP/Ethernet, USB, or MIPI? Do you have to use a specific programming language? Do you prefer to write an application or to use a GStreamer pipeline?

Thank you for your quick reply! I am using a MIPI module with 6 cameras connected to an NVIDIA Jetson AGX Xavier. I want to use 4 of these cameras to generate a surround view from a small autonomous vehicle. I am using Python but can use C++ if needed, on Ubuntu 18.04. For now I wanted to test the tool using a GStreamer pipeline.

> If your camera(s) are supported by e.g. "v4l2", you could use GStreamer: get the stream(s) via the "v4l2src" plugin, optionally do preprocessing (scaling, format conversion, etc.) and feed the result into libxcam.

I am new to GStreamer and was using the pipeline below. Does nvarguscamerasrc replace v4l2src? The pipeline converts the stream from camera sensor_id to NV12 format and sends it to an appsink (I am able to view it in OpenCV), but I am not sure how to make libxcam read this feed. I also tried filesink location=input0.nv12, and this time libxcam was able to read the files, but as I mentioned in the issue, these files grow very fast and I cannot handle their size.

> At some level of abstraction your application is agnostic to where the input is coming from (a local file, a network video stream) and just works with a representation of a single frame (a memory area, a 2D array of pixels, a pointer, a libva handle, a prime fd), handed over directly from a decoder or file reader, or taken from a queue/buffer.

Would you have a suggestion on redirecting the output so that test-surround-view can properly read the stream from the --input arguments? I am very new to these tools and principles, so I apologize if my understanding is flawed, but I am willing to learn.

def gstreamer_pipeline(
    sensor_id,
    capture_width=1920,
    capture_height=1080,
    display_width=1920,
    display_height=1080,
    framerate=30,
    flip_method=0,
):
    # Build the capture pipeline: grab frames from the CSI camera, let
    # nvvidconv flip/scale them, and deliver NV12 buffers to an appsink.
    return (
        "nvarguscamerasrc sensor-id=%d ! "
        "video/x-raw(memory:NVMM), width=(int)%d, height=(int)%d, framerate=(fraction)%d/1 ! "
        "nvvidconv flip-method=%d ! "
        "video/x-raw, width=(int)%d, height=(int)%d, format=(string)NV12 ! "
        "videoconvert ! "
        "video/x-raw, format=(string)NV12 ! appsink"
        % (
            sensor_id,
            capture_width,
            capture_height,
            framerate,
            flip_method,
            display_width,
            display_height
        )
    )
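
One way to keep filesink from filling the disk, sketched here as an assumption rather than a confirmed recipe: if test-surround-view reads its --input files strictly sequentially (no seeking or rewinding), a named pipe (FIFO) can stand in for the file, so frames stream through to the stitcher without accumulating on disk. Paths and options are placeholders; verify the sequential-read assumption before relying on this.

import os
import subprocess

fifo = "/tmp/input0.nv12"
if not os.path.exists(fifo):
    os.mkfifo(fifo)  # looks like a file, but data flows through it

# Writer: the capture pipeline above, with filesink pointed at the FIFO.
writer = subprocess.Popen(
    "gst-launch-1.0 nvarguscamerasrc sensor-id=0 ! "
    "'video/x-raw(memory:NVMM), width=1920, height=1080, framerate=30/1' ! "
    "nvvidconv ! 'video/x-raw, format=NV12' ! "
    "filesink location=%s" % fifo,
    shell=True,
)

# Reader: test-surround-view consumes the FIFO as if it were a regular file
# (all other options omitted; pass whatever you use today for file inputs).
reader = subprocess.Popen(["./test-surround-view", "--input", fifo])

writer.wait()
reader.wait()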
brmarkus commented 2 years ago

The application test-surround-view is just a sample; it expects a local file in NV12 format. You could modify it accordingly, to get the content from an appsink via GStreamer, a network socket, or any other way.

You might want to check the "Tests" document under https://github.com/intel/libxcam/wiki/Tests, e.g. the section https://github.com/intel/libxcam/wiki/Tests#9-xcamfilter-plugin for the GStreamer plugin xcamfilter. With that you can easily extend your existing GStreamer pipeline (see the sketch below). But, of course, you can also register callbacks on certain pads to ingest input data from wherever you get it.
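
For illustration, the earlier gstreamer_pipeline() helper could be extended roughly as below. This is an assumed sketch: the element name xcamfilter comes from the wiki page linked above, but its properties (stitching mode, calibration data, etc.) are deliberately omitted, so check the Tests page for the options your build supports.

def xcam_pipeline(sensor_id, width=1920, height=1080, framerate=30):
    # Same NVIDIA capture chain as before, with libxcam's xcamfilter
    # element inserted after the NV12 conversion.
    return (
        "nvarguscamerasrc sensor-id=%d ! "
        "video/x-raw(memory:NVMM), width=(int)%d, height=(int)%d, "
        "framerate=(fraction)%d/1 ! "
        "nvvidconv ! video/x-raw, format=(string)NV12 ! "
        "xcamfilter ! "
        "videoconvert ! appsink"
        % (sensor_id, width, height, framerate)
    )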

zongwave commented 2 years ago

We implemented an xcam video filter for FFmpeg and exported some functions (https://github.com/intel/libxcam/blob/master/capi/context_priv.h#L31) that can be used through the FFmpeg command line, as shown below. For the surround-view application, our implementation only supports stitching four cameras; you also need to specify the parameters and provide camera calibration files.

For some reasons we did not manage to get this xcam video filter accepted by the FFmpeg community; if you want to try it, I can send the filter to you by email.

 $ ffmpeg -i input0.mp4 -i input1.mp4 -i input2.mp4 -i input3.mp4 -filter_complex "xcam=inputs=4:name=stitch:w=1920:h=640:fmt=nv12:params=help=1 module=soft cammodel=camb4c1080p fisheyenum=4 levels=1 dewarp=bowl scale=dualconst fm=capi fmframes=120 fmstatus=fmfirst scopic=mono" soft-stitch.mp4
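
A hedged variation on the command above, assuming the out-of-tree xcam filter has been built into the local ffmpeg: the same filter string (with help=1 dropped), driven from Python with live v4l2 devices in place of the .mp4 inputs. Device paths are placeholders.

import subprocess

devices = ["/dev/video0", "/dev/video1", "/dev/video2", "/dev/video3"]

cmd = ["ffmpeg"]
for dev in devices:
    cmd += ["-f", "v4l2", "-i", dev]  # live capture instead of .mp4 files

cmd += [
    "-filter_complex",
    "xcam=inputs=4:name=stitch:w=1920:h=640:fmt=nv12:"
    "params=module=soft cammodel=camb4c1080p fisheyenum=4 levels=1 "
    "dewarp=bowl scale=dualconst fm=capi fmframes=120 fmstatus=fmfirst "
    "scopic=mono",
    "soft-stitch.mp4",
]
subprocess.run(cmd, check=True)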
Randrianasulu commented 11 months ago

The xcam ffmpeg submission lives at https://patchwork.ffmpeg.org/project/ffmpeg/patch/20200731145710.114479-1-wei.zong@intel.com/#56952; not sure how applicable it is to the current ffmpeg git tree...