Open totaam opened 1 year ago
Now working for jpeg decoding without OpenGL - which is not very useful since we have nvjpeg (#3504) which does this but better in every way: more lightweight, decodes jpega, with OpenGL acceleration (decoded image stays on the GPU).
But this is a start, hopefully I can figure out how to enable h264, hevc, vp8, vp9, etc.
Still TODO:
no device errors when used as a video decoder for jpeg - allocate the CUDA device once in the decode thread?
... nvjpeg)
libyuv's NV12 to RGB
jpega
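One possible shape for the "allocate once in the decode thread" idea is a per-thread lazy cache. This is only a sketch: `get_decode_context` and the `create_context` factory are hypothetical names, standing in for whatever actually allocates the CUDA device and context.

```python
import threading

# one slot per thread: each decode thread gets its own cached context
_local = threading.local()


def get_decode_context(create_context):
    """Return this thread's context, calling create_context() only once per thread.

    create_context is a hypothetical factory, for example a function that
    selects the CUDA device and creates a context on it.
    """
    ctx = getattr(_local, "ctx", None)
    if ctx is None:
        ctx = create_context()
        _local.ctx = ctx
    return ctx
```

Every call made from the decode thread after the first one reuses the same object, so the costly allocation happens once per thread rather than once per frame.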
It paints... but needs a new shader: this one only paints the Y plane for now. (better than a corrupted screen, or worse.. crashes)
Perhaps something like:
https://git.linuxtv.org/libcamera.git/tree/src/qcam/assets/shader/NV_2_planes_UV_f.glsl?id=9db6ce0ba499eba53db236558d783a4ff7aa3896
explained in Accelerating libcamera Qcam format conversion using OpenGL shaders
/* SPDX-License-Identifier: LGPL-2.1-or-later */
/*
 * Copyright (C) 2020, Linaro
 *
 * NV_2_planes_UV_f.glsl - Fragment shader code for NV12, NV16 and NV24 formats
 */
#ifdef GL_ES
precision mediump float;
#endif
varying vec2 textureOut;
uniform sampler2D tex_y;
uniform sampler2D tex_u;
void main(void)
{
    vec3 yuv;
    vec3 rgb;
    mat3 yuv2rgb_bt601_mat = mat3(
        vec3(1.164, 1.164, 1.164),
        vec3(0.000, -0.392, 2.017),
        vec3(1.596, -0.813, 0.000)
    );
    yuv.x = texture2D(tex_y, textureOut).r - 0.063;
    yuv.y = texture2D(tex_u, textureOut).r - 0.500;
    yuv.z = texture2D(tex_u, textureOut).g - 0.500;
    rgb = yuv2rgb_bt601_mat * yuv;
    gl_FragColor = vec4(rgb, 1.0);
}
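To check a shader like this against known values, a CPU reference can help. The sketch below applies the same BT.601 coefficients and offsets as the GLSL above to a raw NV12 buffer. The function names are made up for illustration, and it assumes even dimensions and no row padding (stride equal to width), which real decoder output may not guarantee.

```python
def yuv_to_rgb_bt601(y, u, v):
    """Convert one 8-bit YUV sample to 8-bit RGB using the shader's coefficients."""
    yf = y / 255.0 - 0.063
    uf = u / 255.0 - 0.500
    vf = v / 255.0 - 0.500
    # mat3() takes columns, so each row of the product mixes one term per column
    r = 1.164 * yf + 1.596 * vf
    g = 1.164 * yf - 0.392 * uf - 0.813 * vf
    b = 1.164 * yf + 2.017 * uf
    return tuple(max(0, min(255, round(c * 255))) for c in (r, g, b))


def nv12_to_rgb(data, width, height):
    """Convert a tightly packed NV12 buffer to packed RGB bytes.

    NV12 layout: a full resolution Y plane, followed by one plane of
    interleaved U,V pairs subsampled by 2 in both directions.
    """
    y_size = width * height
    out = bytearray()
    for row in range(height):
        # each UV row covers two picture rows and is `width` bytes long
        uv_row = y_size + (row // 2) * width
        for col in range(width):
            uv = uv_row + (col // 2) * 2
            out += bytes(yuv_to_rgb_bt601(data[row * width + col], data[uv], data[uv + 1]))
    return bytes(out)
```

The U and V values come from the same interleaved pair, which matches the shader reading `.r` and `.g` from the single `tex_u` texture.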
For now I have used this Y to RGBA shader instead, completely ignoring the UV combined plane:
struct pixel_in {
    float2 texcoord1 : TEXCOORD0;
    float2 texcoord2 : TEXCOORD1;
    uniform samplerRECT texture1 : TEXUNIT0;
    uniform samplerRECT texture2 : TEXUNIT1;
    float4 color : COLOR0;
};
struct pixel_out {
    float4 color : COLOR0;
};
pixel_out
main (pixel_in IN)
{
    pixel_out OUT;
    float3 pre;
    pre.r = texRECT (IN.texture1, IN.texcoord1).x - (16.0 / 256.0);
    pre.g = texRECT (IN.texture2, IN.texcoord2).x - (128.0 / 256.0);
    pre.b = texRECT (IN.texture2, IN.texcoord2).y - (128.0 / 256.0);
    const float3 red = float3 (1.0/219.0, 0.0, 1.371/219.0) * 255.0;
    const float3 green = float3 (1.0/219.0, -0.336/219.0, -0.698/219.0) * 255.0;
    const float3 blue = float3 (1.0/219.0, 1.732/219.0, 0.0) * 255.0;
    OUT.color.r = pre.r;
    OUT.color.g = pre.r;
    OUT.color.b = pre.r;
    OUT.color.a = IN.color.a;
    return OUT;
}
More NV12 shader examples:
(GL_RG / GL_UNSIGNED_BYTE for UV data TexImage2D)
(GL_LUMINANCE_ALPHA / GL_UNSIGNED_BYTE for UV data TexImage2D)
For jpeg mode, do we need to keep the decoder context? (initialization looks costly)
Trying to get nvdec to work via gstreamer.
It finds the same valid encodings and chroma formats:
GST_DEBUG=nvdec*:4 gst-inspect-1.0 nvh264dec
0:00:00.902346129 655952 0x55f02fa992d0 INFO nvdecoder gstnvdecoder.c:1171:gst_nv_decoder_check_device_caps: mpegvideo bit-depth 8 with chroma format 0 [48 - 4080] x [16 - 4080]
0:00:00.908428330 655952 0x55f02fa992d0 INFO nvdecoder gstnvdecoder.c:1171:gst_nv_decoder_check_device_caps: mpeg2video bit-depth 8 with chroma format 0 [48 - 4080] x [16 - 4080]
0:00:00.914007625 655952 0x55f02fa992d0 INFO nvdecoder gstnvdecoder.c:1171:gst_nv_decoder_check_device_caps: mpeg4video bit-depth 8 with chroma format 0 [48 - 2032] x [16 - 2032]
0:00:00.920519713 655952 0x55f02fa992d0 INFO nvdecoder gstnvdecoder.c:1094:gst_nv_decoder_check_device_caps: No codec map corresponding to codec 3
0:00:00.921501455 655952 0x55f02fa992d0 INFO nvdecoder gstnvdecoder.c:1171:gst_nv_decoder_check_device_caps: h264 bit-depth 8 with chroma format 0 [48 - 4096] x [16 - 4096]
0:00:00.927115589 655952 0x55f02fa992d0 INFO nvdecoder gstnvdecoder.c:1171:gst_nv_decoder_check_device_caps: jpeg bit-depth 8 with chroma format 0 [64 - 32768] x [64 - 16384]
0:00:00.929860858 655952 0x55f02fa992d0 INFO nvdecoder gstnvdecoder.c:1171:gst_nv_decoder_check_device_caps: jpeg bit-depth 8 with chroma format 1 [64 - 32768] x [64 - 16384]
0:00:00.932570852 655952 0x55f02fa992d0 INFO nvdecoder gstnvdecoder.c:1094:gst_nv_decoder_check_device_caps: No codec map corresponding to codec 6
0:00:00.932581141 655952 0x55f02fa992d0 INFO nvdecoder gstnvdecoder.c:1094:gst_nv_decoder_check_device_caps: No codec map corresponding to codec 7
0:00:00.933641584 655952 0x55f02fa992d0 INFO nvdecoder gstnvdecoder.c:1171:gst_nv_decoder_check_device_caps: h265 bit-depth 8 with chroma format 0 [144 - 8192] x [144 - 8192]
0:00:00.934768056 655952 0x55f02fa992d0 INFO nvdecoder gstnvdecoder.c:1171:gst_nv_decoder_check_device_caps: h265 bit-depth 10 with chroma format 0 [144 - 8192] x [144 - 8192]
0:00:00.935946641 655952 0x55f02fa992d0 INFO nvdecoder gstnvdecoder.c:1171:gst_nv_decoder_check_device_caps: h265 bit-depth 12 with chroma format 0 [144 - 8192] x [144 - 8192]
0:00:00.939812713 655952 0x55f02fa992d0 INFO nvdecoder gstnvdecoder.c:1171:gst_nv_decoder_check_device_caps: vp8 bit-depth 8 with chroma format 0 [48 - 4096] x [16 - 4096]
0:00:00.945379570 655952 0x55f02fa992d0 INFO nvdecoder gstnvdecoder.c:1171:gst_nv_decoder_check_device_caps: vp9 bit-depth 8 with chroma format 0 [128 - 8192] x [128 - 8192]
Factory Details:
Rank primary (256)
Long-name NVDEC h264 Video Decoder
Klass Codec/Decoder/Video/Hardware
Description NVDEC video decoder
Author Ericsson AB, http://www.ericsson.com, Seungha Yang <seungha.yang@navercorp.com>
Plugin Details:
Name nvcodec
Description GStreamer NVCODEC plugin
Filename /usr/lib64/gstreamer-1.0/libgstnvcodec.so
Version 1.20.5
License LGPL
Source module gst-plugins-bad
Source release date 2022-12-19
Binary package Fedora GStreamer-plugins-bad package
Origin URL http://download.fedoraproject.org
GObject
+----GInitiallyUnowned
+----GstObject
+----GstElement
+----GstVideoDecoder
+----GstNvDec
+----nvh264dec
Pad Templates:
SINK template: 'sink'
Availability: Always
Capabilities:
video/x-h264
stream-format: byte-stream
alignment: au
profile: { (string)constrained-baseline, (string)baseline, (string)main, (string)high, (string)constrained-high, (string)progressive-high }
width: [ 48, 4096 ]
height: [ 16, 4096 ]
SRC template: 'src'
Availability: Always
Capabilities:
video/x-raw
width: [ 48, 4096 ]
height: [ 16, 4096 ]
framerate: [ 0/1, 2147483647/1 ]
format: { (string)NV12 }
video/x-raw(memory:GLMemory)
width: [ 48, 4096 ]
height: [ 16, 4096 ]
framerate: [ 0/1, 2147483647/1 ]
format: { (string)NV12 }
video/x-raw(memory:CUDAMemory)
width: [ 48, 4096 ]
height: [ 16, 4096 ]
framerate: [ 0/1, 2147483647/1 ]
format: { (string)NV12 }
Element has no clocking capabilities.
Element has no URI handling capabilities.
Pads:
SINK: 'sink'
Pad Template: 'sink'
SRC: 'src'
Pad Template: 'src'
Element Properties:
automatic-request-sync-point-flags: Flags to use when automatically requesting sync points
flags: readable, writable
Flags "GstVideoDecoderRequestSyncPointFlags" Default: 0x00000003, "corrupt-output+discard-input"
(0x00000001): discard-input - GST_VIDEO_DECODER_REQUEST_SYNC_POINT_DISCARD_INPUT
(0x00000002): corrupt-output - GST_VIDEO_DECODER_REQUEST_SYNC_POINT_CORRUPT_OUTPUT
automatic-request-sync-points: Automatically request sync points when it would be useful
flags: readable, writable
Boolean. Default: false
discard-corrupted-frames: Discard frames marked as corrupted instead of outputting them
flags: readable, writable
Boolean. Default: false
max-display-delay : Improves pipelining of decode with display, 0 means no delay (auto = -1)
flags: readable, writable
Integer. Range: -1 - 2147483647 Default: -1
max-errors : Max consecutive decoder errors before returning flow error
flags: readable, writable
Integer. Range: -1 - 2147483647 Default: 10
min-force-key-unit-interval: Minimum interval between force-keyunit requests in nanoseconds
flags: readable, writable
Unsigned Integer64. Range: 0 - 18446744073709551615 Default: 0
name : The name of the object
flags: readable, writable, 0x2000
String. Default: "nvh264dec0"
parent : The parent of the object
flags: readable, writable, 0x2000
Object of type "GstObject"
qos : Handle Quality-of-Service events from downstream
flags: readable, writable
Boolean. Default: true
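To compare these capabilities with what our own codec reports, the `gst_nv_decoder_check_device_caps` lines from the debug output above can be reduced to a table. This is a rough sketch: `parse_device_caps` is a made-up helper, and the regular expression is only fitted to the log format shown here, which may differ in other GStreamer versions.

```python
import re
from collections import namedtuple

DeviceCaps = namedtuple(
    "DeviceCaps",
    "codec bit_depth chroma_format min_width max_width min_height max_height",
)

# matches e.g. "... h264 bit-depth 8 with chroma format 0 [48 - 4096] x [16 - 4096]"
CAPS_RE = re.compile(
    r"check_device_caps: (\w+) bit-depth (\d+) with chroma format (\d+) "
    r"\[(\d+) - (\d+)\] x \[(\d+) - (\d+)\]"
)


def parse_device_caps(log_text):
    """Extract one DeviceCaps entry per matching log line, skipping everything else."""
    caps = []
    for line in log_text.splitlines():
        match = CAPS_RE.search(line)
        if match:
            codec = match.group(1)
            numbers = [int(g) for g in match.groups()[1:]]
            caps.append(DeviceCaps(codec, *numbers))
    return caps
```

Lines such as "No codec map corresponding to codec 3" do not match and are ignored.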
And h264 works:
gst-launch-1.0 videotestsrc ! x264enc ! queue ! nvh264dec ! videoconvert ! autovideosink
Building the nvdec element from source for debugging is easy:
tar -Jxvf ~/Downloads/gst-plugins-bad-1.20.5.tar.xz
cd gst-plugins-bad-1.20.5/
mkdir build
cd build/
meson ..
ninja
sudo ninja install
rm -fr ~/.cache/gstreamer-1.0
GST_PLUGIN_PATH=/usr/local/lib64/gstreamer-1.0/ GST_DEBUG=nvdec*:5 gst-inspect-1.0 nvh264dec
Hoping to dump the data structures. Using the sample data from https://github.com/Xpra-org/xpra/blob/5be5799b7e1b1cc5a5f982e6141b948582a6d9a0/xpra/codecs/codec_checks.py#L23-L37 and saving each frame to a different file, I can replay using:
gst-launch-1.0 multifilesrc location="%02d.h264" index=0 \
caps="video/x-h264,stream-format=byte-stream,alignment=nal,width=128,height=128,framerate=1/1" \
! avdec_h264 ! videorate ! autovideosink
But not with nvh264dec or openh264:
WARNING: erroneous pipeline: could not link multifilesrc0 to nvh264dec0
openh264dec can work by adding an h264parse element before it.
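For reference, writing the sample frames out as numbered files for multifilesrc could look like the sketch below. `save_frames` is a hypothetical helper, and `frames` is assumed to be a list of bytes objects, one compressed frame each. The default pattern matches the `%02d.h264` location used in the replay command, with numbering starting at 0 to match `index=0`.

```python
import os


def save_frames(frames, directory, pattern="%02d.h264"):
    """Write each compressed frame to its own numbered file and return the paths."""
    paths = []
    for index, frame in enumerate(frames):
        path = os.path.join(directory, pattern % index)
        with open(path, "wb") as f:
            f.write(frame)
        paths.append(path)
    return paths
```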
This one works with all decoders:
gst-launch-1.0 videotestsrc pattern=white ! x264enc ! nvh264dec ! videoconvert ! autovideosink
But only thanks to caps negotiation. Saving the frames using:
gst-launch-1.0 videotestsrc pattern=white \
! video/x-raw,width=320,height=240 \
! x264enc \
! multifilesink location="frame%d.h264"
Then trying to replay them with multifilesrc as above hits the same issues again.
What makes it work is to specify format="NV12" (or format="I420"), otherwise we see:
0:00:00.231785272 714276 0x55563ad7e120 WARN GST_CAPS gstpad.c:5757:pre_eventfunc_check:<nvh264dec0:sink> caps video/x-h264, pixel-aspect-ratio=(fraction)1/1, width=(int)320, height=(int)240,
framerate=(fraction)30/1, chroma-format=(string)4:4:4, bit-depth-luma=(uint)10,
bit-depth-chroma=(uint)10, colorimetry=(string)bt601, parsed=(boolean)true,
stream-format=(string)byte-stream, alignment=(string)au, profile=(string)high-4:4:4, level=(string)1.3 not accepted
Correction, this works for openh264 without h264parse (switching from alignment=nal to alignment=au):
gst-launch-1.0 multifilesrc location="frame%d.h264" index=0 \
caps="video/x-h264,stream-format=byte-stream,alignment=au,width=320,height=240,framerate=1/1" \
! openh264dec ! videorate ! autovideosink
And nvdec loads from file:
vp8:
gst-launch-1.0 multifilesrc location="frame%d.vp8" index=0 \
caps="video/x-vp8,stream-format=byte-stream,alignment=au,width=320,height=240" \
! nvvp8dec ! videoconvert ! autovideosink
h264:
gst-launch-1.0 multifilesrc location="frame%d.h264" index=0 \
caps="video/x-h264,stream-format=byte-stream,alignment=au,width=320,height=240" \
! nvh264dec ! videoconvert ! autovideosink
Though that's only half the problem, the bigger issue is that for h264, we must use a parser to populate the decoder's data structures. (things like h264.num_ref_idx_l1_active_minus1!)
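The very first step of such a parser is splitting the byte-stream into NAL units and reading each unit's type. The sketch below only does that: it does not decode the SPS, PPS or slice headers where the fields the decoder needs actually live. The function names are illustrative, and it assumes Annex B framing (`stream-format=byte-stream`) with `00 00 01` start codes.

```python
START_CODE = b"\x00\x00\x01"


def split_nal_units(data):
    """Split an H.264 Annex B byte-stream into NAL units, without start codes."""
    starts = []
    pos = data.find(START_CODE)
    while pos >= 0:
        starts.append(pos + len(START_CODE))
        pos = data.find(START_CODE, pos + len(START_CODE))
    units = []
    for n, start in enumerate(starts):
        end = starts[n + 1] - len(START_CODE) if n + 1 < len(starts) else len(data)
        # drop trailing zero bytes, including the extra leading zero
        # of a following 4-byte start code
        unit = data[start:end].rstrip(b"\x00")
        if unit:
            units.append(unit)
    return units


def nal_unit_type(unit):
    """nal_unit_type is the low 5 bits of the first byte of the NAL unit."""
    return unit[0] & 0x1F
```

Type 7 is an SPS, 8 a PPS and 5 an IDR slice, so a stream the decoder can start from would normally begin with those three.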
Not sure why the pfnDecodePicture - aka parser_decode_callback - is called by the gstreamer decoder and not in our codec.. That's the one that provides the CUVIDPICPARAMS.
Split from #202. There are other frameworks for hardware acceleration, but nvidia's offering is generally the most stable option.
https://docs.nvidia.com/video-technologies/video-codec-sdk/nvdec-video-decoder-api-prog-guide/