CESNET / UltraGrid

UltraGrid low-latency audio and video network transmission system
http://www.ultragrid.cz

force-lavd-decoder=hevc_qsv outputting as v210 via Decklink #295

Closed alatteri closed 1 year ago

alatteri commented 1 year ago

Hello,

Using force-lavd-decoder=hevc_qsv on receiver is outputting as v210 via Decklink even though input signal is R12L. This means that a full range signal is outputting as limited/legal.

Input signal 2K 12bit RGB 444 SDI.

[Screenshot: 2023-02-18 at 2:32:50 PM]

See logs below.

Encoder: ./UltraGrid-498605d-x86_64.AppImage -t decklink -c libavcodec:encoder=hevc_qsv 10.55.121.22

[1676759630.171] [Decklink capture] Format change detected (display mode, color space - RGB444, 12bit).
[1676759630.171] [Decklink capture] Detected 12-bit signal, use ":codec=UYVY" to enforce 8-bit capture (old behavior).
[1676759630.171] [Decklink capture] Using codec: R12L
[1676759630.171] [DeckLink capture] Enable video input: 2Kp24 DCI
[1676759630.237] Waiting for new frame timed out!
[1676759630.289] [lavc] Using codec: H.265, encoder: hevc_qsv
[1676759630.289] [lavc] Blacklisting x2rgb10le because there has been issues with this pixfmt and current encoder (hevc_qsv) , use '--param lavc-use-codec=x2rgb10le' to enforce.
[1676759630.359] [lavc] Selected pixfmt: xv30le

Receiver: ./UltraGrid-498605d-x86_64.AppImage -d decklink --param force-lavd-decoder=hevc_qsv

[video dec.] New incoming video format detected: 2048x1080 @24.00p, codec H.265
[lavc hevc_qsv @ 0x7fc0c0005200] Invalid pkt_timebase, passing timestamps as-is.
[lavd] Using decoder: hevc_qsv
[Decklink display] Setting single link by default.
[display] Successfully reconfigured display to 2048x1080 @24.00p, codec v210
[lavd] Selected pixel format: xv30le
MartinPulec commented 1 year ago

Well, it depends. QuickSync encodes 10-bit YUV 4:4:4, so UltraGrid has to choose one of the DeckLink output formats; v210 and R10k are the best fits. Maybe R10k would be slightly better since it has 4:4:4 subsampling, but I believe that from the DeckLink perspective R10k is also limited-range?

R12L could not have been chosen by default anyway, because R10k matches the input parameters better. Maybe I could add an option to prefer color range in the selection at the expense of bit depth. But I don't believe that should be the default behavior.

alatteri commented 1 year ago

Yes, R10k is much preferred over v210 (4:2:2) when the input is 10-bit YUV 4:4:4. Since UG uses its own packet format, would it be possible to add some additional metadata indicating the original color space to the receiver, instead of relying only on the codec format for this kind of logic?

MartinPulec commented 1 year ago

R10k is much preferred over v210 (4:2:2) when the input is 10-bit YUV 4:4:4

Agreed, I think it could be implemented.

Since UG uses its own packet format, would it be possible to add some additional metadata indicating the original color space to the receiver, instead of relying only on the codec format for this kind of logic?

Generally speaking, yes. Anyway, what would be the target use case? Having an option/switch to enforce the use of the input codec? I think there are also reasons for the current behavior, namely that color conversion is costly.

alatteri commented 1 year ago

Hi Martin,

In this exact scenario, the input is R12L (12-bit RGB 4:4:4, full range) and the codec (hevc_qsv) is 10-bit YUV 4:4:4. The DeckLink output should also be R12L, but because of the codec, the decoder would at best choose R10k (limited range) and at worst (the current behavior) v210. If the source format (R10k, R10k Full, R12L) were signaled with the stream, the decoder could make a better choice of display mode.

But maybe the now fixed RGB modes of hevc_qsv will fix all this?

MartinPulec commented 1 year ago

I've already pushed part of the fix. Currently, in this scenario, the deduced pixel format is even R12L, but only by chance: the selection is made according to detected internal properties, which is currently done by codec tag (Y416 here), and that doesn't have sufficient granularity. So R10k will be the correct result then.

We'll need to discuss the rest internally.

But maybe the now fixed RGB modes of hevc_qsv will fix all this?

Unfortunately no; the current Intel driver implementation still converts this to YCbCr, the actual fix was just to mark the stream correctly as YCbCr. Hopefully the conversion will at least be HW-accelerated. Just FYI, as the FFmpeg fix is already merged upstream, I've allowed the codec in e03016b0. As for native RGB support, it may be a feature request for intel/media-driver? I gather from this response that it is actually outside the cartwheel-ffmpeg team's scope.

MartinPulec commented 1 year ago

I've already pushed part of the fix.

The rest is fixed in 0654e997. The deduction now works according to the selected conversion policy (--conv-policy; depth->subsampling->CS by default), so the deduced pixfmt here is now R10k.

We'll need to discuss the rest internally.

We'll perhaps keep it open for now. The thing is that in this particular case, selecting R10k could be considered correct (it certainly depends on the point of view), if we put aside the different DeckLink range for R10k and R12L, which could be considered a separate issue. Keeping the exact pixfmt could be a receiver-side option; but if it were, it would just replace --param decoder-use-codec=R12L. The only advantage I can see over that is when the signal is expected to change from time to time during the transmission, or when the user is uncertain which signal will arrive (but needs to keep it, anyway).

MartinPulec commented 1 year ago

Well, I've thought about it once more, and in theory there is a way to signal the source picture format to the receiver without altering the pixel format or using a side channel (like RTCP).

I could pass the metadata directly in the HEVC (or H.264) stream as an SEI NAL unit and then extract it on the receiver. It would initially be limited to H.264/HEVC only, since it is codec-specific (it would certainly also be possible for JPEG). Anyway, since we'd probably not want this as the default behavior, it would need to be enabled explicitly, with something like --param keep-pixfmt (on the sender).

What do you think? Would it be useful for you at least in this extent?

alatteri commented 1 year ago

Hi Martin, I haven't yet had a chance to test the fixes in 0654e99; I will this weekend. But yeah, the concept sounds good: INPUT=OUTPUT.

MartinPulec commented 1 year ago

to be enabled explicitly, with something like --param keep-pixfmt (on sender).

It is already implemented; the parameter is the one mentioned. It works for H.264/HEVC for now (any encoder/decoder, since I am attaching the SEI manually to the raw stream). It should be possible to extend it to most compressions (in theory), but it is not entirely transparent: because we are attaching it to raw data, it is codec-specific (I know how it could be done for MJPEG, but not for other codecs).