CESNET / UltraGrid

UltraGrid low-latency audio and video network transmission system
http://www.ultragrid.cz

Inaccurate color with qsv_hevc #341

Open alatteri opened 1 year ago

alatteri commented 1 year ago

Hello,

When using QSV_HEVC, the color is quite off. I'm wondering if this is a result of the RGB->YUV->RGB conversion.

Please watch the attached screen recording. Do you think there is any improvement that can be made?

Thanks.

Encoder: uv -m 1316 -t decklink:codec=R12L --audio-filter delay:0:frames -c libavcodec:encoder=hevc_qsv:rc=qvbr:bitrate=24M:cqp=20 -s embedded --audio-capture-format channels=8 --audio-codec=AAC:bitrate=192K --param incompatible -P 5004

Receiver: ./UltraGrid.AppImage --tool uv -d decklink:synchronized:drift_fix --audio-delay -20 -r analog --audio-channel-map 0:0,1:1,2:2,3:3,4:4,5:5,6:6,7:7 --audio-scale none -P 5004 --param use-hw-accel,resampler=soxr,decoder-use-codec=R12L,force-lavd-decoder=hevc_qsv

https://github.com/CESNET/UltraGrid/assets/625982/788e72ce-1244-4410-b246-14dd0b75d2aa

MartinPulec commented 1 year ago

Hi, would you be able to provide a minimal command demonstrating the problem? I've tried uv -t testcard:c=R12L -c lavc:enc=hevc_qsv -d gl, and the testcard didn't look visually different than without the compression. Using QSV for decompression and forcing R12L output as well didn't seem to make a difference. UltraGrid selected x2rgb10le for the compression, but it looks like QSV converts it to YUV internally.

I am using a Raptor Lake-P card, in case it makes a difference.

MartinPulec commented 1 year ago

Ok, after a little evaluation (using -c <compress> -t testcard:c=R12L:pattern=blank=0x000000aa -d dummy -p color), it looks like uncompressed and libx265-compressed video has the YUV values [46,110,202], while hevc_qsv with X2RGB gives [59,104,202]¹ (originally measured as [74,101,202]).

I've also recorded the stream and looked at the metadata with MediaInfo and it looks like QSV converts to full-range YCbCr – this is slightly unfortunate, because UG cannot currently handle that and treats it as limited-range.

¹ UPDATED 20-10-2023: the original values may not have been correct; re-measured with the current AppImage 962ce6b and cmd -c lavc:e=hevc_qsv -t testcard:c=R12L:pattern=blank=0x000000aa -d dummy -p color --param lavc-use-codec=x2rgb10le
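To illustrate the limited-vs-full-range misinterpretation described above, here is a small Python sketch (mine, not UG code): a decoder that assumes limited-range input stretches [16, 235] to [0, 255], so samples that are actually full-range get their contrast expanded and the extremes clipped.

```python
def expand_limited_to_full(y):
    """Decoder assuming limited-range 8-bit input: map [16, 235] -> [0, 255], clipping."""
    return max(0, min(255, round((y - 16) * 255 / 219)))

# A stream that is actually full-range gets its contrast stretched and clipped:
stretched = {y: expand_limited_to_full(y) for y in (0, 16, 128, 235, 255)}
print(stretched)  # {0: 0, 16: 0, 128: 130, 235: 255, 255: 255}
```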

alatteri commented 1 year ago

Thanks for taking a look. I can also post the source TIFF images of my patterns if that helps. Is a fix for the issue possible?

Thanks, Alan

it looks like QSV converts to full-range YCbCr – this is slightly unfortunate, because UG cannot currently handle that and treats it as limited-range.

alatteri commented 1 year ago

Attached are image sequences of both the RGB and mono luma ramps.

ramps.zip

MartinPulec commented 1 year ago

Well, I am not sure whether QSV does the color-space conversion correctly at all, because:

ffmpeg -f lavfi -i color=#aa0000 -t 1 -pix_fmt xv30le -c:v hevc_qsv out.mp4

and

ffmpeg -f lavfi -i color=#aa0000 -t 1 -pix_fmt x2rgb10le -c:v hevc_qsv out.mp4

produce different levels: [170,0,0] for the first (correct) and [165,15,15] for the second, when extracted to JPEG by FFmpeg (ffmpeg -i out.mp4 -frames 1 %d.jpg) and evaluated with ImageMagick (convert 1.jpg -resize 1x1 txt:-).[1] So it would perhaps be best to report this to Intel. I could also imagine that it isn't converted to YCbCr at all.

Anyways, I will disable X2RGB – the conversions will be done in SW by UG, so it would be less efficient but correct.

[1] But I am not 100% sure whether FFmpeg itself handles full-range YCbCr correctly... still, the second case doesn't look to me like full-range YCbCr interpreted as limited; it actually looks like the opposite: perhaps out.mp4 is really limited-range but incorrectly flagged as full-range in the container?

alatteri commented 1 year ago

Hi Martin, Inline

So it would be perhaps best to report to Intel.

Sorry to ask this, but would you be able to do that? I don't believe I have the technical knowledge to report it to them in a sensible, concise manner.

Anyways, I will disable X2RGB – the conversions will be done in SW by UG, so it would be less efficient but correct.

Hmm... so what colorspace will be used for R12L --> QSV? I wonder if the SW conversion will still be realtime?

[1] but I am not 100% sure if FFmpeg handles correctly full-range YCbCr as well.... but the second case doesn't seem to me as full-range YCbCr interpreted as limited... actually it looks like the opposite - perhaps out.mp4 is actually limited but in container is incorrectly full-range?

I think using JPEG as the output format is not ideal for these tests, as it is limited to 8-bit, and once we go to 8-bit the whole endeavor is pointless anyway.

MartinPulec commented 1 year ago

I think using JPEG as the output format is not ideal for these tests, as it is limited to 8-bit, and once we go to 8-bit the whole endeavor is pointless anyway.

I used it just to check whether the values are shifted... you can use PNM, in which case it will use 16-bit samples. But then ImageMagick shows the values as percentages, which seemed less clear to me.

alatteri commented 1 year ago

Hi Martin..

I had the formatting wrong in my last reply.... I'll post it here again just to make sure you saw it. Sorry if this is annoying.

So it would be perhaps best to report to Intel.

Sorry to ask this, but would you be able to do that? I don't believe I have the technical knowledge to report it to them in a sensible, concise manner.

Anyways, I will disable X2RGB – the conversions will be done in SW by UG, so it would be less efficient but correct.

Hmm... so what colorspace will be used for R12L --> QSV? I wonder if the SW conversion will still be realtime?

MartinPulec commented 11 months ago

would you be able to do that?

Maybe, but I don't want to make any commitment here – I am not 100% sure whether the problem is really in QSV or somewhere between FFmpeg and QSV. Also, even if it were fixed, the truth is that UG doesn't yet support full-range YCbCr, so it depends on what the decoded output would be. Even the debugging alone might consume some time with an uncertain result. So I may eventually try something, but without promises.

Hmm... so what colorspace will be used for R12L --> QSV?

XV30 - it is a packed 10-bit 4:4:4 YCbCr format.

I wonder if the SW conversion will still be realtime?

Don't know – as always, it depends on the combination of the actual video properties and processing power. If it doesn't perform well enough for you, that might be a motivation to get the x2rgb pixfmt working. If it does, I won't really be interested in solving it, also because if it really is a bug (outside UG), it may get fixed independently.

MartinPulec commented 11 months ago

Well, it was quite difficult to tackle but I think I got it!

The problem is very simple in the end, see above. After struggling with it for a few days, I finally got it – QSV uses BT.601 for the RGB->YCbCr conversion! I computed the values by hand for RGB=[170,0,0] in limited-range YCbCr: BT.709 gives [47,110,202] and BT.601 gives [61,102,202]. Although there is a difference of 2 in Y and Cb against the measured values, I think it is fairly clear that Intel is using BT.601.
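The hand computation can be reproduced with a short Python sketch (mine, not UG code; standard BT.709/BT.601 luma coefficients, truncating to integers – the rounding choice shifts results by ±1, which likely explains the small discrepancies against the values quoted above):

```python
def rgb_to_ycbcr_limited(r, g, b, kr, kb):
    """Full-range 8-bit R'G'B' to limited-range 8-bit Y'CbCr (values truncated)."""
    rn, gn, bn = r / 255.0, g / 255.0, b / 255.0
    kg = 1.0 - kr - kb
    y = kr * rn + kg * gn + kb * bn
    pb = (bn - y) / (2.0 * (1.0 - kb))
    pr = (rn - y) / (2.0 * (1.0 - kr))
    # Limited range: Y in [16, 235], Cb/Cr centered on 128 with a 224-step span
    return (int(16 + 219 * y), int(128 + 224 * pb), int(128 + 224 * pr))

# BT.709: Kr = 0.2126, Kb = 0.0722; BT.601: Kr = 0.299, Kb = 0.114
bt709 = rgb_to_ycbcr_limited(170, 0, 0, 0.2126, 0.0722)
bt601 = rgb_to_ycbcr_limited(170, 0, 0, 0.299, 0.114)
print(bt709)  # (47, 110, 202)
print(bt601)  # (59, 102, 202)
```

The BT.601 result is within a couple of LSBs of the measured [59,104,202], while the BT.709 result clearly doesn't match it.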

Observation #2 is that the encoded stream actually uses limited-range YCbCr, but it (incorrectly?) writes the value of AVCodecContext::color_range (JPEG = full, MPEG = limited) to the metadata, which UG sets to full-range (correctly, I believe, because the codec input is full-range RGB; the limited-range conversion is done only by the codec). But this is currently not an issue for UG, because the decoder doesn't honor the full-range flag for YCbCr (yet).

I may look into it next week – either UG doesn't set the color space or QSV doesn't honor the setting.

alatteri commented 11 months ago

Interesting. Thank you for looking into this. 10-bit full range (0-1023) is the ideal goal.

MartinPulec commented 11 months ago

reported

Interesting. Thank you for looking into this. 10-bit full range (0-1023) is the ideal goal.

It seems to me that Intel is fixed to limited-range YCbCr BT.601, so I guess we will end up with that anyway, but with correct metadata. This won't help for now, because UG's YCbCr<->RGB conversions currently expect BT.709. Unfortunately, the hevc_qsv decoder doesn't offer RGB output (I had thought QSV could convert back to RGB).
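As a sketch of what this matrix mismatch does in practice (my illustration, not UG code): encoding pure red with BT.601 coefficients and then decoding with BT.709 coefficients, as UG's conversions currently would, shifts the color visibly.

```python
def rgb_to_ypbpr(r, g, b, kr, kb):
    """Full-range R'G'B' (0..1) to analog Y'PbPr using luma coefficients (kr, kb)."""
    kg = 1 - kr - kb
    y = kr * r + kg * g + kb * b
    return y, (b - y) / (2 * (1 - kb)), (r - y) / (2 * (1 - kr))

def ypbpr_to_rgb(y, pb, pr, kr, kb):
    """Inverse conversion with (possibly different) luma coefficients."""
    kg = 1 - kr - kb
    r = y + 2 * (1 - kr) * pr
    b = y + 2 * (1 - kb) * pb
    g = (y - kr * r - kb * b) / kg
    return r, g, b

BT601 = (0.299, 0.114)
BT709 = (0.2126, 0.0722)

# Encode pure red with BT.601, decode with BT.709 (the mismatched path):
y, pb, pr = rgb_to_ypbpr(170 / 255, 0, 0, *BT601)
decoded = [round(max(0.0, min(1.0, c)) * 255) for c in ypbpr_to_rgb(y, pb, pr, *BT709)]
print(decoded)  # [185, 16, 0] instead of [170, 0, 0]
```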

alatteri commented 11 months ago

Hi Martin, any thoughts on Intel's reply? https://github.com/intel/cartwheel-ffmpeg/issues/281

MartinPulec commented 2 days ago

This pull request seems to be for a query API. It looks like it would also allow negotiating/setting parameters. For us, the best case would be to keep RGB unconverted and compress it directly. Anyway, it isn't merged yet, and even if it were, it is a question of when the consequent work would follow.

Anyways, to move this slightly forward, I've made a workaround that has 2 parts:

  1. UltraGrid on the sender marks the stream explicitly as limited YCbCr BT.601 (commit e6179ec30)
  2. the information then is extracted by the UltraGrid receiver and the YCbCr->RGB coefficients according to BT.601 are used

There are some limitations – obviously, when the output is displayed as YCbCr, the display itself would need to treat it as BT.601. This can be toggled on for GL/SDL2 but not for DeckLink (I believe it can be set in the metadata, but I doubt the displaying device would honor that). Nevertheless, a converting postprocessor should be possible.
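For illustration, a matching-matrix round trip (a sketch of the workaround's effect, not UG code): when the receiver decodes with the same BT.601 coefficients the encoder used, the original RGB is recovered, which is why signalling the correct matrix fixes the color shift.

```python
BT601_KR, BT601_KB = 0.299, 0.114

def rgb_to_ypbpr(r, g, b, kr, kb):
    """Full-range R'G'B' (0..1) to analog Y'PbPr."""
    kg = 1 - kr - kb
    y = kr * r + kg * g + kb * b
    return y, (b - y) / (2 * (1 - kb)), (r - y) / (2 * (1 - kr))

def ypbpr_to_rgb(y, pb, pr, kr, kb):
    """Inverse conversion with the same coefficients."""
    kg = 1 - kr - kb
    r = y + 2 * (1 - kr) * pr
    b = y + 2 * (1 - kb) * pb
    g = (y - kr * r - kb * b) / kg
    return r, g, b

# Encode and decode pure red with matching BT.601 coefficients:
y, pb, pr = rgb_to_ypbpr(170 / 255, 0, 0, BT601_KR, BT601_KB)
recovered = [round(c * 255) for c in ypbpr_to_rgb(y, pb, pr, BT601_KR, BT601_KB)]
print(recovered)  # [170, 0, 0]
```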

alatteri commented 2 days ago

Hi Martin,

thanks for taking a look at this again.

Would this actually affect the image data? I.e., go from full-range RGB -> limited YCbCr BT.601 -> full-range RGB, or is this just tricking things with metadata? If it is the first case, then it wouldn't matter anyway, as such a manipulation of the image data would likely corrupt the integrity of the image beyond acceptable quality levels.

alatteri commented 2 days ago

Also, would the new Vulkan encoder capabilities in FFmpeg 7.1 overcome this issue while still providing hardware acceleration?

MartinPulec commented 1 day ago

Would this actually affect the image data? I.e., go from full-range RGB -> limited YCbCr BT.601 -> full-range RGB, or is this just tricking things with metadata? If it is the first case, then it wouldn't matter anyway, as such a manipulation of the image data would likely corrupt the integrity of the image beyond acceptable quality levels.

It shouldn't differ from converting via (from/to) limited BT.709. No color shift should occur. In terms of precision loss, 10-bit limited range can certainly hold fewer values than 10-bit full range (not talking about 12 bits), but I'd say there should be at most a ±1 sample difference.
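The ±1 estimate can be checked with a quick brute-force sketch (assumptions mine: 10-bit full range [0, 1023] quantized to the 10-bit limited luma range [64, 940] and back, with rounding):

```python
def full_to_limited(v):
    """10-bit full range [0, 1023] -> 10-bit limited luma range [64, 940]."""
    return round(64 + v * 876 / 1023)

def limited_to_full(v):
    """Inverse mapping back to full range."""
    return round((v - 64) * 1023 / 876)

# Largest round-trip error over all 10-bit full-range values:
max_err = max(abs(limited_to_full(full_to_limited(v)) - v) for v in range(1024))
print(max_err)  # 1
```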

MartinPulec commented 20 hours ago

also, would the new Vulkan encoder capabilities in FFMPEG 7.1 overcome this issue while still providing hardware acceleration?

Hard to tell; I think it uses the same backend. But there may be some difference in the workflow (mainly whether RGB is converted or not). Nevertheless, Vulkan encode cannot be used with UltraGrid right now (it would require some effort).