Closed oviano closed 1 year ago
@xhaihao : fyi, please, comment
I posted a comment on the ffmpeg ticket. Repeating it here: "This seems to be an issue whose fix we have been waiting on for a while so that QSV AV1 encoding support can actually land in the ffmpeg master branch. Please provide details on the underlying stack, i.e. which versions of media-driver, oneVPL, and most importantly oneVPL-intel-gpu are you using?
I hope that you might be missing the following fix in oneVPL-gpu runtime: https://github.com/oneapi-src/oneVPL-intel-gpu/commit/dc7fd15 which is available starting from intel-onevpl-22.6.0."
I also think that filing this at https://github.com/oneapi-src/oneVPL-intel-gpu might be more appropriate, since this is coming from the runtime, not the dispatching library.
As far as I have figured out, the fix was made here in oneVPL-intel-gpu:
https://github.com/oneapi-src/oneVPL-intel-gpu/issues/206
But that was back in April, so it would be great if us Windows users could get access to a driver containing this fix so we can use the AV1 encode capability properly.
@oviano moving this to oneVPL-intel-gpu
I can't reproduce this issue on Linux. It should be a Windows driver related issue.
Implementation details:
ApiVersion: 2.8
Implementation type: HW
AccelerationMode via: VAAPI
Path: /usr/lib/x86_64-linux-gnu/libmfx-gen.so.1.2.8
Encoding out.nv12 -> out.av1
Input colorspace: NV12
bitsream.DecodeTimeStamp = 0 bitsream.TimeStamp = 0
bitsream.DecodeTimeStamp = 3000 bitsream.TimeStamp = 3000
bitsream.DecodeTimeStamp = 6000 bitsream.TimeStamp = 6000
bitsream.DecodeTimeStamp = 9000 bitsream.TimeStamp = 9000
bitsream.DecodeTimeStamp = 12000 bitsream.TimeStamp = 12000
bitsream.DecodeTimeStamp = 15000 bitsream.TimeStamp = 15000
bitsream.DecodeTimeStamp = 18000 bitsream.TimeStamp = 18000
bitsream.DecodeTimeStamp = 21000 bitsream.TimeStamp = 21000
bitsream.DecodeTimeStamp = 24000 bitsream.TimeStamp = 24000
...
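The log above shows the behaviour one would expect from AV1: DecodeTimeStamp equals TimeStamp on every packet, and both advance by 3000 ticks per frame. oneVPL timestamps are in 90 kHz units, so a 3000-tick step is consistent with 30 fps (the frame rate is an assumption inferred from the step, it is not stated in the thread). A minimal sketch of a check one could run over logged values follows; PacketTimes, expected_ticks and first_bad_packet are hypothetical names for illustration and are not part of oneVPL or FFmpeg.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record of one encoded packet's timestamps (not a oneVPL type).
struct PacketTimes {
    std::int64_t dts;  // DecodeTimeStamp, in 90 kHz ticks
    std::int64_t pts;  // TimeStamp, in 90 kHz ticks
};

// oneVPL expresses bitstream timestamps in units of a 90 kHz clock.
constexpr std::int64_t kClockHz = 90000;

// Expected timestamp of frame n (0-based) at fps_num/fps_den frames per second.
constexpr std::int64_t expected_ticks(std::int64_t n,
                                      std::int64_t fps_num,
                                      std::int64_t fps_den) {
    return n * kClockHz * fps_den / fps_num;
}

// Returns the index of the first packet whose DTS differs from its PTS, or
// whose PTS is not the expected value for its position; returns -1 if every
// packet looks consistent. Assumes no frame re-ordering, as stated for AV1
// in this thread, and a constant frame rate (an assumption of this sketch).
inline int first_bad_packet(const std::vector<PacketTimes>& pkts,
                            std::int64_t fps_num,
                            std::int64_t fps_den) {
    for (std::size_t i = 0; i < pkts.size(); ++i) {
        const std::int64_t want =
            expected_ticks(static_cast<std::int64_t>(i), fps_num, fps_den);
        if (pkts[i].dts != pkts[i].pts || pkts[i].pts != want) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

Feeding it the values from the Linux log with fps_num = 30 and fps_den = 1 should report no bad packet; a run affected by the timestamp bug would be flagged at the first packet where the two timestamps diverge.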
The fix for AV1 encoder timestamp issue doesn't exist in Windows Arc driver 31.0.101.3802.
HandBrake uses ffmpeg libs too, so it is also affected.
Any ETA we can get a new Windows driver release for this issue?
The first Arc GPUs launched in late March and this bug has existed from the beginning (yes, I tried older drivers). 8 months later it's still not fixed. This is not acceptable.
AV1 was one of the major selling points for Arc dGPUs. Apparently it has been fixed since April for Linux; maybe Windows is not a top priority and they just don't care or didn't properly test it. rigaya from QSVEnc fixed it in a couple of days with a workaround; you would think Intel could do the same. HandBrake nightly has suffered from the same root cause from the beginning.
I wouldn't bet on the new driver. Sure, the fix is ready, but Intel doesn't seem to care; I mean, they could have released a bugfix driver a long time ago if they really wanted to release it asap. There seems to be a big delay until a fix is shipped in the public driver.
According to this response to my post in the community forum, the driver seems unlikely before January.
This is a big fail, isn't it? There is not much use for Arc AV1 on Windows, with one exception, which is QSVEnc. HandBrake AV1 is completely broken and ffmpeg only works when bframes are off. It's not like Arc just launched; as I said, the first version of Arc has been public for 8 months.
It's a bit of a missed opportunity in my view.
Currently something like the A380 should be really tempting for encoder/streamers as it's the only dual-slot hardware accelerated AV1 solution until NVIDIA release cheaper/smaller 4-series cards next year. Yet they're letting the whole thing down by only taking the Linux drivers seriously, from what I can tell.
(Having said that, I've been using my FFmpeg patch a lot these last few days and it does work just fine, so there is a workaround which would also apply to Handbrake, if you wanted it)
Can you send me your patched ffmpeg?
Thanks, I will try it out when I have time.
Only caveat is that there may be other issues with the Windows driver, but I'm not 100% sure yet:
By the way, is there an AV1 option or profile where I can choose between 8 bit or 10 bit? I haven't found one.
If you mean encoding an 8 bit source as 10 bit, I think you would need to convert it to 10 bit first using something like -vf scale_qsv=format=p010 in the FFmpeg command line. This would then pass each frame as 10 bit and it would get encoded as 10 bit.
I think NVENC AV1 might have an explicit option to encode an 8 bit source using 10 bit precision, without converting it first, but haven't seen the same for QSV.
With QSVEnc I have output-depth 8 and output-depth 10, Handbrake also offers both 8bit and 10bit encoding regardless of the input source. There is no such option for ffmpeg.
Yes, that's correct. I think there are a few missing options in Intel's FFmpeg AV1 implementation. Another one is "scenario".
By the way, even with my timestamp FFmpeg patch, I'm struggling to get QSV to encode in AV1 with a quality that isn't slightly worse than HEVC.
I've been posting my results in this thread:
https://github.com/oneapi-src/oneVPL/issues/79
I do not know if this is a Windows-driver-specific thing, or whether the Linux AV1 QSV encodes are sub-par too because I don't have a Linux machine to try out equivalent commands.
My tests with an NVENC 4090 showed a significant improvement in quality under AV1 vs HEVC on that card.
@xhaihao : did you add/verify 10-bit support for the AV1 ffmpeg-qsv encoder? If yes, can you please provide a sample cmdline and answer the above question regarding the profile (that is, whether any special profile for AV1 is needed to encode 10-bit with the av1 qsv encoder)?
From my tests the AV1 scores are fine with PSNR and SSIM; VMAF scores, however, are quite a bit below HEVC in most cases. Sometimes bframes 1 or 3 give me higher VMAF scores, but from my subjective impression I would prefer bframes 7 (dist 8). Not sure if VMAF is that great because there is no chroma component, it's luma component only. I wouldn't blindly trust VMAF over other metrics. 3-component SSIM works quite well imho for Quicksync/x265; to me it's closer to my subjective testing.
Rigaya tested Arc and RTX 4080, in his tests 10 bit AV1 is quite a bit better than 8 bit on Arc. Does ffmpeg only support 8 bit AV1 right now?
FFmpeg NVENC AV1 has this option:
{ "highbitdepth", "Enable 10 bit encode for 8 bit input", OFFSET(highbitdepth), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1, VE },
There isn't an explicit option like this for FFmpeg QSVENC, but as you can see, @dvrogozh has asked for clarification as to how it works in QSVENC (maybe specifying profile main10 is enough, for example, who knows).
I don't know about the other FFmpeg AV1 encoders (aomenc, svt-av1, rav1e etc)....
My other problem is the overall stability of the Windows driver. On both machines I am currently testing, it frequently crashes the whole PC and so far I've only managed to make it produce one memory dump file. I'm dealing with that elsewhere in the Intel support forums, but I'm at the point of giving up on the Windows version of QSVENC ARC, especially when I read things from Intel engineers responding to the timestamp issue with "don't expect a driver release before January"....well in that case maybe I'll just RMA these devices and check back in 6 months....
3802 is unstable, it's crashing for me too. 3491 or 3490 was more reliable.
Right, that's useful to know. I'll try going back to an earlier driver. It's all a bit of a sorry state of affairs though tbh. I really want to like these devices, the hardware seems really nice....but I don't know if I arrived late to the NVIDIA cards when they'd already gone through this process, but I've never had any issues like the numerous ones I've had with these cards.
According to AV1 spec, 10bit and 8bit 4:2:0 share the same profile.
seq_profile | Bit depth | Monochrome support | Chroma subsampling
0           | 8 or 10   | Yes                | YUV 4:2:0
So we can't use profile to distinguish 10bit from 8bit.
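To illustrate why the profile cannot carry the bit depth, here is a minimal sketch of the mapping from bit depth and chroma format to seq_profile. Only the seq_profile 0 row is quoted in this thread; the rules for profiles 1 and 2 are filled in from the AV1 specification's profile definitions as I understand them, and Chroma and seq_profile_for are hypothetical names for illustration, not FFmpeg or oneVPL identifiers.

```cpp
// Hypothetical chroma-format enum for this sketch (not an FFmpeg/oneVPL type).
enum class Chroma { Mono, YUV420, YUV422, YUV444 };

// Returns the AV1 seq_profile needed for the given bit depth and chroma
// format, or -1 if the bit depth is not one AV1 supports (8, 10 or 12).
//   seq_profile 0: 8 or 10 bit, 4:2:0 or monochrome
//   seq_profile 1: 8 or 10 bit, 4:4:4
//   seq_profile 2: 4:2:2 at any supported depth, or anything at 12 bit
inline int seq_profile_for(int bit_depth, Chroma chroma) {
    if (bit_depth != 8 && bit_depth != 10 && bit_depth != 12) {
        return -1;
    }
    if (bit_depth == 12 || chroma == Chroma::YUV422) {
        return 2;
    }
    if (chroma == Chroma::YUV444) {
        return 1;
    }
    return 0;  // 8 or 10 bit, 4:2:0 or monochrome
}
```

Both seq_profile_for(8, Chroma::YUV420) and seq_profile_for(10, Chroma::YUV420) evaluate to 0, which is the point made above: the encoder has to take the bit depth from the input format, because the profile value is identical for the two cases.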
It will use 10bit encoding if the input is 10bit
$ ffmpeg -y -f lavfi -i testsrc -vf "format=p010" -c:v av1_qsv -vframes 100 out.mp4
...
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder : Lavf59.34.102
  Stream #0:0: Video: av1 (av01 / 0x31307661), p010le(tv, progressive), 320x240 [SAR 1:1 DAR 4:3], q=2-31, 1000 kb/s, 25 fps, 12800 tbn
    Metadata:
      encoder : Lavc59.54.100 av1_qsv
...
The output is 8bit if the input is 8bit
$ ffmpeg -y -f lavfi -i testsrc -vf "format=nv12" -c:v av1_qsv -vframes 100 out.mp4
...
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder : Lavf59.34.102
  Stream #0:0: Video: av1 (av01 / 0x31307661), nv12(tv, progressive), 320x240 [SAR 1:1 DAR 4:3], q=2-31, 1000 kb/s, 25 fps, 12800 tbn
    Metadata:
      encoder : Lavc59.54.100 av1_qsv
In the oneVPL API there are TargetBitdepthLuma and TargetBitdepthChroma settings (extco3). Are these meant to convert an 8 bit input to 10 bit? I tried enabling these options in my local FFmpeg build, but even if they do the conversion, I couldn't see how the metadata would then be propagated to the container in FFmpeg: the AVCodecContext would still have the input pixel format, which would then get written to the stream/container. So maybe these options are unsuitable for FFmpeg-qsv.
Obviously you can use a filter as you have shown.
In the oneVPL API there are TargetBitdepthLuma and TargetBitdepthChroma settings (extco3). Are these meant to convert an 8 bit input to 10 bit?
These are meant to encode with a colorspace other than what is on the input. The idea was to support cases such as 12bit -> 10bit or 10bit -> 8bit without including video processing explicitly (i.e. as a standalone component). I don't think that 8bit -> 10bit was the intended use case at all.
I am not sure why you are trying to look into that considering that @xhaihao clarified that AV1 ffmpeg-qsv does support 10-bit input without the need for the special profile? If you have 10-bit input, you should get 10-bit encoded bitstream. That was the question, right?
If the intent is to encode 8bit from a 10bit input, I would suggest adding an explicit color conversion before the encoder. We can ask @xhaihao to help with the command line for that. This will be a much better tested path than TargetBitdepthLuma/TargetBitdepthChroma. Besides, TargetBitdepthLuma and TargetBitdepthChroma are not supported at the ffmpeg level as of now.
I looked into it before @xhaihao's response. Just curious! Thanks for the explanation.
The timestamp issue seems to be fixed in the new driver.
That's great. Any improvement in stability for you?
No issue so far but I haven't tried much, too early to say. This is the first driver with API 2.08 support by the way. The stutter issue in Handbrake is also gone.
Closing this now as the issue is resolved with the new driver.
Was about to let you guys know that a new beta driver was posted, but you figured that out yourselves. Just confirming that I double-checked internally and the fix for the issue I was talking about above (https://github.com/oneapi-src/oneVPL-intel-gpu/commit/dc7fd15 in terms of the open source vpl runtime) is actually included in this driver (31.0.101.3959).
Thanks 🙏
I modified hello_encode.cpp to output an AV1 bitstream instead and discovered that the timestamps produced by the encoder are incorrect.
For AV1, PTS and DTS should be the same since there is no re-ordering. I am using an ARC A770 with the latest driver 3820, and oneVPL 2022.2.5 release.
I am trying to solve this bug in FFmpeg, caused by the same issue:
https://trac.ffmpeg.org/ticket/10062
Any help appreciated.
Below is my modified file:
... and here is the output produced ...