haasn / libplacebo

Official mirror of libplacebo
GNU Lesser General Public License v2.1

Rendering zero copy HDR AVFrames decoded via Vulkan #272

Open gitoss opened 1 week ago

gitoss commented 1 week ago

This issue is very similar to the NVIDIA-only ticket https://github.com/haasn/libplacebo/issues/237, i.e. it's the same combination of HDR source and Vulkan decoding.

I'm on an AMD Ryzen 7540U w/ RDNA3 740M, driver 23.11.1 (newer Adrenalin versions have a horrible memory leak bug).

ffmpeg -hwaccel vulkan -an -i %INPUT% -vf "libplacebo=tonemapping=hable:format=nv12,hwdownload,format=nv12" -f null - -benchmark

This is significantly SLOWER on newer ffmpeg versions than on old ones. I tried to track down when the change occurred: it was between 2023-04-30 and 2023-05-31, because it works up to 'ffmpeg version N-111869-g7aa71ab5c0-20230831'. Builds are from https://github.com/BtbN/FFmpeg-Builds/releases
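When narrowing a regression down across nightly builds, it can help to order the builds by the date embedded in their version string. The following is a minimal sketch, assuming the `N-<number>-g<hash>-<YYYYMMDD>` layout seen in the build name quoted above; the names `Build` and `parse_build` are hypothetical helpers, not part of ffmpeg.

```python
import re
from datetime import date, datetime
from typing import NamedTuple


class Build(NamedTuple):
    revision: int      # the number after "N-" (assumed to be a commit counter)
    short_hash: str    # abbreviated git hash after the "g"
    built_on: date     # trailing YYYYMMDD stamp, as used in the quoted build name


# Assumed layout: N-<number>-g<hex hash>-<YYYYMMDD>
_VERSION_RE = re.compile(r"N-(\d+)-g([0-9a-f]+)-(\d{8})")


def parse_build(version: str) -> Build:
    """Extract revision, hash and build date from a version string."""
    match = _VERSION_RE.search(version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    revision, short_hash, stamp = match.groups()
    return Build(int(revision), short_hash,
                 datetime.strptime(stamp, "%Y%m%d").date())


build = parse_build("ffmpeg version N-111869-g7aa71ab5c0-20230831")
print(build.built_on.isoformat())  # prints 2023-08-31
```

Sorting a list of such `Build` tuples by `built_on` (or `revision`) gives an order in which to test builds when looking for the first slow one.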

Vulkan decoding (i.e. zero copy) gives 2x the fps of software decoding. Hardware decoding with dxva2/d3d11va w/ hwupload gives only 1.5x the fps of software decoding.

Btw, adding '-hwaccel_output_format vulkan' is only accepted up to these older ffmpeg versions; on newer versions like 'ffmpeg version N-111869-g7aa71ab5c0-20230831' these errors occur:

[libplacebo @ 000000001d8ee4c0] Masking sampleable from wrapped texture because the corresponding format 'rx10' does not support PL_FMT_CAP_SAMPLEABLE
[libplacebo @ 000000001d8ee4c0] Masking blit_src from wrapped texture because the corresponding format 'rx10' does not support PL_FMT_CAP_BLITTABLE
[libplacebo @ 000000001d8ee4c0] Masking sampleable from wrapped texture because the corresponding format 'rxgx10' does not support PL_FMT_CAP_SAMPLEABLE
[libplacebo @ 000000001d8ee4c0] Masking blit_src from wrapped texture because the corresponding format 'rxgx10' does not support PL_FMT_CAP_BLITTABLE
[libplacebo @ 000000001d8ee4c0] Masking sampleable from wrapped texture because the corresponding format 'rx10' does not support PL_FMT_CAP_SAMPLEABLE
[libplacebo @ 000000001d8ee4c0] Masking blit_src from wrapped texture because the corresponding format 'rx10' does not support PL_FMT_CAP_BLITTABLE
[libplacebo @ 000000001d8ee4c0] Masking sampleable from wrapped texture because the corresponding format 'rxgx10' does not support PL_FMT_CAP_SAMPLEABLE
[libplacebo @ 000000001d8ee4c0] Masking blit_src from wrapped texture because the corresponding format 'rxgx10' does not support PL_FMT_CAP_BLITTABLE
[libplacebo @ 000000001d8ee4c0] Validation failed: (image->planes[i]).texture->params.sampleable (src/renderer.c:2704)
[libplacebo @ 000000001d8ee4c0] Backtrace:
[libplacebo @ 000000001d8ee4c0] #0 0x7ff66c7a7b9f in FT_Set_Default_Log_Handler+0xaa9df (ffmpeg.exe+0x14e7b9f) (0x1414e7b9f)
[libplacebo @ 000000001d8ee4c0] #1 0x7ff66c7ab6c8 in pl_render_image+0xa8 (ffmpeg.exe+0x14eb6c8) (0x1414eb6c8)
[libplacebo @ 000000001d8ee4c0] #2 0x7ff66c7acb63 in pl_render_image_mix+0x1383 (ffmpeg.exe+0x14ecb63) (0x1414ecb63)
[libplacebo @ 000000001d8ee4c0] #3 0x7ff66b42173c (ffmpeg.exe+0x16173c) (0x14016173c)
[libplacebo @ 000000001d8ee4c0] #4 0x7ff66b42269a (ffmpeg.exe+0x16269a) (0x14016269a)
[libplacebo @ 000000001d8ee4c0] #5 0x7ff66b32888b (ffmpeg.exe+0x6888b) (0x14006888b)
[libplacebo @ 000000001d8ee4c0] #6 0x7ff66b32d62f (ffmpeg.exe+0x6d62f) (0x14006d62f)
[libplacebo @ 000000001d8ee4c0] #7 0x7ff66b2d338f (ffmpeg.exe+0x1338f) (0x14001338f)
[libplacebo @ 000000001d8ee4c0] #8 0x7ff66b2ec158 (ffmpeg.exe+0x2c158) (0x14002c158)
[libplacebo @ 000000001d8ee4c0] #9 0x7ff66cf924ca in FT_Get_PS_FontValue+0x8229a (ffmpeg.exe+0x1cd24ca) (0x141cd24ca)
[libplacebo @ 000000001d8ee4c0] #10 0x7ff83eefe633 in beginthreadex+0x133 (C:\Windows\System32\msvcrt.dll+0x3e633) (0x11013e633)
[libplacebo @ 000000001d8ee4c0] #11 0x7ff83eefe70b in endthreadex+0xab (C:\Windows\System32\msvcrt.dll+0x3e70b) (0x11013e70b)
[libplacebo @ 000000001d8ee4c0] #12 0x7ff83e19257c in BaseThreadInitThunk+0x1c (C:\Windows\System32\KERNEL32.DLL+0x1257c) (0x18001257c)
[libplacebo @ 000000001d8ee4c0] #13 0x7ff83ff8aa47 in RtlUserThreadStart+0x27 (C:\Windows\SYSTEM32\ntdll.dll+0x5aa47) (0x18005aa47)
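The repeated "Masking ..." messages in this log are easier to read when grouped by format. A minimal sketch, assuming the message wording stays exactly as in the log above; `masked_caps` is a hypothetical helper, not a libplacebo or ffmpeg API.

```python
import re
from collections import defaultdict
from typing import Dict, Set

# Matches the "Masking <usage> from wrapped texture ..." lines as they appear
# in the log above; the wording is taken from that log, not from a spec.
_MASK_RE = re.compile(
    r"Masking (\w+) from wrapped texture because the corresponding format "
    r"'(\w+)' does not support (PL_FMT_CAP_\w+)"
)


def masked_caps(log: str) -> Dict[str, Set[str]]:
    """Map each texture format to the capabilities reported as unsupported."""
    result: Dict[str, Set[str]] = defaultdict(set)
    for _usage, fmt, cap in _MASK_RE.findall(log):
        result[fmt].add(cap)
    return dict(result)


sample = (
    "[libplacebo @ 000000001d8ee4c0] Masking sampleable from wrapped texture "
    "because the corresponding format 'rx10' does not support PL_FMT_CAP_SAMPLEABLE "
    "[libplacebo @ 000000001d8ee4c0] Masking blit_src from wrapped texture "
    "because the corresponding format 'rxgx10' does not support PL_FMT_CAP_BLITTABLE"
)
for fmt, caps in sorted(masked_caps(sample).items()):
    print(fmt, sorted(caps))
```

Because it uses `findall`, the helper works whether the log arrives one message per line or as a single run of text.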

haasn commented 1 week ago

cc @cyanreg is there a way to control whether or not to use planar formats?

gitoss commented 1 week ago

For what it's worth, I've even tried to get dxva2 and d3d11va hw decoding to work w/o a CPU path to libplacebo; both fail (-extra_hw_frames doesn't help). The new d3d12va, which is supposed to be more zero-copy ready, is completely broken on my system w/ current ffmpeg.

ffmpeg -init_hw_device "vulkan=vk:0" -filter_hw_device vk -hwaccel d3d11va -hwaccel_output_format d3d11 -an -i %INPUT% -vf "hwmap=derive_device=vk,format=vulkan,libplacebo=tonemapping=hable:format=nv12,hwdownload,format=nv12" -f null - -benchmark

It seems the only way to get zero copy with libplacebo is Vulkan decoding.

cyanreg commented 1 week ago

I saw a lot of text, but no issue stated. What is wrong?

@haasn -init_hw_device "vulkan=vk:0,disable_multiplane=1"
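For anyone wanting to try this option against the original benchmark, here is a rough sketch of how the device string could be assembled and slotted into the command from the first post. Placing `-init_hw_device` in front of `-hwaccel vulkan`, and the decoder then using that device, is an assumption here and not something confirmed in this thread; `vulkan_device_arg` and `benchmark_command` are hypothetical helper names.

```python
from typing import Dict, List


def vulkan_device_arg(name: str, index: int, options: Dict[str, str]) -> str:
    """Build a 'vulkan=<name>:<index>[,key=value...]' device string."""
    arg = f"vulkan={name}:{index}"
    for key, value in options.items():
        arg += f",{key}={value}"
    return arg


def benchmark_command(input_path: str, device_arg: str) -> List[str]:
    """Argument list mirroring the benchmark from the first post.

    Assumption: the device is created up front with -init_hw_device and
    the Vulkan hwaccel then uses it.
    """
    return [
        "ffmpeg",
        "-init_hw_device", device_arg,
        "-hwaccel", "vulkan",
        "-an", "-i", input_path,
        "-vf", "libplacebo=tonemapping=hable:format=nv12,hwdownload,format=nv12",
        "-f", "null", "-", "-benchmark",
    ]


device = vulkan_device_arg("vk", 0, {"disable_multiplane": "1"})
print(device)  # prints vulkan=vk:0,disable_multiplane=1
print(" ".join(benchmark_command("input.mkv", device)))
```

The list form can be passed straight to `subprocess.run` without shell quoting; `input.mkv` is a placeholder file name.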

gitoss commented 1 week ago

> I saw a lot of text, but no issue stated. What is wrong?
>
> @haasn -init_hw_device "vulkan=vk:0,disable_multiplane=1"

"This is significantly SLOWER on newer ffmpeg versions than on old ones."

cyanreg commented 1 week ago

Vulkan decoding wasn't even merged for the command line you posted to work.

gitoss commented 1 week ago

> Vulkan decoding wasn't even merged for the command line you posted to work.

Could you please elaborate?

The command line I specified is simplified from what I use for libplacebo tonemapping, but I compared it in the current and an old ffmpeg version (versions in the OP), and there's a significant speed difference.

Judging from the issues here, this seems to be something that not only I experienced, around the time Vulkan in ffmpeg moved to 1.3.