haasn / libplacebo

Official mirror of libplacebo
http://libplacebo.org/
GNU Lesser General Public License v2.1

FFMpeg compatibility - A follow-up #172

Closed by Chipcraft 1 year ago

Chipcraft commented 1 year ago

As requested by kasper93, here is a follow-up to https://github.com/haasn/libplacebo/issues/170 on the current FFmpeg / libplacebo issues.

First of all, the issues are hardware agnostic.

Vulkan Hardware Capability Viewer 3.30 (Features > Core 1.3)

AMD: dynamicRendering = True
Intel: dynamicRendering = True
NVIDIA: dynamicRendering = True

///////////////////////////////////////////

So, the situation with the current FFMpeg master e8e486332571347dd55822c842ba67276ac308e2 with libplacebo master 20d63f7335174cb54eb3582f406b4843771760e6:

Libplacebo:

1) meson setup --prefix=/home/Username/ExtLibs --buildtype=release -Dtests=true --default-library=shared BuildFolder
2) ninja -C BuildFolder install
3) ninja -C BuildFolder test

 1/13 colorspace.c                 OK              0.07s
 2/13 common.c                     OK              0.05s
 3/13 filters.c                    OK              0.05s
 4/13 string.c                     OK              0.04s
 5/13 tone_mapping.c               OK              0.03s
 6/13 utils.c                      OK              0.03s
 7/13 icc.c                        OK              0.02s
 8/13 dav1d.c                      OK              0.02s
 9/13 dither.c                     OK              0.05s
10/13 opengl_surfaceless.c         SKIP            0.02s   exit status 77
11/13 lut.c                        OK              0.05s
12/13 dummy.c                      OK              0.10s
13/13 d3d11.c                      OK             18.62s

FFMpeg:

1) PKG_CONFIG_PATH=/home/Username/ExtLibs/lib/pkgconfig ./configure --enable-nvenc --enable-libplacebo --extra-libs="-lstdc++"
2) make -j 64
3) ffmpeg -i Input.mkv -init_hw_device vulkan -vf libplacebo=downscaler=lanczos:w=iw/2:h=ih/2 -frames:v 1 -update 1 Output.png

[libplacebo @ 0000025f43ff08c0] Missing device feature: dynamicRendering
[libplacebo @ 0000025f43ff08c0] Imported Vulkan device was not created with all required features!
[libplacebo @ 0000025f43ff08c0] Failed importing vulkan device
[libplacebo @ 0000025f43ff08c0] Failed importing Vulkan device!
[Parsed_libplacebo_0 @ 0000025f4548ef00] Query format failed for 'Parsed_libplacebo_0': Generic error in an external library
Error reinitializing filters!
Failed to inject frame into filter network: Generic error in an external library
Conversion failed!

With libplacebo at > cedacbfbc96c2dbc9ccba0cda8b2392d618d1fc0 && <= fd20dba8435a0d16430bf90d45be3a43aaae1a01 the emitted error is:

[libplacebo @ 0000017d0d2af540] Imported Vulkan device was not created with all required features!
[libplacebo @ 0000017d0d2af540] Failed importing vulkan device
[libplacebo @ 0000017d0d2af540] Failed importing Vulkan device!
[Parsed_libplacebo_0 @ 0000017d0ebdef00] Query format failed for 'Parsed_libplacebo_0': Generic error in an external library
Error reinitializing filters!
Failed to inject frame into filter network: Generic error in an external library
Conversion failed!

The current error (identical to the output shown above for FFmpeg master, including the "Missing device feature: dynamicRendering" line) was introduced with libplacebo 9f35ff1ad6a95245659a0935d7a20a3543814838 and is still present at 20d63f7335174cb54eb3582f406b4843771760e6 (master).

The error is the same regardless of which functionality is used (e.g., downscaler, upscaler, tone mapping, etc.).

Based on testing, cedacbfbc96c2dbc9ccba0cda8b2392d618d1fc0 is the last commit that works without Vulkan 1.3 being available, even though the commit comments of 1e0b01ea4ad9b0010a3389c2d2894244956b854e state that the requirement for Vulkan 1.3 has been reverted. Apparently, the actual issue is that even the current master of FFmpeg is limited to Vulkan 1.2: https://github.com/haasn/libplacebo/commit/cfd87b21f18e52729296f866369ead486f1ce10f#commitcomment-110290665. That being said, I haven't been able to find any patches for FFmpeg that would get rid of this limitation.

cedacbfbc96c2dbc9ccba0cda8b2392d618d1fc0 can be used with the current FFmpeg master e8e486332571347dd55822c842ba67276ac308e2; however, that seems to result in a memory leak. Functions like downscaler / upscaler will generally work; tone mapping, however, will fail as soon as it runs out of memory. Also, all of the GPU's dedicated VRAM is allocated from the get-go, whereas the allocation is normally a couple of GB.

For example, on a 4070 Ti with 12GB of VRAM, the tone mapping function (-init_hw_device vulkan -vf libplacebo=colorspace=bt709:color_primaries=bt709:color_trc=bt709:tonemapping=hable) produces the following after a couple of minutes of running:

[libplacebo @ 000001bc28406f40] Allocation of size  256M failed: VK_ERROR_OUT_OF_DEVICE_MEMORY!
[libplacebo @ 000001bc28406f40] Memory heaps supported by device:
[libplacebo @ 000001bc28406f40]     0: flags 0x1 size   11G
[libplacebo @ 000001bc28406f40]     1: flags 0x0 size   15G
[libplacebo @ 000001bc28406f40] Memory pool 0:
[libplacebo @ 000001bc28406f40]     Compatible types: 0x3
[libplacebo @ 000001bc28406f40]     Optimal flags: 0x1
[libplacebo @ 000001bc28406f40]     Slab  0:        0 x   14M:   35M used   56M res   56M alloc from heap 0, efficiency 63.33%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab  1:        0 x 3840K:   30M used   30M res   30M alloc from heap 0, efficiency 100.00%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab  2: fffffffffffffff0 x  4096:   16K used   16K res  256K alloc from heap 0, efficiency 100.00%  [../src/shaders/lut.c:500]
[libplacebo @ 000001bc28406f40]     Slab  3: fffffffe x  8192:  8192 used  8192 res  256K alloc from heap 0, efficiency 100.00%  [../src/shaders/lut.c:500]
[libplacebo @ 000001bc28406f40]     Slab  4:        e x  864K:  864K used  864K res 3456K alloc from heap 0, efficiency 100.00%  [../src/shaders/lut.c:500]
[libplacebo @ 000001bc28406f40]     Slab  5:     fffe x   16K:   16K used   16K res  256K alloc from heap 0, efficiency 100.00%  [../src/shaders/lut.c:500]
[libplacebo @ 000001bc28406f40]     Slab  6:        0 x   14M:   81M used  112M res  112M alloc from heap 0, efficiency 72.50%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab  7:        0 x 3840K:  120M used  120M res  120M alloc from heap 0, efficiency 100.00%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab  8:        0 x   14M:  225M used  225M res  225M alloc from heap 0, efficiency 100.00%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab  9:        0 x 3840K:  240M used  240M res  240M alloc from heap 0, efficiency 100.00%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 10:        0 x   14M:  253M used  256M res  256M alloc from heap 0, efficiency 98.88%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 11:        0 x   14M:  222M used  256M res  256M alloc from heap 0, efficiency 86.79%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 12:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 13:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 14:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 15:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 16:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 17:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 18:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 19:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 20:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 21:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 22:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 23:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 24:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 25:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 26:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 27:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 28:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 29:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 30:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 31:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 32:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 33:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 34:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 35:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 36:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 37:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 38:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 39:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 40:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 41:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 42:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 43:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 44:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 45:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 46:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 47:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 48:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 49:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 50:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Slab 51:        0 x   14M:  129M used  256M res  256M alloc from heap 0, efficiency 50.54%  [../src/utils/upload.c:245]
[libplacebo @ 000001bc28406f40]     Pool summary: 6383M used   11G res   11G alloc, efficiency 55.33%, utilization 99.97%
[libplacebo @ 000001bc28406f40] Memory pool 1:
[libplacebo @ 000001bc28406f40]     Compatible types: 0xffffffff
[libplacebo @ 000001bc28406f40]     Required flags: 0x1
[libplacebo @ 000001bc28406f40]     Optimal flags: 0x2
[libplacebo @ 000001bc28406f40]     Buffer flags: 0xc3
[libplacebo @ 000001bc28406f40]     Slab  0: 3ffffffffff x  6144:     0 used  4096 res  256K alloc from heap 0, efficiency 0.00%  [../src/gpu/utils.c:1093]
[libplacebo @ 000001bc28406f40]     Pool summary:     0 used  4096 res  256K alloc, efficiency 0.00%, utilization 1.56%
[libplacebo @ 000001bc28406f40] Memory pool 2:
[libplacebo @ 000001bc28406f40]     Compatible types: 0xffffffff
[libplacebo @ 000001bc28406f40]     Optimal flags: 0x3
[libplacebo @ 000001bc28406f40]     Buffer flags: 0x3
[libplacebo @ 000001bc28406f40]     Slab  0: 3ffffffffff x  6144:     0 used  4096 res  256K alloc from heap 0, efficiency 0.00%  [../src/gpu/utils.c:471]
[libplacebo @ 000001bc28406f40]     Slab  1:        f x  585K:     0 used     0 res 2340K alloc from heap 0, efficiency 100.00%  [../src/gpu/utils.c:471]
[libplacebo @ 000001bc28406f40]     Slab  2:     3fff x   18K:     0 used  4096 res  256K alloc from heap 0, efficiency 0.00%  [../src/gpu/utils.c:471]
[libplacebo @ 000001bc28406f40]     Pool summary:     0 used  8192 res 2852K alloc, efficiency 0.00%, utilization 0.28%
[libplacebo @ 000001bc28406f40] Memory pool 3:
[libplacebo @ 000001bc28406f40]     Compatible types: 0xffffffff
[libplacebo @ 000001bc28406f40]     Required flags: 0x3
[libplacebo @ 000001bc28406f40]     Optimal flags: 0x2
[libplacebo @ 000001bc28406f40]     Buffer flags: 0x23
[libplacebo @ 000001bc28406f40]     Slab  0: 3ffffffffff x  6144:     0 used  4096 res  256K alloc from heap 0, efficiency 0.00%  [../src/shaders/colorspace.c:1197]
[libplacebo @ 000001bc28406f40]     Pool summary:     0 used  4096 res  256K alloc, efficiency 0.00%, utilization 1.56%
[libplacebo @ 000001bc28406f40] Memory pool 4:
[libplacebo @ 000001bc28406f40]     Compatible types: 0xffffffff
[libplacebo @ 000001bc28406f40]     Required flags: 0x1
[libplacebo @ 000001bc28406f40]     Optimal flags: 0x2
[libplacebo @ 000001bc28406f40]     Buffer flags: 0x13
[libplacebo @ 000001bc28406f40]     Slab  0: 3fffffffffe x  6144:  3072 used   10K res  256K alloc from heap 0, efficiency 30.00%  [../src/dispatch.c:954]
[libplacebo @ 000001bc28406f40]     Pool summary:  3072 used   10K res  256K alloc, efficiency 30.00%, utilization 3.91%
[libplacebo @ 000001bc28406f40] Memory summary: 6383M used   11G res   11G alloc, efficiency 55.33%, utilization 99.94%, max page:  750M
[libplacebo @ 000001bc28406f40]   Backtrace:
[libplacebo @ 000001bc28406f40]     #0  0x7fffc7422a73 in vk_malloc_print_stats+0x1983 (C:\Users\Username\Desktop\FFMpeg_Test\libplacebo-271.dll+0x92a73) (0x1d6202a73)
[libplacebo @ 000001bc28406f40]     #1  0x7fffc7424045 in vk_malloc_slice+0x6c5 (C:\Users\Username\Desktop\FFMpeg_Test\libplacebo-271.dll+0x94045) (0x1d6204045)
[libplacebo @ 000001bc28406f40]     #2  0x7fffc741b8ec in vk_tex_create+0x7dc (C:\Users\Username\Desktop\FFMpeg_Test\libplacebo-271.dll+0x8b8ec) (0x1d61fb8ec)
[libplacebo @ 000001bc28406f40]     #3  0x7fffc73ac2f0 in pl_tex_recreate+0x200 (C:\Users\Username\Desktop\FFMpeg_Test\libplacebo-271.dll+0x1c2f0) (0x1d618c2f0)
[libplacebo @ 000001bc28406f40]     #4  0x7fffc73eb973 in pl_upload_plane+0xb3 (C:\Users\Username\Desktop\FFMpeg_Test\libplacebo-271.dll+0x5b973) (0x1d61cb973)
[libplacebo @ 000001bc28406f40]     #5  0x7ff62f46156c (C:\Users\Username\Desktop\FFMpeg_Test\ffmpeg.exe+0x13156c) (0x14013156c)
[libplacebo @ 000001bc28406f40]     #6  0x7ff62f46172d (C:\Users\Username\Desktop\FFMpeg_Test\ffmpeg.exe+0x13172d) (0x14013172d)
[libplacebo @ 000001bc28406f40]     #7  0x7fffc73e83cc in pl_hdr_metadata_from_dovi_rpu+0x112c (C:\Users\Username\Desktop\FFMpeg_Test\libplacebo-271.dll+0x583cc) (0x1d61c83cc)
[libplacebo @ 000001bc28406f40]     #8  0x7fffc73ea423 in pl_queue_update+0xdb3 (C:\Users\Username\Desktop\FFMpeg_Test\libplacebo-271.dll+0x5a423) (0x1d61ca423)
[libplacebo @ 000001bc28406f40]     #9  0x7ff62f46307b (C:\Users\Username\Desktop\FFMpeg_Test\ffmpeg.exe+0x13307b) (0x14013307b)
[libplacebo @ 000001bc28406f40]     #10 0x7ff62f38a0f9 (C:\Users\Username\Desktop\FFMpeg_Test\ffmpeg.exe+0x5a0f9) (0x14005a0f9)
[libplacebo @ 000001bc28406f40]     #11 0x7ff62f38dbcd (C:\Users\Username\Desktop\FFMpeg_Test\ffmpeg.exe+0x5dbcd) (0x14005dbcd)
[libplacebo @ 000001bc28406f40]     #12 0x7ff62f38d0a6 (C:\Users\Username\Desktop\FFMpeg_Test\ffmpeg.exe+0x5d0a6) (0x14005d0a6)
[libplacebo @ 000001bc28406f40]     #13 0x7ff62f33f8ea (C:\Users\Username\Desktop\FFMpeg_Test\ffmpeg.exe+0xf8ea) (0x14000f8ea)
[libplacebo @ 000001bc28406f40]     #14 0x7ff6302c7307 (C:\Users\Username\Desktop\FFMpeg_Test\ffmpeg.exe+0xf97307) (0x140f97307)
[libplacebo @ 000001bc28406f40]     #15 0x7ff62f3312ed (C:\Users\Username\Desktop\FFMpeg_Test\ffmpeg.exe+0x12ed) (0x1400012ed)
[libplacebo @ 000001bc28406f40]     #16 0x7ff62f331405 (C:\Users\Username\Desktop\FFMpeg_Test\ffmpeg.exe+0x1405) (0x140001405)
[libplacebo @ 000001bc28406f40]     #17 0x7ff8108e7613 in BaseThreadInitThunk+0x13 (C:\Windows\System32\KERNEL32.DLL+0x17613) (0x180017613)
[libplacebo @ 000001bc28406f40]     #18 0x7ff811a426a0 in RtlUserThreadStart+0x20 (C:\Windows\SYSTEM32\ntdll.dll+0x526a0) (0x1800526a0)
[libplacebo @ 000001bc28406f40]   for malloc: ../src/utils/upload.c:245
[libplacebo @ 000001bc28406f40] No slab to serve request for   14M bytes (with alignment 0x400) in pool 0!
[libplacebo @ 000001bc28406f40] Failed initializing plane texture!
[libplacebo @ 000001bc28406f40] Failed mapping frame id 4175 with PTS 174.132004
Error while filtering: Generic error in an external library
Last message repeated 1 times
Failed to inject frame into filter network: Generic error in an external library
Error while processing the decoded data for stream #0:0

So, as far as I am able to determine, the conclusion made in https://github.com/haasn/libplacebo/issues/170 still stands: the most recent versions of libplacebo and FFmpeg that work correctly together are libplacebo cedacbfbc96c2dbc9ccba0cda8b2392d618d1fc0 and FFmpeg 2dd9b4071cebd25af2820f26c5bc0a92ef0145fe.

FarisR99 commented 1 year ago

A note on the missing dynamicRendering issue: I don't know if this is still the case, but a week ago I built the latest libplacebo with the latest ffmpeg from Cyanreg's vulkan branch and it worked (no dynamicRendering error), unlike the latest ffmpeg master. I'm not sure whether it would lead to memory leaks after some time, as I didn't run it for longer than a minute. However, the performance of shaders with libplacebo was around 3/4 of the FPS with libplacebo on the latest ffmpeg master.

I changed a few things at once and might have missed something, so I don't know whether Cyanreg's vulkan branch is actually what led to it working.

Chipcraft commented 1 year ago

Thanks for the info. cyanreg's vulkan branch does indeed work, because it supports the required features of Vulkan 1.3.

Based on that, I butchered together a patch from the following commits of cyanreg's vulkan repo:

cyanreg/FFmpeg@61f6700fc937fda206b88208cd463154c810c13b
cyanreg/FFmpeg@fa610949588d1ac95257b13b8e18a838aebbaf24
cyanreg/FFmpeg@1fb2f06c732101f9c8f88189359c40cff3a166b4
cyanreg/FFmpeg@1f7422ebf37315a0a0a6e3ffa2e657406045072b
cyanreg/FFmpeg@b87c607cdb340049a210913d1a2cbb3fb327d717
cyanreg/FFmpeg@62121e5fb7624d8268666db7a063bad2f5e23837
cyanreg/FFmpeg@684ce8c259dba4242a21e5fdbb0d472515810231

With this applied on the current FFmpeg master, everything APPEARS to work fine when used together with the current libplacebo master 04b4e919ea494329a1133e1efaa08878e5bd2406. The performance in tone mapping (hable) at least is identical to "the last known good config" (i.e., FFmpeg 2dd9b40 + libplacebo cedacbfbc96c2dbc9ccba0cda8b2392d618d1fc0), as is the VRAM usage (~ 2.7GB for 2160p), meaning there is no memory leak.

Obviously this is not a real solution; however, it seems to work for now.

diff --git a/configure b/configure
index 87f7afc2e1..201efd4f94 100755
--- a/configure
+++ b/configure
@@ -7040,8 +7040,8 @@ enabled crystalhd && check_lib crystalhd "stdint.h libcrystalhd/libcrystalhd_if.
          "in maintaining it."

 if enabled vulkan; then
-    check_pkg_config_header_only vulkan "vulkan >= 1.2.189" "vulkan/vulkan.h" "defined VK_VERSION_1_2" ||
-        check_cpp_condition vulkan "vulkan/vulkan.h" "defined(VK_VERSION_1_3) || (defined(VK_VERSION_1_2) && VK_HEADER_VERSION >= 189)"
+    check_pkg_config_header_only vulkan "vulkan >= 1.3.238" "vulkan/vulkan.h" "defined VK_VERSION_1_3" ||
+        check_cpp_condition vulkan "vulkan/vulkan.h" "defined(VK_VERSION_1_4) || (defined(VK_VERSION_1_3) && VK_HEADER_VERSION >= 238)"
 fi

 if enabled x86; then
diff --git a/libavfilter/vf_libplacebo.c b/libavfilter/vf_libplacebo.c
index adebcc4541..2508f36cac 100644
--- a/libavfilter/vf_libplacebo.c
+++ b/libavfilter/vf_libplacebo.c
@@ -574,7 +574,7 @@ static int init_vulkan(AVFilterContext *avctx, const AVVulkanDeviceContext *hwct
                 .count = hwctx->nb_tx_queues,
             },
             /* This is the highest version created by hwcontext_vulkan.c */
-            .max_api_version = VK_API_VERSION_1_2,
+            .max_api_version = VK_API_VERSION_1_3,
         ));
     } else {
         s->vulkan = pl_vulkan_create(s->log, pl_vulkan_params(
diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index ffd4f5dec4..65e1d9eb10 100644
--- a/libavutil/hwcontext_vulkan.c
+++ b/libavutil/hwcontext_vulkan.c
@@ -89,6 +89,8 @@ typedef struct VulkanDevicePriv {
     /* Features */
     VkPhysicalDeviceVulkan11Features device_features_1_1;
     VkPhysicalDeviceVulkan12Features device_features_1_2;
+    VkPhysicalDeviceVulkan13Features device_features_1_3;
+    VkPhysicalDeviceDescriptorBufferFeaturesEXT desc_buf_features;

     /* Queues */
     uint32_t qfs[5];
@@ -339,14 +341,16 @@ typedef struct VulkanOptExtension {
 } VulkanOptExtension;

 static const VulkanOptExtension optional_instance_exts[] = {
-    /* For future use */
+    /* Pointless, here avoid zero-sized structs */
+    { VK_KHR_PORTABILITY_ENUMERATION_EXTENSION_NAME,          FF_VK_EXT_NO_FLAG                },
 };

 static const VulkanOptExtension optional_device_exts[] = {
     /* Misc or required by other extensions */
+    { VK_KHR_PORTABILITY_SUBSET_EXTENSION_NAME,               FF_VK_EXT_NO_FLAG                },
     { VK_KHR_PUSH_DESCRIPTOR_EXTENSION_NAME,                  FF_VK_EXT_NO_FLAG                },
     { VK_KHR_SAMPLER_YCBCR_CONVERSION_EXTENSION_NAME,         FF_VK_EXT_NO_FLAG                },
-    { VK_KHR_SYNCHRONIZATION_2_EXTENSION_NAME,                FF_VK_EXT_NO_FLAG                },
+    { VK_EXT_DESCRIPTOR_BUFFER_EXTENSION_NAME,                FF_VK_EXT_DESCRIPTOR_BUFFER,     },

     /* Imports/exports */
     { VK_KHR_EXTERNAL_MEMORY_FD_EXTENSION_NAME,               FF_VK_EXT_EXTERNAL_FD_MEMORY     },
@@ -673,7 +677,7 @@ static int create_instance(AVHWDeviceContext *ctx, AVDictionary *opts)
     VkApplicationInfo application_info = {
         .sType              = VK_STRUCTURE_TYPE_APPLICATION_INFO,
         .pEngineName        = "libavutil",
-        .apiVersion         = VK_API_VERSION_1_2,
+        .apiVersion         = VK_API_VERSION_1_3,
         .engineVersion      = VK_MAKE_VERSION(LIBAVUTIL_VERSION_MAJOR,
                                               LIBAVUTIL_VERSION_MINOR,
                                               LIBAVUTIL_VERSION_MICRO),
@@ -1326,9 +1330,17 @@ static int vulkan_device_create_internal(AVHWDeviceContext *ctx,
     VkPhysicalDeviceTimelineSemaphoreFeatures timeline_features = {
         .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_TIMELINE_SEMAPHORE_FEATURES,
     };
+    VkPhysicalDeviceDescriptorBufferFeaturesEXT desc_buf_features = {
+        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DESCRIPTOR_BUFFER_FEATURES_EXT,
+        .pNext = &timeline_features,
+    };
+    VkPhysicalDeviceVulkan13Features dev_features_1_3 = {
+        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VULKAN_1_3_FEATURES,
+        .pNext = &desc_buf_features,
+    };
     VkPhysicalDeviceVulkan12Features dev_features_1_2 = {
         .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VULKAN_1_2_FEATURES,
-        .pNext = &timeline_features,
+        .pNext = &dev_features_1_3,
     };
     VkPhysicalDeviceVulkan11Features dev_features_1_1 = {
         .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VULKAN_1_1_FEATURES,
@@ -1341,14 +1353,17 @@ static int vulkan_device_create_internal(AVHWDeviceContext *ctx,

     VkDeviceCreateInfo dev_info = {
         .sType                = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO,
-        .pNext                = &hwctx->device_features,
-    };
+    };

     hwctx->device_features.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2;
     hwctx->device_features.pNext = &p->device_features_1_1;
     p->device_features_1_1.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VULKAN_1_1_FEATURES;
     p->device_features_1_1.pNext = &p->device_features_1_2;
     p->device_features_1_2.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VULKAN_1_2_FEATURES;
+    p->device_features_1_2.pNext = &p->device_features_1_3;
+    p->device_features_1_3.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VULKAN_1_3_FEATURES;
+    p->device_features_1_3.pNext = &p->desc_buf_features;
+    p->desc_buf_features.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DESCRIPTOR_BUFFER_FEATURES_EXT;
     ctx->free = vulkan_device_free;

     /* Create an instance if not given one */
@@ -1369,6 +1384,8 @@ static int vulkan_device_create_internal(AVHWDeviceContext *ctx,
     COPY_FEATURE(hwctx->device_features, fragmentStoresAndAtomics)
     COPY_FEATURE(hwctx->device_features, vertexPipelineStoresAndAtomics)
     COPY_FEATURE(hwctx->device_features, shaderInt64)
+    COPY_FEATURE(hwctx->device_features, shaderInt16)
+    COPY_FEATURE(hwctx->device_features, shaderFloat64)
 #undef COPY_FEATURE

     /* We require timeline semaphores */
@@ -1377,7 +1394,35 @@ static int vulkan_device_create_internal(AVHWDeviceContext *ctx,
         err = AVERROR(ENOSYS);
         goto end;
     }
+
+    p->device_features_1_1.samplerYcbcrConversion = dev_features_1_1.samplerYcbcrConversion;
+    p->device_features_1_1.storagePushConstant16 = dev_features_1_1.storagePushConstant16;
+
     p->device_features_1_2.timelineSemaphore = 1;
+    p->device_features_1_2.bufferDeviceAddress = dev_features_1_2.bufferDeviceAddress;
+    p->device_features_1_2.hostQueryReset = dev_features_1_2.hostQueryReset;
+    p->device_features_1_2.storagePushConstant8 = dev_features_1_2.storagePushConstant8;
+    p->device_features_1_2.shaderInt8 = dev_features_1_2.shaderInt8;
+    p->device_features_1_2.storageBuffer8BitAccess = dev_features_1_2.storageBuffer8BitAccess;
+    p->device_features_1_2.uniformAndStorageBuffer8BitAccess = dev_features_1_2.uniformAndStorageBuffer8BitAccess;
+    p->device_features_1_2.shaderFloat16 = dev_features_1_2.shaderFloat16;
+    p->device_features_1_2.shaderSharedInt64Atomics = dev_features_1_2.shaderSharedInt64Atomics;
+    p->device_features_1_2.vulkanMemoryModel = dev_features_1_2.vulkanMemoryModel;
+    p->device_features_1_2.vulkanMemoryModelDeviceScope = dev_features_1_2.vulkanMemoryModelDeviceScope;
+    p->device_features_1_2.hostQueryReset = dev_features_1_2.hostQueryReset;
+
+    p->device_features_1_3.dynamicRendering = dev_features_1_3.dynamicRendering;
+    p->device_features_1_3.maintenance4 = dev_features_1_3.maintenance4;
+    p->device_features_1_3.synchronization2 = dev_features_1_3.synchronization2;
+    p->device_features_1_3.computeFullSubgroups = dev_features_1_3.computeFullSubgroups;
+    p->device_features_1_3.shaderZeroInitializeWorkgroupMemory = dev_features_1_3.shaderZeroInitializeWorkgroupMemory;
+    p->device_features_1_3.dynamicRendering = dev_features_1_3.dynamicRendering;
+
+    p->desc_buf_features.descriptorBuffer = desc_buf_features.descriptorBuffer;
+    p->desc_buf_features.descriptorBufferPushDescriptors = desc_buf_features.descriptorBufferPushDescriptors;
+
+
+    dev_info.pNext = &hwctx->device_features;

     /* Setup queue family */
     if ((err = setup_queue_families(ctx, &dev_info)))
diff --git a/libavutil/hwcontext_vulkan.h b/libavutil/hwcontext_vulkan.h
index df86c85b3c..70c8379dc3 100644
--- a/libavutil/hwcontext_vulkan.h
+++ b/libavutil/hwcontext_vulkan.h
@@ -53,7 +53,7 @@ typedef struct AVVulkanDeviceContext {
     PFN_vkGetInstanceProcAddr get_proc_addr;

     /**
-     * Vulkan instance. Must be at least version 1.2.
+     * Vulkan instance. Must be at least version 1.3.
      */
     VkInstance inst;

diff --git a/libavutil/vulkan_functions.h b/libavutil/vulkan_functions.h
index d15a5d9a42..9e7a2ddd2b 100644
--- a/libavutil/vulkan_functions.h
+++ b/libavutil/vulkan_functions.h
@@ -37,6 +37,7 @@ typedef enum FFVulkanExtensions {
     FF_VK_EXT_EXTERNAL_WIN32_MEMORY  = 1ULL <<  6, /* VK_KHR_external_memory_win32 */
     FF_VK_EXT_EXTERNAL_WIN32_SEM     = 1ULL <<  7, /* VK_KHR_external_semaphore_win32 */
 #endif
+    FF_VK_EXT_DESCRIPTOR_BUFFER      = 1ULL <<  8, /* VK_EXT_descriptor_buffer */

     FF_VK_EXT_NO_FLAG                = 1ULL << 31,
 } FFVulkanExtensions;
@@ -120,6 +121,7 @@ typedef enum FFVulkanExtensions {
     MACRO(1, 1, FF_VK_EXT_NO_FLAG,              GetBufferMemoryRequirements2)            \
     MACRO(1, 1, FF_VK_EXT_NO_FLAG,              CreateBuffer)                            \
     MACRO(1, 1, FF_VK_EXT_NO_FLAG,              BindBufferMemory)                        \
+    MACRO(1, 1, FF_VK_EXT_NO_FLAG,              GetBufferDeviceAddress)                  \
     MACRO(1, 1, FF_VK_EXT_NO_FLAG,              DestroyBuffer)                           \
                                                                                          \
     /* Image */                                                                          \
@@ -140,11 +142,21 @@ typedef enum FFVulkanExtensions {
     MACRO(1, 1, FF_VK_EXT_NO_FLAG,              CreateDescriptorPool)                    \
     MACRO(1, 1, FF_VK_EXT_NO_FLAG,              DestroyDescriptorPool)                   \
     MACRO(1, 1, FF_VK_EXT_NO_FLAG,              DestroyDescriptorSetLayout)              \
+                                                                                         \
+    /* Descriptor buffers */                                                             \
+    MACRO(1, 1, FF_VK_EXT_DESCRIPTOR_BUFFER,    GetDescriptorSetLayoutSizeEXT)           \
+    MACRO(1, 1, FF_VK_EXT_DESCRIPTOR_BUFFER,    GetDescriptorSetLayoutBindingOffsetEXT)  \
+    MACRO(1, 1, FF_VK_EXT_DESCRIPTOR_BUFFER,    GetDescriptorEXT)                        \
+    MACRO(1, 1, FF_VK_EXT_DESCRIPTOR_BUFFER,    CmdBindDescriptorBuffersEXT)             \
+    MACRO(1, 1, FF_VK_EXT_DESCRIPTOR_BUFFER,    CmdSetDescriptorBufferOffsetsEXT)        \
                                                                                          \
     /* DescriptorUpdateTemplate */                                                       \
     MACRO(1, 1, FF_VK_EXT_NO_FLAG,              UpdateDescriptorSetWithTemplate)         \
     MACRO(1, 1, FF_VK_EXT_NO_FLAG,              CreateDescriptorUpdateTemplate)          \
     MACRO(1, 1, FF_VK_EXT_NO_FLAG,              DestroyDescriptorUpdateTemplate)         \
+                                                                                         \
+    /* sync2 */                                                                          \
+    MACRO(1, 1, FF_VK_EXT_NO_FLAG,              CmdPipelineBarrier2)                     \
                                                                                          \
     /* Pipeline */                                                                       \
     MACRO(1, 1, FF_VK_EXT_NO_FLAG,              CreatePipelineLayout)                    \
@@ -155,6 +167,8 @@ typedef enum FFVulkanExtensions {
     MACRO(1, 1, FF_VK_EXT_NO_FLAG,              DestroyPipeline)                         \
                                                                                          \
     /* Sampler */                                                                        \
+    MACRO(1, 1, FF_VK_EXT_NO_FLAG,              CreateSamplerYcbcrConversion)            \
+    MACRO(1, 1, FF_VK_EXT_NO_FLAG,              DestroySamplerYcbcrConversion)           \
     MACRO(1, 1, FF_VK_EXT_NO_FLAG,              CreateSampler)                           \
     MACRO(1, 1, FF_VK_EXT_NO_FLAG,              DestroySampler)                          \
                                                                                          \
diff --git a/libavutil/vulkan_loader.h b/libavutil/vulkan_loader.h
index 3f1ee6aa46..2a1d42cddd 100644
--- a/libavutil/vulkan_loader.h
+++ b/libavutil/vulkan_loader.h
@@ -48,6 +48,7 @@ static inline uint64_t ff_vk_extensions_to_mask(const char * const *extensions,
         { VK_KHR_EXTERNAL_MEMORY_WIN32_EXTENSION_NAME,     FF_VK_EXT_EXTERNAL_WIN32_MEMORY  },
         { VK_KHR_EXTERNAL_SEMAPHORE_WIN32_EXTENSION_NAME,  FF_VK_EXT_EXTERNAL_WIN32_SEM     },
 #endif
+        { VK_EXT_DESCRIPTOR_BUFFER_EXTENSION_NAME,         FF_VK_EXT_DESCRIPTOR_BUFFER,     },
     };

     FFVulkanExtensions mask = 0x0;
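
For readers skimming the loader hunk: the new table entry only matters because the enabled extension names get folded into a bitmask, which the function list then checks per entry. Below is a minimal sketch of that folding step, not FFmpeg's actual code. The `SKETCH_`/`sketch_` names are hypothetical and the bit positions are copied from the enum in the patch; only the extension name strings are the real Vulkan ones.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag values, chosen to mirror the bits used in the patch. */
#define SKETCH_EXT_WIN32_MEMORY      (UINT64_C(1) << 6)
#define SKETCH_EXT_WIN32_SEM         (UINT64_C(1) << 7)
#define SKETCH_EXT_DESCRIPTOR_BUFFER (UINT64_C(1) << 8)

typedef struct SketchExtEntry {
    const char *name; /* Vulkan extension name string */
    uint64_t    flag; /* bit that represents it in the mask */
} SketchExtEntry;

static const SketchExtEntry sketch_ext_table[] = {
    { "VK_KHR_external_memory_win32",    SKETCH_EXT_WIN32_MEMORY      },
    { "VK_KHR_external_semaphore_win32", SKETCH_EXT_WIN32_SEM         },
    { "VK_EXT_descriptor_buffer",        SKETCH_EXT_DESCRIPTOR_BUFFER },
};

/* OR together the flags of every enabled extension found in the table;
 * names that are not in the table are simply ignored. */
static uint64_t sketch_extensions_to_mask(const char *const *exts, size_t nb_exts)
{
    const size_t nb_entries = sizeof(sketch_ext_table) / sizeof(sketch_ext_table[0]);
    uint64_t mask = 0;

    for (size_t i = 0; i < nb_exts; i++)
        for (size_t j = 0; j < nb_entries; j++)
            if (!strcmp(exts[i], sketch_ext_table[j].name))
                mask |= sketch_ext_table[j].flag;

    return mask;
}
```

With a table like this, a device that does not enable VK_EXT_descriptor_buffer never gets the descriptor-buffer bit set, so any function tagged with that flag can be skipped at load time.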
Chipcraft commented 1 year ago

Did some testing, and apparently at least Intel Gen. 11 (Jasper Lake GT-1 32EU, with IGD 31.0.101.2121 WHQL) craps the bed driver-wise, since all of the functions fail. Not a real surprise, I assume, considering the general state of the drivers. Unlike AMD and NVIDIA, Intel does not support the Vulkan 1.1 "samplerYcbcrConversion" feature or the Vulkan 1.2 "shaderSharedInt64Atomics" feature; however, I wouldn't expect that to be the reason.

The libplacebo tests pass without issues as long as the timeout is increased or disabled (meson test --timeout-multiplier=0, or some larger multiplier), since this is a SLOW system.

[libplacebo @ 000002234ad59bc0] Validation failed: !params->renderable || fmt_caps & PL_FMT_CAP_RENDERABLE (../src/gpu.c:234)
[libplacebo @ 000002234ad59bc0]   Backtrace:
[libplacebo @ 000002234ad59bc0]     #0  0x7ffdf364b1bd in pl_tex_create+0x3cd (C:\Users\Username\Desktop\FFTest\libplacebo-278.dll+0x2b1bd) (0x1e226b1bd)
[libplacebo @ 000002234ad59bc0]     #1  0x7ffdf364b8c0 in pl_tex_recreate+0x200 (C:\Users\Username\Desktop\FFTest\libplacebo-278.dll+0x2b8c0) (0x1e226b8c0)
[libplacebo @ 000002234ad59bc0]     #2  0x7ffdf368b151 in pl_recreate_plane+0xa1 (C:\Users\Username\Desktop\FFTest\libplacebo-278.dll+0x6b151) (0x1e22ab151)
[libplacebo @ 000002234ad59bc0]     #3  0x7ff6508b5aa4 (C:\Users\Username\Desktop\FFTest\ffmpeg.exe+0x135aa4) (0x140135aa4)
[libplacebo @ 000002234ad59bc0]     #4  0x7ff6508b5e5a (C:\Users\Username\Desktop\FFTest\ffmpeg.exe+0x135e5a) (0x140135e5a)
[libplacebo @ 000002234ad59bc0]     #5  0x7ff6507da6a9 (C:\Users\Username\Desktop\FFTest\ffmpeg.exe+0x5a6a9) (0x14005a6a9)
[libplacebo @ 000002234ad59bc0]     #6  0x7ff6507decdf (C:\Users\Username\Desktop\FFTest\ffmpeg.exe+0x5ecdf) (0x14005ecdf)
[libplacebo @ 000002234ad59bc0]     #7  0x7ff650791317 (C:\Users\Username\Desktop\FFTest\ffmpeg.exe+0x11317) (0x140011317)
[libplacebo @ 000002234ad59bc0]     #8  0x7ff650781bbf (C:\Users\Username\Desktop\FFTest\ffmpeg.exe+0x1bbf) (0x140001bbf)
[libplacebo @ 000002234ad59bc0]     #9  0x7ff65171e5ab (C:\Users\Username\Desktop\FFTest\ffmpeg.exe+0xf9e5ab) (0x140f9e5ab)
[libplacebo @ 000002234ad59bc0]     #10 0x7ff6507812ed (C:\Users\Username\Desktop\FFTest\ffmpeg.exe+0x12ed) (0x1400012ed)
[libplacebo @ 000002234ad59bc0]     #11 0x7ff650781405 (C:\Users\Username\Desktop\FFTest\ffmpeg.exe+0x1405) (0x140001405)
[libplacebo @ 000002234ad59bc0]     #12 0x7ffe1b707613 in BaseThreadInitThunk+0x13 (C:\Windows\System32\KERNEL32.DLL+0x17613) (0x180017613)
[libplacebo @ 000002234ad59bc0]     #13 0x7ffe1cd626a0 in RtlUserThreadStart+0x20 (C:\Windows\SYSTEM32\ntdll.dll+0x526a0) (0x1800526a0)
[libplacebo @ 000002234ad59bc0]   for texture: ../src/utils/upload.c:356
[libplacebo @ 000002234ad59bc0] Failed initializing plane texture!
Error while filtering: Generic error in an external library
Failed to inject frame into filter network: Generic error in an external library
Conversion failed!
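
For anyone reading the log: the failed validation is an implication, "if the texture is requested as renderable, the chosen format must have the renderable capability". A minimal sketch of that logic follows; it is not libplacebo's code. The `SKETCH_`/`sketch_` names and bit values are hypothetical; the real `PL_FMT_CAP_*` flags come from libplacebo's headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bits, for illustration only. */
#define SKETCH_FMT_CAP_SAMPLEABLE (1u << 0)
#define SKETCH_FMT_CAP_RENDERABLE (1u << 1)

/* Same shape as the check in the log:
 *   !params->renderable || fmt_caps & PL_FMT_CAP_RENDERABLE
 * A non-renderable request always passes; a renderable request passes
 * only if the format advertises the renderable capability. */
static bool sketch_tex_params_valid(bool want_renderable, uint32_t fmt_caps)
{
    return !want_renderable || (fmt_caps & SKETCH_FMT_CAP_RENDERABLE) != 0;
}
```

So the log would be consistent with a plane texture being requested as renderable in a format for which the driver does not report that capability, though that is only my reading of the message.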

Meanwhile, the following GPUs do work:

Chipcraft commented 1 year ago

Solved by FFMpeg master baa9fccf8d72be259024d7e0eb919c909714c6a7 & libplacebo e68461922179f33255e30068fa8e5fca622861a3.

Will create a separate issue for the Intel GPU issue.