Closed philipl closed 2 months ago
I spotted this issue (or a similar one?) earlier while testing. The move to a single buffer export seems to have interacted badly with the GOB calculation code (which ran per-plane). mpv/ffmpeg does some small-scale tests (128x128) to make sure it all works, and those fail due to an invalid log2GobsPerBlockY value.
I really don't understand the log2GobsPerBlockY value, or how to compute it correctly, or whether it's even compatible with single buffer export. I've pushed a change to master that hard-codes it to 4, which works for most things except videos with ~64px high planes.
However, I've never seen it fail to decode, so try master and see if that fixes it. If not, I'll have to dig into it a bit more.
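For reference, a minimal sketch of the kind of per-plane heuristic being discussed. This is not the driver's code. It assumes a GOB is 8 rows tall and that a block is `1 << log2GobsPerBlockY` GOBs tall, and it simply shrinks the block while half of it would still cover the plane. The function name and the `max_log2` parameter are hypothetical.

```c
#include <stdint.h>

/* Assumption: one GOB covers 8 rows of the surface. */
#define GOB_HEIGHT_ROWS 8u

/*
 * Hypothetical helper, not part of the driver: start from max_log2 and
 * halve the block height while half a block would still cover the plane.
 */
static uint32_t pick_log2_gobs_per_block_y(uint32_t plane_height, uint32_t max_log2)
{
    uint32_t log2_gobs = max_log2;
    while (log2_gobs > 0 &&
           (GOB_HEIGHT_ROWS << (log2_gobs - 1)) >= plane_height) {
        log2_gobs--;
    }
    return log2_gobs;
}
```

With a cap of 4, a 1072-row plane keeps 4 (128-row blocks), while a 64-row plane drops to 3 (64-row blocks), which lines up with the ~64px case mentioned above. Whether the hardware or the import path accepts a different value per plane in a single exported buffer is exactly the open question.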
Yeah, your fix does stop the failures to import but the failures to decode are still there, sorry to say.
$ NVD_LOG=1 mpv --profile=nvidia-vaapi juddertest_60.mp4 --pause
[auto_hwdec] Applying profile: nvidia
(+) Video --vid=1 (*) (h264 1920x1072 60.000fps)
[vaapi] libva: vaGetDriverNames() failed with unknown libva error
34146.562078372 [69960-69977] ../src/vabackend.c:2187 __vaDriverInit_1_0 Initialising NVIDIA VA-API Driver: 10
34146.562083703 [69960-69977] ../src/vabackend.c:2196 __vaDriverInit_1_0 Now have 0 (0 max) instances
34146.562084788 [69960-69977] ../src/vabackend.c:2222 __vaDriverInit_1_0 Selecting Direct backend
34146.562089785 [69960-69977] ../src/direct/direct-export-buf.c: 68 direct_initExporter Searching for GPU: 0 0 128
34146.562133430 [69960-69977] ../src/backend-common.c: 31 isNvidiaDrmFd Invalid driver for DRM device: i915
34146.562137743 [69960-69977] ../src/direct/direct-export-buf.c: 68 direct_initExporter Searching for GPU: 0 0 129
34146.562141055 [69960-69977] ../src/direct/direct-export-buf.c: 90 direct_initExporter Found NVIDIA GPU 0 at /dev/dri/renderD129
34146.562142610 [69960-69977] ../src/direct/nv-driver.c: 267 init_nvdriver Initing nvdriver...
34146.562149838 [69960-69977] ../src/direct/nv-driver.c: 285 init_nvdriver NVIDIA kernel driver version: 550.78, major version: 550, minor version: 78
34146.562151802 [69960-69977] ../src/direct/nv-driver.c: 292 init_nvdriver Got dev info: 100 1 2 6
34146.607554351 [69960-69977] ../src/vabackend.c:1445 nvQueryImageFormats In nvQueryImageFormats
34146.623883194 [69960-69977] ../src/vabackend.c: 674 nvCreateConfig got profile: 0 with 0 attributes
34146.623894495 [69960-69977] ../src/vabackend.c:1801 nvQuerySurfaceAttributes with 1 (8) (nil) 0
34146.623897119 [69960-69977] ../src/vabackend.c:1801 nvQuerySurfaceAttributes with 1 (8) 0x79dc611d9340 8
34146.624602628 [69960-69977] ../src/vabackend.c:1868 nvQuerySurfaceAttributes Returning constraints: width: 48 - 4080, height: 16 - 4080
34146.624616304 [69960-69977] ../src/vabackend.c: 957 nvCreateSurfaces2 Creating surface 128x128, format 1 (0x79dc611d9540)
34146.624620397 [69960-69977] ../src/vabackend.c:1539 nvDeriveImage In nvDeriveImage
34146.624624720 [69960-69977] ../src/vabackend.c: 957 nvCreateSurfaces2 Creating surface 128x128, format 1 (0x79dc6152f4c0)
34146.624626340 [69960-69977] ../src/vabackend.c:1539 nvDeriveImage In nvDeriveImage
34146.624648872 [69960-69977] ../src/direct/direct-export-buf.c: 151 direct_allocateBackingImage Allocating BackingImage: 0x79dc6152ff20 128x128 = 32768 bytes
34146.624716249 [69960-69977] ../src/direct/direct-export-buf.c: 160 direct_allocateBackingImage Allocate Buffer: 64 65 66
34146.624717709 [69960-69977] ../src/direct/direct-export-buf.c: 170 direct_allocateBackingImage Importing memory to CUDA
34146.625088507 [69960-69977] ../src/vabackend.c: 991 nvDestroySurfaces Destroying surface -1 (0x79dc6152f4c0)
34146.625173013 [69960-69977] ../src/vabackend.c: 957 nvCreateSurfaces2 Creating surface 128x128, format 100 (0x79dc6152f4c0)
34146.625178440 [69960-69977] ../src/vabackend.c:1539 nvDeriveImage In nvDeriveImage
34146.625187926 [69960-69977] ../src/direct/direct-export-buf.c: 151 direct_allocateBackingImage Allocating BackingImage: 0x79dc61531620 128x128 = 65536 bytes
34146.625253794 [69960-69977] ../src/direct/direct-export-buf.c: 160 direct_allocateBackingImage Allocate Buffer: 64 65 66
34146.625255280 [69960-69977] ../src/direct/direct-export-buf.c: 170 direct_allocateBackingImage Importing memory to CUDA
34146.625541598 [69960-69977] ../src/vabackend.c: 991 nvDestroySurfaces Destroying surface -1 (0x79dc6152f4c0)
34146.625609942 [69960-69977] ../src/vabackend.c: 957 nvCreateSurfaces2 Creating surface 128x128, format 1000 (0x79dc615332e0)
34146.625613789 [69960-69977] ../src/vabackend.c:1539 nvDeriveImage In nvDeriveImage
34146.625621802 [69960-69977] ../src/direct/direct-export-buf.c: 151 direct_allocateBackingImage Allocating BackingImage: 0x79dc6152f4e0 128x128 = 65536 bytes
34146.625676212 [69960-69977] ../src/direct/direct-export-buf.c: 160 direct_allocateBackingImage Allocate Buffer: 64 65 66
34146.625678070 [69960-69977] ../src/direct/direct-export-buf.c: 170 direct_allocateBackingImage Importing memory to CUDA
34146.625892240 [69960-69977] ../src/vabackend.c: 991 nvDestroySurfaces Destroying surface -1 (0x79dc615332e0)
34146.625948227 [69960-69977] ../src/vabackend.c: 957 nvCreateSurfaces2 Creating surface 128x128, format 4 (0x79dc6152f600)
34146.625951004 [69960-69977] ../src/vabackend.c:1539 nvDeriveImage In nvDeriveImage
34146.625957259 [69960-69977] ../src/direct/direct-export-buf.c: 151 direct_allocateBackingImage Allocating BackingImage: 0x79dc61532f20 128x128 = 49152 bytes
34146.626004504 [69960-69977] ../src/direct/direct-export-buf.c: 160 direct_allocateBackingImage Allocate Buffer: 64 65 66
34146.626006082 [69960-69977] ../src/direct/direct-export-buf.c: 170 direct_allocateBackingImage Importing memory to CUDA
34146.626307565 [69960-69977] ../src/vabackend.c: 991 nvDestroySurfaces Destroying surface -1 (0x79dc6152f600)
34146.626380607 [69960-69977] ../src/vabackend.c: 991 nvDestroySurfaces Destroying surface -1 (0x79dc611d9540)
34146.626383855 [69960-69977] ../src/vabackend.c: 674 nvCreateConfig got profile: -1 with 0 attributes
34146.626386549 [69960-69977] ../src/vabackend.c: 679 nvCreateConfig Profile not supported: -1
[auto_hwdec] Applying profile: hwdec.vaapi
34146.644517713 [69960-69960] ../src/vabackend.c: 674 nvCreateConfig got profile: 7 with 0 attributes
34146.644528164 [69960-69960] ../src/vabackend.c:1801 nvQuerySurfaceAttributes with 4 (8) (nil) 0
34146.644529893 [69960-69960] ../src/vabackend.c:1801 nvQuerySurfaceAttributes with 4 (8) 0x6266aa5e1b40 8
34146.645337738 [69960-69960] ../src/vabackend.c:1868 nvQuerySurfaceAttributes Returning constraints: width: 48 - 4096, height: 16 - 4096
34146.645348718 [69960-69960] ../src/vabackend.c:1801 nvQuerySurfaceAttributes with 4 (8) (nil) -1
34146.645350331 [69960-69960] ../src/vabackend.c:1801 nvQuerySurfaceAttributes with 4 (8) 0x6266aa5e1b40 8
34146.647208287 [69960-69960] ../src/vabackend.c:1868 nvQuerySurfaceAttributes Returning constraints: width: 48 - 4096, height: 16 - 4096
34146.647224562 [69960-69960] ../src/vabackend.c: 957 nvCreateSurfaces2 Creating surface 1920x1072, format 1 (0x6266aa4e6700)
34146.647227162 [69960-69960] ../src/vabackend.c:1539 nvDeriveImage In nvDeriveImage
34146.663706403 [69960-69960] ../src/vabackend.c: 674 nvCreateConfig got profile: 7 with 0 attributes
34146.663713834 [69960-69960] ../src/vabackend.c:1801 nvQuerySurfaceAttributes with 4 (8) (nil) 0
34146.663715675 [69960-69960] ../src/vabackend.c:1801 nvQuerySurfaceAttributes with 4 (8) 0x6266aa5e1b40 8
34146.664390561 [69960-69960] ../src/vabackend.c:1868 nvQuerySurfaceAttributes Returning constraints: width: 48 - 4096, height: 16 - 4096
34146.664400625 [69960-69960] ../src/vabackend.c:1021 nvCreateContext creating context with 0 render targets, 1 surfaces, at 1920x1072
34146.669852438 [69960-69983] ../src/vabackend.c: 416 resolveSurfaces [RT] Resolve thread for 0x6266aa599f70 started
34146.670421569 [69960-69960] ../src/vabackend.c: 957 nvCreateSurfaces2 Creating surface 1920x1072, format 1 (0x6266aad00de0)
34146.670735411 [69960-69960] ../src/vabackend.c:1344 nvEndPicture cuvidDecodePicture failed: 1
[ffmpeg/video] h264: Failed to end picture decode issue: 23 (internal decoding error).
[ffmpeg/video] h264: hardware accelerator failed to decode picture
Error while decoding frame (hardware decoding)!
34146.670759958 [69960-69960] ../src/vabackend.c: 957 nvCreateSurfaces2 Creating surface 1920x1072, format 1 (0x6266aad03620)
34146.671088295 [69960-69960] ../src/vabackend.c:1344 nvEndPicture cuvidDecodePicture failed: 1
[ffmpeg/video] h264: Failed to end picture decode issue: 23 (internal decoding error).
[ffmpeg/video] h264: hardware accelerator failed to decode picture
Error while decoding frame (hardware decoding)!
34146.671105210 [69960-69960] ../src/vabackend.c: 957 nvCreateSurfaces2 Creating surface 1920x1072, format 1 (0x6266aad03ce0)
34146.671383548 [69960-69960] ../src/vabackend.c:1344 nvEndPicture cuvidDecodePicture failed: 1
[ffmpeg/video] h264: Failed to end picture decode issue: 23 (internal decoding error).
[ffmpeg/video] h264: hardware accelerator failed to decode picture
Error while decoding frame (hardware decoding)!
34146.671413949 [69960-69960] ../src/vabackend.c:1120 nvDestroyContext Destroying context: 10
34146.671415144 [69960-69960] ../src/vabackend.c: 325 destroyContext Signaling resolve thread to exit
34146.671416094 [69960-69960] ../src/vabackend.c: 331 destroyContext Waiting for resolve thread to exit
34146.671628210 [69960-69983] ../src/direct/direct-export-buf.c: 151 direct_allocateBackingImage Allocating BackingImage: 0x79dc10000f50 1920x1072 = 3440640 bytes
34146.671748915 [69960-69983] ../src/direct/direct-export-buf.c: 160 direct_allocateBackingImage Allocate Buffer: 68 69 70
34146.671751103 [69960-69983] ../src/direct/direct-export-buf.c: 170 direct_allocateBackingImage Importing memory to CUDA
34146.671910057 [69960-69983] ../src/vabackend.c: 458 resolveSurfaces [RT] Resolve thread for 0x6266aa599f70 exiting
34146.671925726 [69960-69960] ../src/vabackend.c: 333 destroyContext pthread_timedjoin_np finished with 0
34146.919460597 [69960-69960] ../src/vabackend.c: 991 nvDestroySurfaces Destroying surface 3 (0x6266aad03ce0)
34146.919494576 [69960-69960] ../src/vabackend.c: 991 nvDestroySurfaces Destroying surface 2 (0x6266aad03620)
34146.919502235 [69960-69960] ../src/vabackend.c: 991 nvDestroySurfaces Destroying surface 1 (0x6266aad00de0)
34146.919509923 [69960-69960] ../src/vabackend.c: 991 nvDestroySurfaces Destroying surface 0 (0x6266aa4e6700)
VO: [gpu] 1920x1072 yuv420p
(Paused) V: 00:00:00 / 00:01:04 (0%) Cache: 63s/2MB
I can play files with mpv (v0.36.0) on my machine, both h264 and hevc including 10-bit videos with that NVIDIA driver version, so it's not an inherent problem with the driver.
Any chance you've got some stray mpv config item set?
It's an ffmpeg change. I did a bisect and it came down to:
https://github.com/FFmpeg/FFmpeg/commit/41e3d36a39979a5c6ca36198b03be740e14ef7b0
which hides a lot of implications.
Ok, having looked at the patch and then at the log output, I see the issue. FFmpeg has switched to creating surfaces dynamically, rather than all upfront. This wouldn't actually be a problem if it created 0 surfaces before initialising it, as we'd catch that and allocate a fixed amount, but it does allocate 1, which trips us up.
This is problematic, as NVDEC just doesn't work that way. I guess we're going to have to allocate some minimum amount and hope for the best...? I'd be hesitant to allocate too many, as those 4K HDR frames are already quite large.
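To put a rough number on "quite large", here is a back-of-the-envelope sketch. It is not taken from the driver. It assumes a two-plane 4:2:0 layout, a pitch aligned to 64 bytes, and each plane's height padded to a 128-row block (log2GobsPerBlockY = 4). Under those assumptions it reproduces the sizes printed in the log above for the 128x128 (32768 bytes) and 1920x1072 (3440640 bytes) surfaces, provided those surfaces are 8-bit 4:2:0. All names are hypothetical.

```c
#include <stdint.h>

/* Assumptions: a GOB is 64 bytes wide and 8 rows tall. */
#define GOB_WIDTH_BYTES 64u
#define GOB_HEIGHT_ROWS 8u

/* Round v up to the next multiple of a (a > 0). */
static uint64_t align_up(uint64_t v, uint64_t a)
{
    return (v + a - 1) / a * a;
}

/*
 * Hypothetical estimate of the bytes needed for one two-plane 4:2:0 frame
 * (full-height luma plane, half-height interleaved chroma plane), with each
 * plane padded to a whole number of blocks.
 */
static uint64_t padded_420_bytes(uint32_t width, uint32_t height,
                                 uint32_t bytes_per_sample,
                                 uint32_t log2_gobs_per_block_y)
{
    uint64_t block_rows  = (uint64_t)GOB_HEIGHT_ROWS << log2_gobs_per_block_y;
    uint64_t pitch       = align_up((uint64_t)width * bytes_per_sample,
                                    GOB_WIDTH_BYTES);
    uint64_t luma_rows   = align_up(height, block_rows);
    uint64_t chroma_rows = align_up((height + 1u) / 2u, block_rows);
    return pitch * (luma_rows + chroma_rows);
}
```

Calling `padded_420_bytes(3840, 2160, 2, 4)` for a 4K frame with 16-bit samples gives 25559040 bytes, a little over 24 MiB per surface, so every extra surface in the pool costs about that much video memory.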
We may need to rethink the decode process. Maybe we could allocate a smaller pool of frames and preemptively copy them to the CUDA images we create, but I think we'd just run into the issue we had early on with not knowing which frames are needed when.
Maybe it's time to take a closer look at VDPAU, and see if it can be convinced to let us export a surface as a DMA-BUF.
Is it possible to treat the n=1 case the same as n=0? No real world decode scenario can actually work with just one decode surface, so we know it's asking for the same dynamic behaviour as n=0.
I separately wonder if it's possible to reinitialise the decoder when the number of surfaces changes. I know there is a reconfigure operation in nvdec now but I don't know if it lets you change the number of surfaces. But even if not, can a full reinit be done without killing playback pacing?
As for vdpau, it might be possible to use some of the undocumented kernel ioctls to get the underlying surface resource and work with it like the direct backend does today, but that sounds scary to me...
We could treat n=1 as n=0, it would be the easiest option.
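A minimal sketch of that option, with hypothetical names (neither the function nor the fallback parameter exists in the driver under these names):

```c
/*
 * Hypothetical helper: decide how many decode surfaces to allocate.
 * A request for 0 already means "allocate a fixed pool"; a request for 1
 * is treated the same way, since no real decode can work with a single
 * surface. Any larger request is honoured as-is.
 */
static int effective_surface_count(int requested, int fallback_pool_size)
{
    return (requested <= 1) ? fallback_pool_size : requested;
}
```

The size of the fallback pool is left as a parameter here because the right value is the trade-off discussed above: large enough for real streams, small enough that 4K HDR surfaces don't exhaust memory.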
I did investigate the reconfigure option early on. Unfortunately, it "helpfully" destroys and reallocates all the existing surfaces for you, so it's not usable without visual artefacts.
For VDPAU, 'all' that's needed is a few handles that get allocated to the library and the surface (we might also need the fd that it's using). Once we have those, we could reuse the export code we already have (in theory at least). However, my previous attempts to find where it stores those handles didn't get anywhere. I think we'd need to fully decompile the library rather than just trying to step through it in the debugger.
I don't think it's as scary as you think. A lot of the direct backend's code is based on how VDPAU creates surfaces, so I already know they're very similar.
Hi. I know that mpv isn't actually a real use-case, and accelerated playback in firefox continues to work fine, but I'm seeing that playback no longer works in mpv, and I worry it's a sign of a more serious problem.
It appears to be failing to import dmabufs on the EGL side for the useful formats (NV12, P010, etc) and then it fails to decode (I think because the output surface is incompatible).
Maybe something has regressed on the driver side? It could be in mpv but I can't quickly test old versions here.