mpv-player / mpv

Segfault with --gpu-api=vulkan #12468

Open orbea opened 9 months ago

orbea commented 9 months ago

Important Information

Provide following Information:

Reproduction steps

This causes a segfault.

/usr/bin/mpv --no-config --gpu-api=vulkan test.mkv

There is no segfault if --gpu-api=vulkan is removed.

Expected behavior

Should not segfault.

Actual behavior

Segfault.

Log file

Thread 12 "mpv/vo" received signal SIGSEGV, Segmentation fault.
[Switching to LWP 3519]
0x00005555556a4fb1 in gl_video_init (ra=0xfffffc70e9000000, log=0x376800007e2225ff, 
    g=0xfffffc80e9000000) at ../mpv-9999/video/out/gpu/video.c:4048
4048    {
(gdb) bt
#0  0x00005555556a4fb1 in gl_video_init (ra=0xfffffc70e9000000, log=0x376800007e2225ff, 
    g=0xfffffc80e9000000) at ../mpv-9999/video/out/gpu/video.c:4048
#1  0x00005555556af594 in preinit (vo=0x7ffff2dec210) at ../mpv-9999/video/out/vo_gpu.c:302
#2  0x00005555556ad98a in vo_thread (ptr=0x7ffff2dec210) at ../mpv-9999/video/out/vo.c:1102
#3  0x00007ffff7fb59be in ?? () from /lib/ld-musl-x86_64.so.1
#4  0x0000000000000000 in ?? ()
(gdb) bt full
#0  0x00005555556a4fb1 in gl_video_init (ra=0xfffffc70e9000000, log=0x376800007e2225ff, 
    g=0xfffffc80e9000000) at ../mpv-9999/video/out/gpu/video.c:4048
        p = 0xfffffc60e9000000
        opts = 0x396800007e1225ff
#1  0x00005555556af594 in preinit (vo=0x7ffff2dec210) at ../mpv-9999/video/out/vo_gpu.c:302
        p = 0x7ffff381ff40
        ctx_opts = 0x7ffff29601b0
        gl_opts = 0x7ffff2828860
        opts = {allow_sw = false, want_alpha = false, debug = false, probing = false, 
          context_name = 0x0, context_type = 0x7ffff37df9a0 "vulkan"}
#2  0x00005555556ad98a in vo_thread (ptr=0x7ffff2dec210) at ../mpv-9999/video/out/vo.c:1102
        vo = 0x7ffff2dec210
        in = 0x7ffff38054f0
        vo_paused = false
        r = 0
#3  0x00007ffff7fb59be in ?? () from /lib/ld-musl-x86_64.so.1
No symbol table info available.
#4  0x0000000000000000 in ?? ()
No symbol table info available.

mpv.log

Sample files

Every file I tried reproduced this, for example:

https://rumble.com/v3ia9v2--bach-the-brandenburg-concerto-no.-3-bwv-1048-.html

Dudemanguy commented 8 months ago

Sorry, seems like this one got lost. Does it also happen with gpu-next?

orbea commented 8 months ago

Yes, my default config uses gpu-next and I can still reproduce a segfault with:

/usr/bin/mpv --no-config --gpu-api=vulkan --vo=gpu-next

Dudemanguy commented 8 months ago

Could you make a backtrace using gpu-next? The vo_gpu one above points to a line that really shouldn't segfault and has nothing in particular to do with vulkan.

llyyr commented 8 months ago

You mention amdvlk but link a mesa commit, which driver are you using? If you're using amdvlk, can you try radv?

orbea commented 8 months ago

> Could you make a backtrace using gpu-next? The vo_gpu one above points to a line that really shouldn't segfault and has nothing in particular to do with vulkan.

Now it's segfaulting in mesa.

(gdb) bt
#0  0x00007ffff2199598 in nir_lower_io_to_vector_impl (impl=0x0, modes=0) at ../mesa-9999/src/compiler/nir/nir_lower_io_to_vector.c:411
#1  0x00007ffff219a070 in nir_lower_io_to_vector (shader=0x7fffea622030, modes=nir_var_shader_out) at ../mesa-9999/src/compiler/nir/nir_lower_io_to_vector.c:612
#2  0x00007ffff1f9f060 in radv_shader_spirv_to_nir (device=0x7ffff1c8c0d0, stage=0x7ffff2c8cc10, key=0x7ffff2c9e720, is_internal=false) at ../mesa-9999/src/amd/vulkan/radv_shader.c:514
#3  0x00007ffff1f6d479 in radv_graphics_shaders_compile (device=0x7ffff1c8c0d0, cache=0x7fffea625210, stages=0x7ffff2c85c50, pipeline_key=0x7ffff2c9e720, keep_executable_info=false, 
    keep_statistic_info=false, is_internal=false, retained_shaders=0x0, noop_fs=false, shaders=0x7fffeb898a60, binaries=0x7ffff2c85be0, gs_copy_shader=0x7fffeb898ad0, 
    gs_copy_binary=0x7ffff2c85b78) at ../mesa-9999/src/amd/vulkan/radv_pipeline_graphics.c:2515
#4  0x00007ffff1f6e5fd in radv_graphics_pipeline_compile (pipeline=0x7fffeb898a00, pCreateInfo=0x7ffff2c9ec10, pipeline_layout=0x7ffff2c9e4b0, device=0x7ffff1c8c0d0, 
    cache=0x7fffea625210, pipeline_key=0x7ffff2c9e720, lib_flags=15, fast_linking_enabled=false) at ../mesa-9999/src/amd/vulkan/radv_pipeline_graphics.c:2763
#5  0x00007ffff1f72290 in radv_graphics_pipeline_init (pipeline=0x7fffeb898a00, device=0x7ffff1c8c0d0, cache=0x7fffea625210, pCreateInfo=0x7ffff2c9ec10, extra=0x0)
    at ../mesa-9999/src/amd/vulkan/radv_pipeline_graphics.c:3974
#6  0x00007ffff1f7288b in radv_graphics_pipeline_create (_device=0x7ffff1c8c0d0, _cache=0x7fffea625210, pCreateInfo=0x7ffff2c9ec10, extra=0x0, pAllocator=0x0, pPipeline=0x7fffea7cd330)
    at ../mesa-9999/src/amd/vulkan/radv_pipeline_graphics.c:4073
#7  0x00007ffff1f72fea in radv_CreateGraphicsPipelines (_device=0x7ffff1c8c0d0, pipelineCache=0x7fffea625210, count=1, pCreateInfos=0x7ffff2c9ec10, pAllocator=0x0, 
    pPipelines=0x7fffea7cd330) at ../mesa-9999/src/amd/vulkan/radv_pipeline_graphics.c:4216
#8  0x00007ffff61d6c08 in vk_recreate_pipelines (vk=0x7ffff1d12190, pass=0x7fffea7cd290, derivable=true, base=0x0, out_pipe=0x7fffea7cd330)
    at ../libplacebo-9999/src/vulkan/gpu_pass.c:253
#9  0x00007ffff61d8925 in vk_pass_create (gpu=0x7ffff3896370, params=0x7ffff2c9f4a0) at ../libplacebo-9999/src/vulkan/gpu_pass.c:571
#10 0x00007ffff617c881 in pl_pass_create (gpu=0x7ffff3896370, params=0x7ffff2c9f4a0) at ../libplacebo-9999/src/gpu.c:1080
#11 0x00007ffff6168d43 in finalize_pass (dp=0x7ffff1da74e0, sh=0x7fffea7cd870, target=0x7fffea662d60, vert_idx=2, blend=0x0, load=false, vparams=0x0, proj=0x0)
    at ../libplacebo-9999/src/dispatch.c:926
#12 0x00007ffff616aafe in pl_dispatch_finish (dp=0x7ffff1da74e0, params=0x7ffff2ca0030) at ../libplacebo-9999/src/dispatch.c:1282
#13 0x00007ffff60d5767 in _img_tex (pass=0x7ffff2ca0920, img=0x7ffff2ca0518, tag=0x7ffff61e4d72 "src/renderer.c:1647") at src/renderer.c:518
#14 0x00007ffff60dbbfc in pass_read_image (pass=0x7ffff2ca0920) at src/renderer.c:1647
#15 0x00007ffff60e4f73 in pl_render_image (rr=0x7ffff1dbf270, pimage=0x7fffea7f9fc0, ptarget=0x7ffff2ca45e0, params=0x7ffff2ca44c0) at src/renderer.c:3106
#16 0x00007ffff60e846e in pl_render_image_mix (rr=0x7ffff1dbf270, images=0x7ffff2ca4380, ptarget=0x7ffff2ca45e0, params=0x7ffff2ca44c0) at src/renderer.c:3644
#17 0x00005555556e1181 in draw_frame (vo=0x7ffff38169a0, frame=0x7fffea664050) at ../mpv-9999/video/out/vo_gpu_next.c:1124
#18 0x00005555556b11e8 in render_frame (vo=0x7ffff38169a0) at ../mpv-9999/video/out/vo.c:933
#19 0x00005555556b180d in vo_thread (ptr=0x7ffff38169a0) at ../mpv-9999/video/out/vo.c:1066
#20 0x00007ffff7fb59be in ?? () from /lib/ld-musl-x86_64.so.1
#21 0x0000000000000000 in ?? ()

mpv-gpu-next.log

> You mention amdvlk but link a mesa commit, which driver are you using? If you're using amdvlk, can you try radv?

I don't know why I wrote amdvlk; I am using radv. Apologies for the confusion.

Dudemanguy commented 8 months ago

Shouldn't be our bug at least. Haven't tried that particular mesa commit myself.

orbea commented 3 months ago

As suggested in #dri-devel @ OFTC, building mpv with LDFLAGS=-Wl,-z,stack-size=0x200000 allows --gpu-api=vulkan to work again.

On glibc systems the default thread stack size is taken from RLIMIT_STACK, which is large enough for mpv, but with musl it's 128k, which causes the segfault.

So I think the correct fix is to explicitly set the stack size, but I am not sure which code base this needs to be done in.
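
A minimal sketch for checking these defaults on a given system, assuming a POSIX platform. The helper names `default_thread_stack_size` and `stack_rlimit` are made up for illustration and are not mpv functions. It relies on `pthread_attr_getstacksize` reporting the libc default for a freshly initialised attribute object, which glibc and musl both appear to do but which POSIX does not strictly promise.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical helper: stores the stack size that a default-initialised
 * pthread_attr_t reports. Returns 0 on success or a pthread error code. */
int default_thread_stack_size(size_t *out)
{
    assert(out != NULL);

    pthread_attr_t attr;
    int ret = pthread_attr_init(&attr);
    if (ret != 0)
        return ret;

    ret = pthread_attr_getstacksize(&attr, out);
    pthread_attr_destroy(&attr);
    return ret;
}

/* Hypothetical helper: stores the soft RLIMIT_STACK limit of the process.
 * The value may be RLIM_INFINITY. Returns 0 on success, -1 on failure. */
int stack_rlimit(rlim_t *out)
{
    assert(out != NULL);

    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;

    *out = rl.rlim_cur;
    return 0;
}
```

Printing both values from a small main() on a glibc system and on a musl system should make the difference described above visible.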

Dudemanguy commented 3 months ago

You can just add it to the link_flags in the meson.build

orbea commented 3 months ago

I wonder if perhaps there is a smarter way and I am unsure if a 2MB stack is an optimal value.

For example see this comment https://github.com/Rosalie241/RMG/issues/219#issuecomment-2018633845 for RMG which I now realize has a similar issue as mpv.

Dudemanguy commented 3 months ago

Yeah, I have no idea what a "good" value would be either, but looking it up, glibc apparently uses 8MiB, which is pretty large.

nekopsykose commented 3 months ago

even picking 8MiB for vo_thread only would work and not matter too much since it doesn't affect the other threads. though i looked at it briefly for 2 seconds and it would seemingly take a bit of refactoring to allow creating it with a passed pthread_attr_t (the existing #define mp_thread_create(t, f, a) pthread_create(t, NULL, f, a) plumbs through nothing; the NULL is where you would pass an initialised pthread_attr_t, etc)
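
The refactoring described above could look roughly like the following sketch. `thread_create_with_stack` and `touch_stack` are hypothetical names, not existing mpv functions; the only real APIs used are `pthread_attr_init`, `pthread_attr_setstacksize`, `pthread_create` and `pthread_attr_destroy`. How this would be wired into mp_thread_create is left open here.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper (not part of mpv): like pthread_create(), but with an
 * explicit stack size instead of the libc default.
 * Returns 0 on success or a pthread error code. */
int thread_create_with_stack(pthread_t *t, void *(*fn)(void *), void *arg,
                             size_t stack_size)
{
    assert(t != NULL && fn != NULL);

    pthread_attr_t attr;
    int ret = pthread_attr_init(&attr);
    if (ret != 0)
        return ret;

    /* Fails with EINVAL if stack_size is below PTHREAD_STACK_MIN. */
    ret = pthread_attr_setstacksize(&attr, stack_size);
    if (ret == 0)
        ret = pthread_create(t, &attr, fn, arg);

    /* The attribute object is no longer needed once the thread exists. */
    pthread_attr_destroy(&attr);
    return ret;
}

/* Demo worker: touches 512 KiB of stack, more than a 128 KiB stack holds.
 * Returns its argument if the buffer could be written, NULL otherwise. */
void *touch_stack(void *arg)
{
    volatile char buf[512 * 1024];
    for (size_t i = 0; i < sizeof(buf); i += 4096)
        buf[i] = 1;
    return buf[0] == 1 ? arg : NULL;
}
```

Calling `thread_create_with_stack(&t, touch_stack, arg, 8u * 1024 * 1024)` and then joining `t` gives the worker an 8MiB stack while leaving every other thread on the default size, which is the trade-off suggested above.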

nekopsykose commented 3 months ago

anecdotally speaking, i never had these segfaults on the default 128KiB in musl, so probably it's just really close to that limit and making it 2MiB is definitely enough for everyone.