FreeBSDDesktop / DEPRECATED-freebsd-base-graphics

Fork of FreeBSD's base repository to work on graphics-stack-related projects

Panic: &dev->struct_mutex not exclusively locked #169

Open vishwin opened 6 years ago

vishwin commented 6 years ago

After using the workaround in #163 to allow use of i915kms at all on my machine, I occasionally get random panics. They first manifest as X and the rest of the operating system hanging for a few seconds, followed by an indefinite black screen. This tends to happen when the Intel chip is under quite a bit of load, though it can happen at any time. I had not been able to get an automatic core dump until now.

Note that this is the tip of drm-next-4.10 as of this writing. I am sure drm-next is afflicted the same way.

ardmore dumped core - see /var/crash/vmcore.1

Tue Aug 29 22:59:42 EDT 2017

FreeBSD ardmore 12.0-CURRENT FreeBSD 12.0-CURRENT #0 cf2d7cc285b(drm-next-4.10)-dirty: Thu Aug 24 11:39:04 EDT 2017     root@ardmore:/usr/local/obj/usr/src/sys/GENERIC  amd64

panic: Lock v/drm/drm_drv.c:532-&dev->struct_mutex not exclusively locked @ /usr/src/sys/dev/drm/i915/i915_gem.c:3549

GNU gdb (GDB) 8.0 [GDB v8.0 for FreeBSD]
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd12.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /boot/kernel/kernel...Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...done.
done.

Unread portion of the kernel message buffer:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer = 0x20:0xffffffff80ecc314
stack pointer           = 0x28:0xfffffe0232d50910
frame pointer           = 0x28:0xfffffe0232d50910
code segment        = base 0x0, limit 0xfffff, type 0x1b
            = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags    = interrupt enabled, resume, IOPL = 0
current process     = 0 (linuxkpi_short_wq_1)
Uptime: 2h43m19s
Dumping 843 out of 8042 MB:..2%..12%..21%..31%..42%..52%..61%..71%..82%..92%

__curthread () at ./machine/pcpu.h:232
232     __asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb) #0  __curthread () at ./machine/pcpu.h:232
#1  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:318
#2  0xffffffff80a660a5 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:386
#3  0xffffffff80a66696 in vpanic (fmt=<optimized out>, ap=0xfffffe0232d50290)
    at /usr/src/sys/kern/kern_shutdown.c:787
#4  0xffffffff80a666e3 in panic (fmt=<unavailable>)
    at /usr/src/sys/kern/kern_shutdown.c:710
#5  0xffffffff80a6ef02 in _sx_assert (sx=<optimized out>, 
    what=<optimized out>, file=<unavailable>, line=<unavailable>)
    at /usr/src/sys/kern/kern_sx.c:1232
#6  0xffffffff853d5362 in i915_gem_object_pin_to_display_plane (
    obj=0xfffff80009d9e000, alignment=4096, view=0xfffffe0232d50308)
    at /usr/src/sys/dev/drm/i915/i915_gem.c:3549
#7  0xffffffff854192f1 in intel_pin_and_fence_fb_obj (fb=0xfffff80005944b00, 
    rotation=1) at /usr/src/sys/dev/drm/i915/intel_display.c:2213
#8  0xffffffff85421ee7 in intel_prepare_plane_fb (plane=0xfffff800055c7000, 
    new_state=0xfffff800055bd900)
    at /usr/src/sys/dev/drm/i915/intel_display.c:14789
#9  0xffffffff854f768a in drm_atomic_helper_prepare_planes (
    dev=<optimized out>, state=0xfffff80200986800)
    at /usr/src/sys/dev/drm/drm_atomic_helper.c:1670
#10 0xffffffff85434d66 in intel_atomic_prepare_commit (state=<optimized out>, 
    dev=<optimized out>) at /usr/src/sys/dev/drm/i915/intel_display.c:14137
#11 intel_atomic_commit (dev=0xfffffe00025df000, state=<optimized out>, 
    nonblock=<optimized out>)
    at /usr/src/sys/dev/drm/i915/intel_display.c:14579
#12 0xffffffff85519fd8 in restore_fbdev_mode_atomic (
    fb_helper=<optimized out>) at /usr/src/sys/dev/drm/drm_fb_helper.c:379
#13 restore_fbdev_mode (fb_helper=<optimized out>)
    at /usr/src/sys/dev/drm/drm_fb_helper.c:406
#14 drm_fb_helper_restore_fbdev_mode_unlocked (fb_helper=<optimized out>)
    at /usr/src/sys/dev/drm/drm_fb_helper.c:462
#15 0xffffffff8553a9e9 in vt_kms_postswitch (arg=<optimized out>)
    at /usr/src/sys/dev/drm/linux_fb.c:84
#16 0xffffffff808ee71b in vt_window_switch (
    vw=0xffffffff8179b068 <vt_conswindow>)
    at /usr/src/sys/dev/vt/vt_core.c:540
#17 0xffffffff808ebee0 in vtterm_cngrab (tm=<optimized out>)
    at /usr/src/sys/dev/vt/vt_core.c:1507
#18 0xffffffff80a08fa2 in cngrab () at /usr/src/sys/kern/kern_cons.c:368
#19 0xffffffff80aa8d09 in kdb_trap (type=9, code=0, tf=<optimized out>)
    at /usr/src/sys/kern/subr_kdb.c:651
#20 0xffffffff80ef09db in trap_fatal (frame=0xfffffe0232d50850, eva=0)
    at /usr/src/sys/amd64/amd64/trap.c:794
#21 0xffffffff80ef005d in trap (frame=0xfffffe0232d50850)
    at /usr/src/sys/amd64/amd64/trap.c:200
#22 <signal handler called>
#23 atomic_fetchadd_int (p=0xdeadc0dedeadc0de, v=4294967295)
    at ./machine/atomic.h:227
#24 0xffffffff853d993f in refcount_release (count=0xdeadc0dedeadc0de)
    at /usr/src/sys/sys/refcount.h:62
#25 kref_put (kref=<optimized out>, rel=<optimized out>)
    at /usr/src/sys/compat/linuxkpi/common/include/linux/kref.h:66
#26 dma_fence_put (fence=0xdeadc0dedeadc0de)
    at /usr/src/sys/compat/linuxkpi/gplv2/include/linux/dma-fence.h:225
#27 reservation_object_fini (obj=<optimized out>)
    at /usr/src/sys/compat/linuxkpi/gplv2/include/linux/reservation.h:104
#28 __i915_gem_free_objects (i915=0xfffffe00025df000, freed=<optimized out>)
    at /usr/src/sys/dev/drm/i915/i915_gem.c:4218
#29 0xffffffff853d7068 in __i915_gem_free_work (work=<optimized out>)
    at /usr/src/sys/dev/drm/i915/i915_gem.c:4251
#30 0xffffffff85588711 in linux_work_fn (context=0xfffffe00025e2068, 
    pending=<optimized out>)
    at /usr/src/sys/compat/linuxkpi/common/src/linux_work.c:243
#31 0xffffffff80abb8ad in taskqueue_run_locked (queue=0xfffff8000717a100)
    at /usr/src/sys/kern/subr_taskqueue.c:463
#32 0xffffffff80abc668 in taskqueue_thread_loop (arg=<optimized out>)
    at /usr/src/sys/kern/subr_taskqueue.c:755
#33 0xffffffff80a28334 in fork_exit (
    callout=0xffffffff80abc5e0 <taskqueue_thread_loop>, 
    arg=0xfffff80009d45440, frame=0xfffffe0232d50ac0)
    at /usr/src/sys/kern/kern_fork.c:1038
#34 <signal handler called>
(kgdb) 

vishwin commented 6 years ago

This seems to only happen when using x11-drivers/xf86-video-intel; software rendering with the modesetting driver hasn't resulted in a panic like this, yet.

nomadlogic commented 6 years ago

If possible, can you verify whether glamoregl is enabled when using modesetting? In my experience, modesetting+glamoregl tends to be pretty stable compared to the intel driver, which I believe is deprecated upstream by Xorg.

For example, if I remove my pre-existing xorg.conf, X autodetects the hardware and sets up modesetting+glamoregl correctly.
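
One way to check, as a minimal sketch rather than an official tool: scan the X server log for the glamoregl module line and for any (EE) lines that mention glamor or EGL. The function name `glamor_status`, the regular expression, and the `SAMPLE` text are assumptions made for illustration; the sample lines only mimic the usual Xorg log layout.

```python
import re

# Matches Xorg log lines of the form "[  3180.233] (EE) message".
LINE_RE = re.compile(
    r"^\[\s*[\d.]+\]\s+\((?P<level>[A-Z!?*=+-]{2})\)\s+(?P<msg>.*)$"
)


def glamor_status(log_text):
    """Return (loaded, errors).

    loaded: True if an (II) "Module glamoregl" line was seen.
    errors: messages of (EE) lines that mention glamor or EGL.
    """
    loaded = False
    errors = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # continuation lines carry no (XX) marker
        level, msg = m.group("level"), m.group("msg")
        if level == "II" and msg.startswith("Module glamoregl"):
            loaded = True
        if level == "EE" and re.search(r"glamor|egl", msg, re.IGNORECASE):
            errors.append(msg)
    return loaded, errors


# Illustrative input only; real logs contain many more lines.
SAMPLE = """\
[  3180.222] (II) Module glamoregl: vendor="X.Org Foundation"
[  3180.222]    compiled for 1.18.4, module version = 1.0.0
[  3180.233] (EE) modeset(0): eglInitialize() failed
[  3180.233] (EE) modeset(0): glamor initialization failed
[  3181.703] (EE) AIGLX: reverting to software rendering
"""

loaded, errors = glamor_status(SAMPLE)
print("glamoregl module loaded:", loaded)
for e in errors:
    print("error:", e)
```

To run it against a real log, read the file and pass its text to `glamor_status`; /var/log/Xorg.0.log is the usual location, but that path is an assumption about your setup. A module that loads yet still reports (EE) lines here means glamor is present but not actually in use.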

vishwin commented 6 years ago

This is without any xorg.conf at all. The glamoregl module loads, but it doesn't seem to initialize:

[  3180.216] (II) Loading sub module "glamoregl"
[  3180.216] (II) LoadModule: "glamoregl"
[  3180.217] (II) Loading /usr/local/lib/xorg/modules/libglamoregl.so
[  3180.222] (II) Module glamoregl: vendor="X.Org Foundation"
[  3180.222]    compiled for 1.18.4, module version = 1.0.0
[  3180.222]    ABI class: X.Org ANSI C Emulation, version 0.4
[  3180.222] (II) glamor: OpenGL accelerated X.org driver based.
[  3180.233] (EE) modeset(0): eglInitialize() failed
[  3180.233] (EE) modeset(0): glamor initialization failed

Further down the log:

[  3181.703] (II) AIGLX: Screen 0 is not DRI2 capable
[  3181.703] (EE) AIGLX: reverting to software rendering
[  3181.731] (II) AIGLX: enabled GLX_MESA_copy_sub_buffer
[  3181.731] (II) AIGLX: Loaded and initialized swrast
[  3181.731] (II) GLX: Initialized DRISWRAST GL provider for screen 0

nomadlogic commented 6 years ago

Hm, weird. Dumb question: do you have the mesa-dri package installed?

vishwin commented 6 years ago

Yes I do. Pulled in as a dependency of xorg-server actually.

vishwin commented 6 years ago

I somehow got GLX to work under modesetting, and a few hours later, under the same circumstances as the original report, bam: kernel panic. There is no crash log this time, and as with every other kernel panic of this nature, there was no other way into the system (such as SSH). I am sure the crash log would be the same even though xf86-video-intel was not in use this time.

jbeich commented 6 years ago

> Note that this is the tip of drm-next-4.10 as of this writing. I am sure drm-next is afflicted the same way.

Can you try regular drm-next? I frequently see a similar crash when using VAAPI that only affects drm-next-4.10.

vishwin commented 6 years ago

I'll give it a whirl when drm-next gets resynced with -CURRENT. Haven't had a panic since my last comment, though. I can semi-reliably trigger it by rendering a video in Blender; the panic has occurred just over four hours in.