intel / ipu6-drivers

GNU General Public License v2.0
172 stars 52 forks source link

ipu6-psys: Fix possible deadlock with kernel 6.2 #83

Closed jwrdegoede closed 1 year ago

jwrdegoede commented 1 year ago

With kernel 6.2 the following possible deadlock gets reported:

[  765.679269] ====================================================== [  765.679270] WARNING: possible circular locking dependency detected
[  765.679271] 6.2.0-rc5+ #40 Tainted: G           O
[  765.679272] ------------------------------------------------------
[  765.679272] camerasrc0:src/6354 is trying to acquire lock:
[  765.679273] ffff992aa6ad8158 (&mm->mmap_lock){++++}-{3:3}, at: ipu_dma_buf_map+0x16c/0x400 [intel_ipu6_psys]
[  765.679281]
               but task is already holding lock:
[  765.679282] ffff992aa6059dc8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: dma_buf_map_attachment_unlocked+0x3d/0x90
[  765.679286]
               which lock already depends on the new lock.

[  765.679287]
               the existing dependency chain (in reverse order) is:
[  765.679287]
               -> #1 (reservation_ww_class_mutex){+.+.}-{3:3}:
[  765.679289]        __ww_mutex_lock.constprop.0+0xbc/0xfb0
[  765.679292]        ww_mutex_lock_interruptible+0x38/0xa0
[  765.679293]        vm_fault_cpu+0x32/0x1a0 [i915]
[  765.679396]        __do_fault+0x30/0x160
[  765.679399]        do_fault+0x2bf/0x440
[  765.679400]        __handle_mm_fault+0x671/0xfb0
[  765.679401]        handle_mm_fault+0x16b/0x410
[  765.679403]        do_user_addr_fault+0x1e0/0x6b0
[  765.679406]        exc_page_fault+0x7e/0x2b0
[  765.679408]        asm_exc_page_fault+0x22/0x30
[  765.679411]
               -> #0 (&mm->mmap_lock){++++}-{3:3}:
[  765.679412]        __lock_acquire+0x12fd/0x1fd0
[  765.679415]        lock_acquire+0xbf/0x2b0
[  765.679416]        down_read+0x3e/0x50
[  765.679418]        ipu_dma_buf_map+0x16c/0x400 [intel_ipu6_psys]
[  765.679423]        __map_dma_buf+0x1e/0x90
[  765.679424]        dma_buf_map_attachment+0xc3/0x120
[  765.679426]        dma_buf_map_attachment_unlocked+0x47/0x90
[  765.679427]        ipu_psys_mapbuf_locked+0x116/0x430 [intel_ipu6_psys]
[  765.679431]        ipu_psys_ioctl+0x1b1/0x4f0 [intel_ipu6_psys]
[  765.679435]        __x64_sys_ioctl+0x8d/0xd0
[  765.679436]        do_syscall_64+0x58/0x80
[  765.679438]        entry_SYSCALL_64_after_hwframe+0x72/0xdc
[  765.679440]
               other info that might help us debug this:

[  765.679441]  Possible unsafe locking scenario:

[  765.679441]        CPU0                    CPU1
[  765.679442]        ----                    ----
[  765.679442]   lock(reservation_ww_class_mutex);
[  765.679443]                                lock(&mm->mmap_lock);
[  765.679444]                                lock(reservation_ww_class_mutex);
[  765.679445]   lock(&mm->mmap_lock);
[  765.679446]
                *** DEADLOCK ***

This is caused by ipu_psys_mapbuf_locked() calling dma_buf_map_attachment_unlocked() which takes reservation_ww_class_mutex and then calls back into ipu_dma_buf_map() which calls ipu_psys_get_userpages() which takes mm->mmap_lock.

So the IPU6 code takes locks in the following order:

  1. lock(reservation_ww_class_mutex)
  2. lock(&mm->mmap_lock)

But the core memory-management code takes these in the reverse oder causing an ABBA deadlock.

Fix this by moving the ipu_psys_get_userpages() call to attach time (to ipu_dma_buf_attach()) like how other dmabuf code in the kernel does this.

As a bonus this also allows properly propagating the error code from ipu_psys_get_userpages().

jwrdegoede commented 1 year ago

Cc: @mrhpearson, @vicamo