Closed mmakassikis closed 9 months ago
@mmakassikis Good catch! We need to remove the mnt_want_write()/mnt_drop_write() calls everywhere ksmbd_vfs_kern_path_create() is used, as well as in ksmbd_vfs_mkdir(). Can you send a patch to the list for this?
After removing the mnt_want_write()/mnt_drop_write() calls in the vfs helpers that use ksmbd_vfs_kern_path_create(), I tested a simple mkdir using smbclient and I see a different lockdep warning.
I'm not sure what the issue is this time, though. Removing the mnt_want_write() call in ksmbd_vfs_setxattr() silences the warning, but that feels like sweeping the issue under the rug.
Any idea what might be wrong?
lockdep warning:
ksmbd: SMB2 data length 10 offset 120
ksmbd: SMB2 len 130
ksmbd: converted name = dir-0
ksmbd: can not get linux path for dir-0, rc = -2
ksmbd: file does not exist, so creating
ksmbd: creating directory
ksmbd: inherit posix acl failed : -2
======================================================
WARNING: possible circular locking dependency detected
6.6.0-rc5+ #798 Not tainted
------------------------------------------------------
kworker/1:0/22 is trying to acquire lock:
ffff8880063163f8 (sb_writers#5){.+.+}-{0:0}, at: ksmbd_vfs_setxattr+0x38/0xd0
but task is already holding lock:
ffff8880057ca1d0 (&type->i_mutex_dir_key#3/1){+.+.}-{3:3}, at: ksmbd_vfs_path_lookup_locked+0x163/0x2f0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&type->i_mutex_dir_key#3/1){+.+.}-{3:3}:
lock_acquire.part.0+0x125/0x2d0
lock_acquire+0x93/0x160
down_write_nested+0x84/0x190
do_rmdir+0x1ad/0x2d0
__x64_sys_rmdir+0x63/0x80
do_syscall_64+0x3c/0x90
entry_SYSCALL_64_after_hwframe+0x6e/0xd8
-> #0 (sb_writers#5){.+.+}-{0:0}:
check_prev_add+0x1c7/0x1500
__lock_acquire+0xecc/0x1060
lock_acquire.part.0+0x125/0x2d0
lock_acquire+0x93/0x160
mnt_want_write+0x49/0x220
ksmbd_vfs_setxattr+0x38/0xd0
ksmbd_vfs_set_dos_attrib_xattr+0xc7/0x110
smb2_new_xattrs+0x186/0x1d0
smb2_open+0x316a/0x3740
__process_request+0x151/0x310
__handle_ksmbd_work+0x33c/0x520
handle_ksmbd_work+0x4a/0xd0
process_one_work+0x4a7/0x980
worker_thread+0x365/0x570
kthread+0x18d/0x1d0
ret_from_fork+0x38/0x70
ret_from_fork_asm+0x1b/0x30
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&type->i_mutex_dir_key#3/1);
lock(sb_writers#5);
lock(&type->i_mutex_dir_key#3/1);
rlock(sb_writers#5);
*** DEADLOCK ***
3 locks held by kworker/1:0/22:
#0: ffff8880063e4938 ((wq_completion)ksmbd-io){+.+.}-{0:0}, at: process_one_work+0x40e/0x980
#1: ffff8880052efdd0 ((work_completion)(&work->work)){+.+.}-{0:0}, at: process_one_work+0x40e/0x980
#2: ffff8880057ca1d0 (&type->i_mutex_dir_key#3/1){+.+.}-{3:3}, at: ksmbd_vfs_path_lookup_locked+0x163/0x2f0
stack backtrace:
CPU: 1 PID: 22 Comm: kworker/1:0 Not tainted 6.6.0-rc5+ #798
Workqueue: ksmbd-io handle_ksmbd_work
Call Trace:
<TASK>
dump_stack_lvl+0x4f/0x90
dump_stack+0x14/0x20
print_circular_bug+0x138/0x160
check_noncircular+0x292/0x2f0
? __pfx_check_noncircular+0x10/0x10
? mark_held_locks+0x6b/0x90
? __stack_depot_save+0x266/0x370
? add_chain_block+0x2a2/0x4a0
check_prev_add+0x1c7/0x1500
? check_deadlock+0x169/0x3b0
__lock_acquire+0xecc/0x1060
? smb2_open+0x316a/0x3740
? __handle_ksmbd_work+0x33c/0x520
? handle_ksmbd_work+0x4a/0xd0
? __pfx___lock_acquire+0x10/0x10
? __lock_release+0x13f/0x290
? __pfx___lock_release+0x10/0x10
? __pfx_do_raw_spin_lock+0x10/0x10
lock_acquire.part.0+0x125/0x2d0
? ksmbd_vfs_setxattr+0x38/0xd0
? __pfx_lock_acquire.part.0+0x10/0x10
? kasan_set_track+0x29/0x40
? kasan_save_alloc_info+0x1f/0x30
? strlen+0x13/0x50
lock_acquire+0x93/0x160
? ksmbd_vfs_setxattr+0x38/0xd0
mnt_want_write+0x49/0x220
? ksmbd_vfs_setxattr+0x38/0xd0
ksmbd_vfs_setxattr+0x38/0xd0
ksmbd_vfs_set_dos_attrib_xattr+0xc7/0x110
? __pfx_ksmbd_vfs_set_dos_attrib_xattr+0x10/0x10
? generic_fillattr+0x269/0x2f0
smb2_new_xattrs+0x186/0x1d0
? __pfx_smb2_new_xattrs+0x10/0x10
? __pfx_ksmbd_UnixTimeToNT+0x10/0x10
? vfs_getattr+0x36/0x50
smb2_open+0x316a/0x3740
? __pfx_smb2_open+0x10/0x10
? __lock_release+0x13f/0x290
? smb2_validate_credit_charge+0x25d/0x360
? __pfx___lock_release+0x10/0x10
? do_raw_spin_lock+0x127/0x1c0
? __pfx_do_raw_spin_lock+0x10/0x10
? do_raw_spin_unlock+0xac/0x110
? _raw_spin_unlock+0x22/0x50
__process_request+0x151/0x310
__handle_ksmbd_work+0x33c/0x520
? __pfx___handle_ksmbd_work+0x10/0x10
handle_ksmbd_work+0x4a/0xd0
process_one_work+0x4a7/0x980
? __pfx_process_one_work+0x10/0x10
? assign_work+0xe1/0x120
worker_thread+0x365/0x570
? __pfx_worker_thread+0x10/0x10
kthread+0x18d/0x1d0
? __pfx_kthread+0x10/0x10
ret_from_fork+0x38/0x70
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1b/0x30
</TASK>
I'm not sure, but smb2_open() takes the inode lock, and ksmbd_vfs_setxattr() is called inside smb2_open(). So I think you can create a __ksmbd_vfs_setxattr() that does not call mnt_want_write():
ksmbd_vfs_setxattr()
{
	mnt_want_write()
	__ksmbd_vfs_setxattr()
	mnt_drop_write()
}
@mmakassikis I have checked your patch on the mailing list. Will you send the patch for ksmbd_vfs_setxattr()?
@mmakassikis Can you check whether the problem is fixed?
git clone https://github.com/namjaejeon/ksmbd --branch=lockdep_warn
I think the problem is fixed now. Let me know what you find.
Hello,
On a mainline kernel, Linux 6.6-rc5 (commit 94f6f0550c) compiled with lockdep, running some workloads triggers a "possible recursive locking detected" warning.
For example, running the smb2.rename test from Samba with the following command:
../bin/smbtorture //192.168.1.30/testshare -U testuser%tespass smb2.rename
Partial dmesg log below with the ksmbd logs and the lockdep warning + backtrace.
The locking attempts are:
In both cases, these are calls to
mnt_want_write(path.mnt)
which locks sb_writers with sb_start_write().
The call stack is as follows:
There is no call to mnt_drop_write() (direct or through done_path_create()), so lockdep sees a recursion. Reverting 40b268d38 ("ksmbd: add mnt_want_write to ksmbd vfs functions") silences lockdep.