Closed asomers closed 2 days ago
As far as I can see, this is a lock recursion on the source file of copy_file_range(). The first lock is taken by zfs_freebsd_copy_file_range(), which calls zfs_clone_range(); the second lock is taken by zn_flush_cached_data(), which is called by zfs_clone_range(). It seems we need some reorganization there. The problem should happen only if mmap() and copy_file_range() are mixed on the same file.
> The problem should happen only if mmap() and copy_file_range() are mixed on the same file.
Yes, fsx does that. It mixes ordinary writes, mmap writes, and copy_file_range writes on the same file.
It seems zn_flush_cached_data() takes the vnode lock because its other caller, zfs_ioctl(), does not lock it on FreeBSD (unlike on Linux). I think the proper solution would be to take the lock there instead, though this is not my strongest area. Meanwhile, from the other side, it seems to help to change zfs_freebsd_copy_file_range() from LK_EXCLUSIVE to LK_SHARED, which should be good by itself.
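To make the recursion concrete, here is a toy, single-threaded model in C. It is only a sketch, not the FreeBSD lockmgr or the ZFS code: every toy_* name is hypothetical, it assumes (inferred from the excl->share panic string) that the flush step requests the vnode lock in shared mode, and it assumes that a shared request on top of an existing shared hold is accepted.

```c
#include <assert.h>

/* Toy stand-in for a vnode lock; NOT the FreeBSD lockmgr API. */
enum toy_mode { TOY_SHARED, TOY_EXCLUSIVE };

struct toy_vnode {
    int excl_held;   /* 1 while the exclusive lock is held */
    int share_count; /* number of shared holds */
};

/* Returns 0 on success, -1 where a checker such as witness would complain. */
static int toy_lock(struct toy_vnode *vp, enum toy_mode mode)
{
    if (vp->excl_held)
        return -1;              /* any request while exclusively held */
    if (mode == TOY_EXCLUSIVE) {
        if (vp->share_count > 0)
            return -1;          /* upgrades are not modelled */
        vp->excl_held = 1;
    } else {
        vp->share_count++;      /* assumption: shared holds may stack */
    }
    return 0;
}

static void toy_unlock(struct toy_vnode *vp, enum toy_mode mode)
{
    if (mode == TOY_EXCLUSIVE) {
        assert(vp->excl_held);
        vp->excl_held = 0;
    } else {
        assert(vp->share_count > 0);
        vp->share_count--;
    }
}

/* Stand-in for the flush step: takes the lock shared (assumption). */
static int toy_flush_cached_data(struct toy_vnode *vp)
{
    if (toy_lock(vp, TOY_SHARED) != 0)
        return -1;
    toy_unlock(vp, TOY_SHARED);
    return 0;
}

/* Stand-in for the copy path: outer lock first, then the flush step. */
static int toy_copy_file_range(struct toy_vnode *vp, enum toy_mode outer)
{
    int rc;

    if (toy_lock(vp, outer) != 0)
        return -1;
    rc = toy_flush_cached_data(vp);
    toy_unlock(vp, outer);
    return rc;
}
```

In this model, toy_copy_file_range(&vp, TOY_EXCLUSIVE) returns -1 because the inner shared request arrives while the exclusive lock is held, whereas TOY_SHARED returns 0. That mirrors why taking the outer lock with LK_SHARED appears to help.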
@asomers I kind of fixed it twice, so it would be nice if you could test and review https://github.com/openzfs/zfs/pull/16796 and https://github.com/openzfs/zfs/pull/16797 .
System information
Describe the problem you're observing
I can immediately reproduce a panic: excl->share on FreeBSD 15.0-CURRENT with witness enabled by using fsx to copy part of a file to another offset of the same file, with copy_file_range. Note that zpool create by default enables the block_cloning feature. If I disable that with zpool create -o feature@block_cloning=disabled then I cannot reproduce the crash.
Describe how to reproduce the problem
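A minimal sketch of the access pattern described in this report, in C: dirty part of a file through a shared mmap() mapping, then use copy_file_range() to copy that part to another offset of the same file. This is not fsx and not the reporter's exact steps. The helper name and its parameters are hypothetical, and whether this alone triggers the panic has not been confirmed in this thread.

```c
#define _GNU_SOURCE             /* exposes copy_file_range() on glibc */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `len` bytes at offset 0 of `path` through a
 * shared mapping, then copy them to offset `len` of the same file.
 * `len` should be a multiple of the page size.
 * Returns 0 if the copied range matches the source, -1 on any error.
 */
static int mmap_then_copy_same_file(const char *path, size_t len)
{
    int rc = -1;
    char *map = MAP_FAILED;
    char *buf = NULL;
    off_t in = 0, out = (off_t)len;
    size_t left = len;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)(2 * len)) != 0)
        goto done;
    map = mmap(NULL, 2 * len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto done;
    memset(map, 'A', len);      /* mmap write: dirty pages, no msync() */

    /* Source and destination ranges are in the same file, non-overlapping. */
    while (left > 0) {
        ssize_t n = copy_file_range(fd, &in, fd, &out, left, 0);
        if (n <= 0)
            goto done;
        left -= (size_t)n;
    }

    /* Read the destination range back and compare it with the source. */
    buf = malloc(len);
    if (buf == NULL || pread(fd, buf, len, (off_t)len) != (ssize_t)len)
        goto done;
    rc = (memcmp(buf, map, len) == 0) ? 0 : -1;
done:
    free(buf);
    if (map != MAP_FAILED)
        munmap(map, 2 * len);
    close(fd);
    return rc;
}
```

Calling mmap_then_copy_same_file() on a file in a dataset with block_cloning enabled exercises the mmap-then-clone path on one file. On a witness-enabled kernel affected by this bug, the expectation is a panic rather than a return value.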
Include any warning/errors/backtraces from the system logs
Links
Downstream bug report: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=282878