Open problame opened 3 years ago
I familiarized myself with what truncate_inode_pages_range actually does.
ISTM that the following sequence might lead to a deadlock:
fallocate() -> ... -> zfs_freesp() -> zfs_free_range() -> truncate_inode_pages_range() -> wait_on_page_bit()
zfs_free_range() holds the range lock while it does this: https://github.com/zfsonlinux/zfs/blob/1a0b4f566c4e2948e8df0ce43880ebf6123bad8c/module/os/linux/zfs/zfs_znode.c#L1608
The pages are written back through the znode's i_mapping->writepages, which is zpl_writepages().
That calls zpl_putpage() -> zfs_putpage(), which also wants to enter the rangelock: https://github.com/zfsonlinux/zfs/blob/1c2358c12a673759845f70c57dade601cc12ed99/module/os/linux/zfs/zfs_vnops_os.c#L3522-L3523

@problame your analysis looks correct to me, that's exactly how this deadlock can be hit. Perhaps one reasonable way to handle this is to just release the range lock in zfs_free_range() prior to truncating the page cache pages. The comment here regarding why we're taking the range lock seems overly strict. While it's true we need to keep the ARC and page cache in sync, this is already handled in zfs_read() and zfs_write(), both of which are careful to take the needed page locks when accessing/updating their respective memory mapped pages. Furthermore, any memory mapped page writes which happen concurrent with the truncate should behave exactly as before. They will be written when the dirtied address space is written out via zpl_writepages().
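The lock cycle and the proposed change can be illustrated with a small userspace model. This is only a sketch under loud assumptions: every name in it (model_rangelock, model_tryenter, model_exit, model_putpage, model_free_range) is hypothetical and not part of ZFS, the range lock is reduced to a single flag, and writeback is run inline in the same thread instead of in a separate writeback context. A "try" operation is used so the would-be deadlock is reported as a return value rather than hanging.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the znode's range lock: one flag, no real blocking. */
struct model_rangelock {
	bool held;
};

/* Try to enter the lock; false means a real caller would block here. */
bool model_tryenter(struct model_rangelock *rl)
{
	if (rl->held)
		return false;
	rl->held = true;
	return true;
}

/* Leave the lock; it must currently be held. */
void model_exit(struct model_rangelock *rl)
{
	assert(rl->held);
	rl->held = false;
}

/*
 * Models zfs_putpage(): writeback has to enter the range lock.
 * Returns true if writeback completed, false if it would block.
 */
bool model_putpage(struct model_rangelock *rl)
{
	if (!model_tryenter(rl))
		return false;
	model_exit(rl);
	return true;
}

/*
 * Models zfs_free_range() followed by the page cache truncate, which
 * waits for writeback of the affected pages.
 * Returns true if writeback completed, false if the wait would never end.
 */
bool model_free_range(struct model_rangelock *rl, bool drop_lock_first)
{
	bool done;

	if (!model_tryenter(rl))
		return false;
	if (drop_lock_first) {
		/* Proposed ordering: release the range lock, then truncate. */
		model_exit(rl);
		done = model_putpage(rl);
	} else {
		/* Current ordering: truncate while still holding the lock. */
		done = model_putpage(rl);
		model_exit(rl);
	}
	return done;
}
```

With a fresh `struct model_rangelock rl = { false };`, `model_free_range(&rl, false)` returns false (writeback cannot enter the lock that the truncating path still holds), while `model_free_range(&rl, true)` returns true. The model says nothing about whether dropping the lock early keeps the ARC and page cache consistent; that is the question the discussion here is about.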
> While it's true we need to keep the ARC and page cache in sync this is already handled in zfs_read() and zfs_write(). Both of which are careful to take the needed page locks when accessing/updating their respective memory mapped pages.
Just to be clear, you are referring to this code?
> Furthermore, any memory mapped page writes which happen concurrent with the truncate should behave exactly as before. They will be written when the dirtied address space is written out via zpl_writepages().
Seems reasonable. A truncate syscall would call truncate_inode_pages_range() so things are kept in sync.
> Just to be clear, you are referring to this code?
Yes, and here for reads:
This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.
I wouldn't be surprised if this is still an issue. Don't have a setup handy to repro this. @behlendorf would it be worth investigating and fixing this?
Yes, I suspect this is still an issue and one worth investigating. Particularly if we have a consistent reproducer.
ZFS master deadlocks when running xfstests test generic/013.
dmesg
for i in $(pgrep fsstress); do cat /proc/$i/stack; done
Analysis
zil_commit_impl() calls are blocked waiting on a mutex, presumably zl_lock. They showed up over time and are presumably irrelevant for this bug.
Describe how to reproduce the problem
xfstests from https://github.com/kdave/xfstests.git
make
local.config in the checkout directory