Closed EchterAgo closed 1 year ago
# Child-SP RetAddr Call Site
00 ffff8202`30ba9460 fffff803`4a41bca0 nt!KiSwapContext+0x76
01 ffff8202`30ba95a0 fffff803`4a41b1cf nt!KiSwapThread+0x500
02 ffff8202`30ba9650 fffff803`4a4f7eee nt!KiCommitThreadWait+0x14f
03 ffff8202`30ba96f0 fffff803`54068eba nt!KeWaitForMultipleObjects+0x2be
04 ffff8202`30ba9800 fffff803`5410504d OpenZFS!spl_cv_wait+0xea [H:\dev\openzfs\module\os\windows\spl\spl-condvar.c @ 120]
05 ffff8202`30ba9890 fffff803`54105745 OpenZFS!rrw_enter_write+0xed [H:\dev\openzfs\module\zfs\rrwlock.c @ 219]
06 ffff8202`30ba98d0 fffff803`541056ab OpenZFS!rrm_enter_write+0x35 [H:\dev\openzfs\module\zfs\rrwlock.c @ 371]
07 ffff8202`30ba9910 fffff803`54380cfd OpenZFS!rrm_enter+0x3b [H:\dev\openzfs\module\zfs\rrwlock.c @ 348]
08 ffff8202`30ba9950 fffff803`54380b19 OpenZFS!zfsvfs_teardown+0xcd [H:\dev\openzfs\module\os\windows\zfs\zfs_vfsops.c @ 1458]
09 ffff8202`30ba99b0 fffff803`543c868f OpenZFS!zfs_vfs_unmount+0x209 [H:\dev\openzfs\module\os\windows\zfs\zfs_vfsops.c @ 1653]
0a ffff8202`30ba9b40 fffff803`5437c575 OpenZFS!zfs_windows_unmount+0x41f [H:\dev\openzfs\module\os\windows\zfs\zfs_vnops_windows_mount.c @ 1581]
0b ffff8202`30baa430 fffff803`540838d6 OpenZFS!zfs_ioc_unmount+0x55 [H:\dev\openzfs\module\os\windows\zfs\zfs_ioctl_os.c @ 916]
0c ffff8202`30baa470 fffff803`5437c3a5 OpenZFS!zfsdev_ioctl_common+0x816 [H:\dev\openzfs\module\zfs\zfs_ioctl.c @ 7866]
0d ffff8202`30baa550 fffff803`5435f06d OpenZFS!zfsdev_ioctl+0x2c5 [H:\dev\openzfs\module\os\windows\zfs\zfs_ioctl_os.c @ 866]
0e ffff8202`30baa640 fffff803`5435e976 OpenZFS!ioctlDispatcher+0x32d [H:\dev\openzfs\module\os\windows\zfs\zfs_vnops_windows.c @ 6409]
0f ffff8202`30baa710 fffff803`4a410665 OpenZFS!dispatcher+0x1e6 [H:\dev\openzfs\module\os\windows\zfs\zfs_vnops_windows.c @ 7321]
10 ffff8202`30baa800 fffff803`4a80142c nt!IofCallDriver+0x55
11 ffff8202`30baa840 fffff803`4a801081 nt!IopSynchronousServiceTail+0x34c
12 ffff8202`30baa8e0 fffff803`4a8003f6 nt!IopXxxControlFile+0xc71
13 ffff8202`30baaa20 fffff803`4a610ef8 nt!NtDeviceIoControlFile+0x56
14 ffff8202`30baaa90 00007ffa`f122d0c4 nt!KiSystemServiceCopyEnd+0x28
15 000000fb`a09fcee8 00000000`00000000 0x00007ffa`f122d0c4
OK, so it grabbed vfs_busy(), then goes and waits for ZFS_TEARDOWN_ENTER_WRITE(zfsvfs, FTAG);
yeah that's not going to work
So we need to remove the vfs_busy() calls from unmount. Rats, will need to rethink this.
yeah we should remove the vfs_busy() stuff from the 3 acquirelazy/read/fastio. if there are still issues there, they need other options
side note to self: apparently "System Volume Information" directory is tagged HIDDEN - add logic to dirlist to skip HIDDEN attributes.
Nope, chatgpt says to always return everything, Explorer will skip over HIDDEN, which checks out.
With cab2a207eccf2666e77c981de345f8bcf3b3125c I get a XΛ#øÿÿ: mutex not m_initialised
# Child-SP RetAddr Call Site
00 ffff870e`668845f0 fffff802`23497fec OpenZFS!panic+0x3c [H:\dev\openzfs\module\os\windows\spl\spl-debug.c @ 32]
01 ffff870e`66884630 fffff802`23498ec8 OpenZFS!spl_mutex_enter+0x3c [H:\dev\openzfs\module\os\windows\spl\spl-mutex.c @ 119]
02 ffff870e`668846a0 fffff802`23534e7d OpenZFS!spl_cv_wait+0xf8 [H:\dev\openzfs\module\os\windows\spl\spl-condvar.c @ 127]
03 ffff870e`66884730 fffff802`23534d1f OpenZFS!rrw_enter_read_impl+0x14d [H:\dev\openzfs\module\zfs\rrwlock.c @ 178]
04 ffff870e`66884780 fffff802`23535709 OpenZFS!rrw_enter_read+0x1f [H:\dev\openzfs\module\zfs\rrwlock.c @ 198]
05 ffff870e`668847c0 fffff802`2377d2e4 OpenZFS!rrm_enter_read+0x49 [H:\dev\openzfs\module\zfs\rrwlock.c @ 364]
06 ffff870e`66884800 fffff802`2377d1be OpenZFS!zfs_enter+0x24 [H:\dev\openzfs\include\os\windows\zfs\sys\zfs_znode_impl.h @ 149]
07 ffff870e`66884840 fffff802`184579e0 OpenZFS!zfs_AcquireForLazyWrite+0x13e [H:\dev\openzfs\module\os\windows\zfs\zfs_vnops_windows.c @ 140]
08 ffff870e`668848c0 fffff802`185057b1 nt!CcWriteBehindInternal+0x130
09 ffff870e`668849a0 fffff802`18502fe1 nt!CcWriteBehind+0x91
0a ffff870e`66884a90 fffff802`18450545 nt!CcCachemapUninitWorkerThread+0xf1
0b ffff870e`66884b70 fffff802`1850e6f5 nt!ExpWorkerThread+0x105
0c ffff870e`66884c10 fffff802`18606278 nt!PspSystemThreadStartup+0x55
0d ffff870e`66884c60 00000000`00000000 nt!KiStartSystemThread+0x28
This is here in zfs_AcquireForLazyWrite:
if (zfsvfs->z_unmounted ||
    zfs_enter(zfsvfs, FTAG) != 0) {
When I check, zfsvfs->z_unmounted is already TRUE, so it was set after the mutex has been destroyed? I suspect this is also what crashed my GitHub actions runner in tests.py earlier even though it ran completely stable before.
zfs_freevfs and zfsvfs_free were called just before the crash. I think in zfs_vfs_unmount we need to set zfsvfs->z_unmounted = B_TRUE; before the zfs_freevfs at the end.
What is also curious is that in zfs_AcquireForLazyWrite zfsvfs is not NULL, but when checking zmo->fsprivate it is already zeroed, so the zeroing must have happened in between line 131 and 141.
Even if we moved zfsvfs->z_unmounted = B_TRUE in zfs_vfs_unmount we'd still potentially be accessing freed memory.
Another curious thing I found is:
FFFFA162A455D080: dprintf: zfs_vfsops.c:2062:zfs_freevfs(): +freevfs
FFFFA162A455D080: dprintf: zfs_vfsops.c:879:zfsvfs_free(): +zfsvfs_free
yet, there are no corresponding -freevfs / -zfsvfs_free, so it must have happened in the middle of freeing those.
It must have happened when zfsvfs_free was still before the Unloading hardlink AVLtree print.
To summarize:
zfs_AcquireForLazyWrite gets called with vfs_fsprivate(zmo) != NULL.
The unmount thread sets fsprivate=NULL and frees the mutex.
zfs_AcquireForLazyWrite continues execution but crashes because of the freed mutex.
The unmount thread is in zfsvfs_free just after ZFS_TEARDOWN_DESTROY(zfsvfs);
yeah it needs a solution, even a tryenter would work.
Ah ok, I should have read the XNU sources better: https://github.com/apple/darwin-xnu/blob/2ff845c2e033bd0ff64b5b6aa6063a1f8f65aa32/bsd/vfs/vfs_subr.c#L973
We should use vfs_busy() as a rwlock instead, and a call to vfs_busy() gets a shared lock only. We should also have LK_NOWAIT flag, and the LazyWrite/ReadAhead/fastio_modwrite should use vfs_busy(..., LK_NOWAIT).
Then in the unmount code, after the getzfsvfs() which gets a shared lock, we should upgrade to exclusive.
I have double turnover today, so I might not be able to get that code done until tomorrow.
But how will that help us if the zfsvfs is potentially already freed? There is a good chance that whatever we do works but then breaks if the memory usage pattern changes. Do we need a lock in mount?
I created #303 to continue discussion of this and leave this issue for the memory leak issue.
With the latest changes (c8dbc2546cb619097df33ad5ad3a1ff5d18c9577) I still get the hang at unmount, same stack trace as in https://github.com/openzfsonwindows/openzfs/issues/283#issuecomment-1770383934 other than the line number in zfs_vnops_windows_mount.c:
0: kd> dt OpenZFS!vfs_main_lock
+0x000 rw_lock : _ERESOURCE
+0x068 rw_owner : 0xffffaae4`87474080 Void
+0x070 rw_readers : 0n0
+0x074 rw_pad : 0n305419896
0: kd> .thread ffffaae487474080
Implicit thread is now ffffaae4`87474080
0: kd> k
*** Stack trace for last set context - .thread/.cxr resets it
# Child-SP RetAddr Call Site
00 fffff282`db2ad460 fffff802`2441bca0 nt!KiSwapContext+0x76
01 fffff282`db2ad5a0 fffff802`2441b1cf nt!KiSwapThread+0x500
02 fffff282`db2ad650 fffff802`244f7eee nt!KiCommitThreadWait+0x14f
03 fffff282`db2ad6f0 fffff802`2ee58eba nt!KeWaitForMultipleObjects+0x2be
04 fffff282`db2ad800 fffff802`2eef521d OpenZFS!spl_cv_wait+0xea [H:\dev\openzfs\module\os\windows\spl\spl-condvar.c @ 120]
05 fffff282`db2ad890 fffff802`2eef5915 OpenZFS!rrw_enter_write+0xed [H:\dev\openzfs\module\zfs\rrwlock.c @ 219]
06 fffff282`db2ad8d0 fffff802`2eef587b OpenZFS!rrm_enter_write+0x35 [H:\dev\openzfs\module\zfs\rrwlock.c @ 371]
07 fffff282`db2ad910 fffff802`2f170ecd OpenZFS!rrm_enter+0x3b [H:\dev\openzfs\module\zfs\rrwlock.c @ 348]
08 fffff282`db2ad950 fffff802`2f170ce9 OpenZFS!zfsvfs_teardown+0xcd [H:\dev\openzfs\module\os\windows\zfs\zfs_vfsops.c @ 1458]
09 fffff282`db2ad9b0 fffff802`2f1b88c1 OpenZFS!zfs_vfs_unmount+0x209 [H:\dev\openzfs\module\os\windows\zfs\zfs_vfsops.c @ 1653]
0a fffff282`db2adb40 fffff802`2f16c745 OpenZFS!zfs_windows_unmount+0x481 [H:\dev\openzfs\module\os\windows\zfs\zfs_vnops_windows_mount.c @ 1595]
0b fffff282`db2ae430 fffff802`2ee73aa6 OpenZFS!zfs_ioc_unmount+0x55 [H:\dev\openzfs\module\os\windows\zfs\zfs_ioctl_os.c @ 916]
0c fffff282`db2ae470 fffff802`2f16c575 OpenZFS!zfsdev_ioctl_common+0x816 [H:\dev\openzfs\module\zfs\zfs_ioctl.c @ 7866]
0d fffff282`db2ae550 fffff802`2f14f23d OpenZFS!zfsdev_ioctl+0x2c5 [H:\dev\openzfs\module\os\windows\zfs\zfs_ioctl_os.c @ 866]
0e fffff282`db2ae640 fffff802`2f14eb46 OpenZFS!ioctlDispatcher+0x32d [H:\dev\openzfs\module\os\windows\zfs\zfs_vnops_windows.c @ 6409]
0f fffff282`db2ae710 fffff802`24410665 OpenZFS!dispatcher+0x1e6 [H:\dev\openzfs\module\os\windows\zfs\zfs_vnops_windows.c @ 7321]
10 fffff282`db2ae800 fffff802`2480142c nt!IofCallDriver+0x55
11 fffff282`db2ae840 fffff802`24801081 nt!IopSynchronousServiceTail+0x34c
12 fffff282`db2ae8e0 fffff802`248003f6 nt!IopXxxControlFile+0xc71
13 fffff282`db2aea20 fffff802`24610ef8 nt!NtDeviceIoControlFile+0x56
14 fffff282`db2aea90 00007ffa`01f8d0c4 nt!KiSystemServiceCopyEnd+0x28
15 000000c8`600fc868 00000000`00000000 0x00007ffa`01f8d0c4
I noticed now it happens even after running just rclone for a very short time on a simple mount with no other tests running, so I can easily reproduce this now.
I'll try to get a reproducer for this.
Also, if you need any more info from this crash, I haven't restarted the machine yet.
Hmmm so what is actually happening. Clearly the thread you pasted is waiting for the WRITER lock in teardown, which should be well inside the WRITER lock of zfs_windows_unmount.
So what thread is holding rrw_enter_write+0xed [H:\dev\openzfs\module\zfs\rrwlock.c @ 219] ?
rrm_enter(&(zfsvfs)->z_teardown_lock, RW_WRITER, tag)
I don't think I understand what you mean. The rrl->rr_lock in rrw_enter_write?
5: kd> dt rrl
Local var @ 0xfffff282db2ad8c0 Type rrwlock*
0xffffa583`e2d667b8
+0x000 rr_lock : kmutex
+0x030 rr_cv : cv
+0x070 rr_writer : (null)
+0x078 rr_anon_rcount : refcount
+0x0f8 rr_linked_rcount : refcount
+0x178 rr_writer_wanted : 1 ( B_TRUE )
+0x17c rr_track_all : 0 ( B_FALSE )
5: kd> dt kmutex poi(rrl)
OpenZFS!kmutex
+0x000 m_lock : mutex_t
+0x018 m_owner : (null)
+0x020 m_destroy_lock : 0
+0x028 m_initialised : 0x23456789
5: kd> dt -b rrl
Local var @ 0xfffff282db2ad8c0 Type rrwlock*
0xffffa583`e2d667b8
+0x000 rr_lock : kmutex
+0x000 m_lock : mutex_t
+0x000 opaque : _KEVENT
+0x000 Header : _DISPATCHER_HEADER
+0x000 Lock : 0n393217
+0x000 LockNV : 0n393217
+0x000 Type : 0x1 ''
+0x001 Signalling : 0 ''
+0x002 Size : 0x6 ''
+0x003 Reserved1 : 0 ''
+0x000 TimerType : 0x1 ''
+0x001 TimerControlFlags : 0 ''
+0x001 Absolute : 0y0
+0x001 Wake : 0y0
+0x001 EncodedTolerableDelay : 0y000000 (0)
+0x002 Hand : 0x6 ''
+0x003 TimerMiscFlags : 0 ''
+0x003 Index : 0y000000 (0)
+0x003 Inserted : 0y0
+0x003 Expired : 0y0
+0x000 Timer2Type : 0x1 ''
+0x001 Timer2Flags : 0 ''
+0x001 Timer2Inserted : 0y0
+0x001 Timer2Expiring : 0y0
+0x001 Timer2CancelPending : 0y0
+0x001 Timer2SetPending : 0y0
+0x001 Timer2Running : 0y0
+0x001 Timer2Disabled : 0y0
+0x001 Timer2ReservedFlags : 0y00
+0x002 Timer2ComponentId : 0x6 ''
+0x003 Timer2RelativeId : 0 ''
+0x000 QueueType : 0x1 ''
+0x001 QueueControlFlags : 0 ''
+0x001 Abandoned : 0y0
+0x001 DisableIncrement : 0y0
+0x001 QueueReservedControlFlags : 0y000000 (0)
+0x002 QueueSize : 0x6 ''
+0x003 QueueReserved : 0 ''
+0x000 ThreadType : 0x1 ''
+0x001 ThreadReserved : 0 ''
+0x002 ThreadControlFlags : 0x6 ''
+0x002 CycleProfiling : 0y0
+0x002 CounterProfiling : 0y1
+0x002 GroupScheduling : 0y1
+0x002 AffinitySet : 0y0
+0x002 Tagged : 0y0
+0x002 EnergyProfiling : 0y0
+0x002 SchedulerAssist : 0y0
+0x002 ThreadReservedControlFlags : 0y0
+0x003 DebugActive : 0 ''
+0x003 ActiveDR7 : 0y0
+0x003 Instrumented : 0y0
+0x003 Minimal : 0y0
+0x003 Reserved4 : 0y00
+0x003 AltSyscall : 0y0
+0x003 Emulation : 0y0
+0x003 Reserved5 : 0y0
+0x000 MutantType : 0x1 ''
+0x001 MutantSize : 0 ''
+0x002 DpcActive : 0x6 ''
+0x003 MutantReserved : 0 ''
+0x004 SignalState : 0n1
+0x008 WaitListHead : _LIST_ENTRY [ 0xffffa583`e2d667c0 - 0xffffa583`e2d667c0 ]
+0x000 Flink : 0xffffa583`e2d667c0
+0x008 Blink : 0xffffa583`e2d667c0
+0x018 m_owner : (null)
+0x020 m_destroy_lock : 0
+0x028 m_initialised : 0x23456789
+0x030 rr_cv : cv
+0x000 cv_kevent :
[00] _KEVENT
+0x000 Header : _DISPATCHER_HEADER
+0x000 Lock : 0n393217
+0x000 LockNV : 0n393217
+0x000 Type : 0x1 ''
+0x001 Signalling : 0 ''
+0x002 Size : 0x6 ''
+0x003 Reserved1 : 0 ''
+0x000 TimerType : 0x1 ''
+0x001 TimerControlFlags : 0 ''
+0x001 Absolute : 0y0
+0x001 Wake : 0y0
+0x001 EncodedTolerableDelay : 0y000000 (0)
+0x002 Hand : 0x6 ''
+0x003 TimerMiscFlags : 0 ''
+0x003 Index : 0y000000 (0)
+0x003 Inserted : 0y0
+0x003 Expired : 0y0
+0x000 Timer2Type : 0x1 ''
+0x001 Timer2Flags : 0 ''
+0x001 Timer2Inserted : 0y0
+0x001 Timer2Expiring : 0y0
+0x001 Timer2CancelPending : 0y0
+0x001 Timer2SetPending : 0y0
+0x001 Timer2Running : 0y0
+0x001 Timer2Disabled : 0y0
+0x001 Timer2ReservedFlags : 0y00
+0x002 Timer2ComponentId : 0x6 ''
+0x003 Timer2RelativeId : 0 ''
+0x000 QueueType : 0x1 ''
+0x001 QueueControlFlags : 0 ''
+0x001 Abandoned : 0y0
+0x001 DisableIncrement : 0y0
+0x001 QueueReservedControlFlags : 0y000000 (0)
+0x002 QueueSize : 0x6 ''
+0x003 QueueReserved : 0 ''
+0x000 ThreadType : 0x1 ''
+0x001 ThreadReserved : 0 ''
+0x002 ThreadControlFlags : 0x6 ''
+0x002 CycleProfiling : 0y0
+0x002 CounterProfiling : 0y1
+0x002 GroupScheduling : 0y1
+0x002 AffinitySet : 0y0
+0x002 Tagged : 0y0
+0x002 EnergyProfiling : 0y0
+0x002 SchedulerAssist : 0y0
+0x002 ThreadReservedControlFlags : 0y0
+0x003 DebugActive : 0 ''
+0x003 ActiveDR7 : 0y0
+0x003 Instrumented : 0y0
+0x003 Minimal : 0y0
+0x003 Reserved4 : 0y00
+0x003 AltSyscall : 0y0
+0x003 Emulation : 0y0
+0x003 Reserved5 : 0y0
+0x000 MutantType : 0x1 ''
+0x001 MutantSize : 0 ''
+0x002 DpcActive : 0x6 ''
+0x003 MutantReserved : 0 ''
+0x004 SignalState : 0n0
+0x008 WaitListHead : _LIST_ENTRY [ 0xffffaae4`874741c0 - 0xffffaae4`874741c0 ]
+0x000 Flink : 0xffffaae4`874741c0
+0x008 Blink : 0xffffaae4`874741c0
[01]
+0x000 Header : _DISPATCHER_HEADER
+0x000 Lock : 0n393216
+0x000 LockNV : 0n393216
+0x000 Type : 0 ''
+0x001 Signalling : 0 ''
+0x002 Size : 0x6 ''
+0x003 Reserved1 : 0 ''
+0x000 TimerType : 0 ''
+0x001 TimerControlFlags : 0 ''
+0x001 Absolute : 0y0
+0x001 Wake : 0y0
+0x001 EncodedTolerableDelay : 0y000000 (0)
+0x002 Hand : 0x6 ''
+0x003 TimerMiscFlags : 0 ''
+0x003 Index : 0y000000 (0)
+0x003 Inserted : 0y0
+0x003 Expired : 0y0
+0x000 Timer2Type : 0 ''
+0x001 Timer2Flags : 0 ''
+0x001 Timer2Inserted : 0y0
+0x001 Timer2Expiring : 0y0
+0x001 Timer2CancelPending : 0y0
+0x001 Timer2SetPending : 0y0
+0x001 Timer2Running : 0y0
+0x001 Timer2Disabled : 0y0
+0x001 Timer2ReservedFlags : 0y00
+0x002 Timer2ComponentId : 0x6 ''
+0x003 Timer2RelativeId : 0 ''
+0x000 QueueType : 0 ''
+0x001 QueueControlFlags : 0 ''
+0x001 Abandoned : 0y0
+0x001 DisableIncrement : 0y0
+0x001 QueueReservedControlFlags : 0y000000 (0)
+0x002 QueueSize : 0x6 ''
+0x003 QueueReserved : 0 ''
+0x000 ThreadType : 0 ''
+0x001 ThreadReserved : 0 ''
+0x002 ThreadControlFlags : 0x6 ''
+0x002 CycleProfiling : 0y0
+0x002 CounterProfiling : 0y1
+0x002 GroupScheduling : 0y1
+0x002 AffinitySet : 0y0
+0x002 Tagged : 0y0
+0x002 EnergyProfiling : 0y0
+0x002 SchedulerAssist : 0y0
+0x002 ThreadReservedControlFlags : 0y0
+0x003 DebugActive : 0 ''
+0x003 ActiveDR7 : 0y0
+0x003 Instrumented : 0y0
+0x003 Minimal : 0y0
+0x003 Reserved4 : 0y00
+0x003 AltSyscall : 0y0
+0x003 Emulation : 0y0
+0x003 Reserved5 : 0y0
+0x000 MutantType : 0 ''
+0x001 MutantSize : 0 ''
+0x002 DpcActive : 0x6 ''
+0x003 MutantReserved : 0 ''
+0x004 SignalState : 0n0
+0x008 WaitListHead : _LIST_ENTRY [ 0xffffaae4`874741f0 - 0xffffaae4`874741f0 ]
+0x000 Flink : 0xffffaae4`874741f0
+0x008 Blink : 0xffffaae4`874741f0
+0x030 cv_waiters_count_lock : 0
+0x038 cv_waiters_count : 1
+0x03c cv_initialised : 0x12345678
+0x070 rr_writer : (null)
+0x078 rr_anon_rcount : refcount
+0x000 rc_count : 0x12
+0x008 rc_mtx : kmutex
+0x000 m_lock : mutex_t
+0x000 opaque : _KEVENT
+0x000 Header : _DISPATCHER_HEADER
+0x000 Lock : 0n393217
+0x000 LockNV : 0n393217
+0x000 Type : 0x1 ''
+0x001 Signalling : 0 ''
+0x002 Size : 0x6 ''
+0x003 Reserved1 : 0 ''
+0x000 TimerType : 0x1 ''
+0x001 TimerControlFlags : 0 ''
+0x001 Absolute : 0y0
+0x001 Wake : 0y0
+0x001 EncodedTolerableDelay : 0y000000 (0)
+0x002 Hand : 0x6 ''
+0x003 TimerMiscFlags : 0 ''
+0x003 Index : 0y000000 (0)
+0x003 Inserted : 0y0
+0x003 Expired : 0y0
+0x000 Timer2Type : 0x1 ''
+0x001 Timer2Flags : 0 ''
+0x001 Timer2Inserted : 0y0
+0x001 Timer2Expiring : 0y0
+0x001 Timer2CancelPending : 0y0
+0x001 Timer2SetPending : 0y0
+0x001 Timer2Running : 0y0
+0x001 Timer2Disabled : 0y0
+0x001 Timer2ReservedFlags : 0y00
+0x002 Timer2ComponentId : 0x6 ''
+0x003 Timer2RelativeId : 0 ''
+0x000 QueueType : 0x1 ''
+0x001 QueueControlFlags : 0 ''
+0x001 Abandoned : 0y0
+0x001 DisableIncrement : 0y0
+0x001 QueueReservedControlFlags : 0y000000 (0)
+0x002 QueueSize : 0x6 ''
+0x003 QueueReserved : 0 ''
+0x000 ThreadType : 0x1 ''
+0x001 ThreadReserved : 0 ''
+0x002 ThreadControlFlags : 0x6 ''
+0x002 CycleProfiling : 0y0
+0x002 CounterProfiling : 0y1
+0x002 GroupScheduling : 0y1
+0x002 AffinitySet : 0y0
+0x002 Tagged : 0y0
+0x002 EnergyProfiling : 0y0
+0x002 SchedulerAssist : 0y0
+0x002 ThreadReservedControlFlags : 0y0
+0x003 DebugActive : 0 ''
+0x003 ActiveDR7 : 0y0
+0x003 Instrumented : 0y0
+0x003 Minimal : 0y0
+0x003 Reserved4 : 0y00
+0x003 AltSyscall : 0y0
+0x003 Emulation : 0y0
+0x003 Reserved5 : 0y0
+0x000 MutantType : 0x1 ''
+0x001 MutantSize : 0 ''
+0x002 DpcActive : 0x6 ''
+0x003 MutantReserved : 0 ''
+0x004 SignalState : 0n1
+0x008 WaitListHead : _LIST_ENTRY [ 0xffffa583`e2d66840 - 0xffffa583`e2d66840 ]
+0x000 Flink : 0xffffa583`e2d66840
+0x008 Blink : 0xffffa583`e2d66840
+0x018 m_owner : (null)
+0x020 m_destroy_lock : 0
+0x028 m_initialised : 0x23456789
+0x038 rc_tree : avl_tree
+0x000 avl_root : (null)
+0x008 avl_compar : 0xfffff802`2efa36a0
+0x010 avl_offset : 0
+0x014 avl_numnodes : 0
+0x018 avl_size : 0
+0x058 rc_removed : list
+0x000 list_size : 0x30
+0x008 list_offset : 0
+0x010 list_head : list_node
+0x000 list_next : 0xffffa583`e2d66898
+0x008 list_prev : 0xffffa583`e2d66898
+0x078 rc_removed_count : 0
+0x07c rc_tracked : 0 ( B_FALSE )
+0x0f8 rr_linked_rcount : refcount
+0x000 rc_count : 0
+0x008 rc_mtx : kmutex
+0x000 m_lock : mutex_t
+0x000 opaque : _KEVENT
+0x000 Header : _DISPATCHER_HEADER
+0x000 Lock : 0n393217
+0x000 LockNV : 0n393217
+0x000 Type : 0x1 ''
+0x001 Signalling : 0 ''
+0x002 Size : 0x6 ''
+0x003 Reserved1 : 0 ''
+0x000 TimerType : 0x1 ''
+0x001 TimerControlFlags : 0 ''
+0x001 Absolute : 0y0
+0x001 Wake : 0y0
+0x001 EncodedTolerableDelay : 0y000000 (0)
+0x002 Hand : 0x6 ''
+0x003 TimerMiscFlags : 0 ''
+0x003 Index : 0y000000 (0)
+0x003 Inserted : 0y0
+0x003 Expired : 0y0
+0x000 Timer2Type : 0x1 ''
+0x001 Timer2Flags : 0 ''
+0x001 Timer2Inserted : 0y0
+0x001 Timer2Expiring : 0y0
+0x001 Timer2CancelPending : 0y0
+0x001 Timer2SetPending : 0y0
+0x001 Timer2Running : 0y0
+0x001 Timer2Disabled : 0y0
+0x001 Timer2ReservedFlags : 0y00
+0x002 Timer2ComponentId : 0x6 ''
+0x003 Timer2RelativeId : 0 ''
+0x000 QueueType : 0x1 ''
+0x001 QueueControlFlags : 0 ''
+0x001 Abandoned : 0y0
+0x001 DisableIncrement : 0y0
+0x001 QueueReservedControlFlags : 0y000000 (0)
+0x002 QueueSize : 0x6 ''
+0x003 QueueReserved : 0 ''
+0x000 ThreadType : 0x1 ''
+0x001 ThreadReserved : 0 ''
+0x002 ThreadControlFlags : 0x6 ''
+0x002 CycleProfiling : 0y0
+0x002 CounterProfiling : 0y1
+0x002 GroupScheduling : 0y1
+0x002 AffinitySet : 0y0
+0x002 Tagged : 0y0
+0x002 EnergyProfiling : 0y0
+0x002 SchedulerAssist : 0y0
+0x002 ThreadReservedControlFlags : 0y0
+0x003 DebugActive : 0 ''
+0x003 ActiveDR7 : 0y0
+0x003 Instrumented : 0y0
+0x003 Minimal : 0y0
+0x003 Reserved4 : 0y00
+0x003 AltSyscall : 0y0
+0x003 Emulation : 0y0
+0x003 Reserved5 : 0y0
+0x000 MutantType : 0x1 ''
+0x001 MutantSize : 0 ''
+0x002 DpcActive : 0x6 ''
+0x003 MutantReserved : 0 ''
+0x004 SignalState : 0n1
+0x008 WaitListHead : _LIST_ENTRY [ 0xffffa583`e2d668c0 - 0xffffa583`e2d668c0 ]
+0x000 Flink : 0xffffa583`e2d668c0
+0x008 Blink : 0xffffa583`e2d668c0
+0x018 m_owner : (null)
+0x020 m_destroy_lock : 0
+0x028 m_initialised : 0x23456789
+0x038 rc_tree : avl_tree
+0x000 avl_root : (null)
+0x008 avl_compar : 0xfffff802`2efa36a0
+0x010 avl_offset : 0
+0x014 avl_numnodes : 0
+0x018 avl_size : 0
+0x058 rc_removed : list
+0x000 list_size : 0x30
+0x008 list_offset : 0
+0x010 list_head : list_node
+0x000 list_next : 0xffffa583`e2d66918
+0x008 list_prev : 0xffffa583`e2d66918
+0x078 rc_removed_count : 0
+0x07c rc_tracked : 0 ( B_FALSE )
+0x178 rr_writer_wanted : 1 ( B_TRUE )
+0x17c rr_track_all : 0 ( B_FALSE )
When I go through the rrl parameter passed to rrm_enter and look at all 17 locks in the rrmlock, the rr_writer of all except two is ffffaae487474080; the other two are 0.
go up until you have easy access to zfsvfs and take a look at (zfsvfs)->z_teardown_lock - should have an owner field
ah so sorry, it's called rrl->rr_writer = curthread;
Ah ok which is probably what you dumped, so we have writer_wanted, but it's waiting for the readers to drain.
rrl shows as a struct rrwlock [17] in the debugger and the rr_writer of almost all of them are ffffaae487474080, same as the unmount thread, except two that are 0.
wait appears to be
while (zfs_refcount_count(&rrl->rr_anon_rcount) > 0 ||
    zfs_refcount_count(&rrl->rr_linked_rcount) > 0 ||
    rrl->rr_writer != NULL) {
	rrl->rr_writer_wanted = B_TRUE;
	cv_wait(&rrl->rr_cv, &rrl->rr_lock);
}
rr_anon_rcount seems to be 0x12, and rr_linked_rcount is 0. So we are looking at rr_anon_rcount held.
and that is waiting for cv_wait(&rrl->rr_cv, &rrl->rr_lock); to wake it up, and looking at mutex rr_lock
+0x018 m_owner : (null)
+0x020 m_destroy_lock : 0
+0x028 m_initialised : 0x23456789
it has been destroyed. ah that is less than ideal
Complicates it that it has RRM_NUM_LOCKS (17) locks in an array, and it picks the lock to use with
RRM_TD_LOCK() (((uint32_t)(uintptr_t)(curthread)) % RRM_NUM_LOCKS)
So, thread address ffffaae487474080 % 17, is
But you are dumping rrl above, so it should already be [14] - so the rr_lock really is destroyed?
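For anyone who wants to check the slot arithmetic outside the debugger, here is a minimal sketch, assuming the macro is exactly as quoted above. The demo_* names are made up for illustration. With the uint32_t cast applied first, ffffaae487474080 lands in slot 7; taking the full 64-bit address modulo 17 gives 14.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_RRM_NUM_LOCKS 17

/* Slot as the quoted macro computes it: address truncated to 32 bits first. */
static unsigned
demo_td_lock(uint64_t thread)
{
	return ((uint32_t)thread % DEMO_RRM_NUM_LOCKS);
}

/* Slot if the full 64-bit address were used, with no truncation. */
static unsigned
demo_td_lock_full(uint64_t thread)
{
	return ((unsigned)(thread % DEMO_RRM_NUM_LOCKS));
}
```

So which array element to look at depends on whether the truncation is taken into account when doing the modulo by hand.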
RRM_TD_LOCK
I don't think I can follow, I don't see any RRM_TD_LOCK used in the functions in the call stack.
But you are dumping rrl above, so it should already be [14] - so the rr_lock really is destroyed?
What shows it as destroyed, isn't 0x23456789 == MUTEX_INITIALISED?
Ah so it is, phew that makes more sense.
Ok then it seems perhaps we leak some zfs_enter() with missing zfs_exit() calls. Could check into rc_tracked, how that works, and see what reader might be leaked - I have never tried it tho. I'll glance over the zfs_enter() calls and see if any obvious misses exist.
oh you know what, this uses thread%17, and we use zfs_enter() and zfs_exit() in AcquireLazy and ReleaseLazy - they are not always called by the same thread. I'll have to change this around a bit
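A tiny model of why that matters, as a sketch only: the structure and all demo_* names below are hypothetical, and the plain counters merely stand in for the per-slot reader refcounts. If the enter happens on one thread and the exit recomputes the slot from a different thread, one slot is left above zero (a writer waiting on it would never see it drain) and another slot goes negative.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NUM_LOCKS 17

/* Hypothetical per-slot reader counts; not the real rrmlock layout. */
struct demo_rrm {
	int64_t readers[DEMO_NUM_LOCKS];
};

/* Slot selection by thread address, as in the thread%17 scheme. */
static unsigned
demo_slot(uint64_t thread)
{
	return ((uint32_t)thread % DEMO_NUM_LOCKS);
}

/* Enter as reader on the calling thread's slot; returns the slot used. */
static unsigned
demo_enter_read(struct demo_rrm *l, uint64_t thread)
{
	unsigned slot = demo_slot(thread);

	l->readers[slot]++;
	return (slot);
}

/* Exit by recomputing the slot from the exiting thread (the problem case). */
static void
demo_exit_by_thread(struct demo_rrm *l, uint64_t thread)
{
	l->readers[demo_slot(thread)]--;
}

/* Exit using the slot remembered at enter time (one possible workaround). */
static void
demo_exit_by_slot(struct demo_rrm *l, unsigned slot)
{
	l->readers[slot]--;
}
```

Remembering the slot at enter time is just one way to keep the counts balanced; taking and dropping the lock within the same callback, or using a lock that is not keyed on the thread, would be alternatives.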
Using NET for debugging
Opened WinSock 2.0
Waiting to reconnect...
Connected to target 192.168.109.130 on port 50151 on local IP 192.168.109.1.
You can get the target MAC address by running .kdtargetmac command.
Connected to Windows 10 19041 x64 target at (Tue Oct 24 08:26:55.733 2023 (UTC + 7:00)), ptr64 TRUE
Kernel Debugger connection established.
Symbol search path is: srv*
Executable search path is:
Windows 10 Kernel Version 19041 MP (24 procs) Free x64
Product: WinNt, suite: TerminalServer SingleUserTS
Edition build lab: 19041.1.amd64fre.vb_release.191206-1406
Machine Name:
Kernel base = 0xfffff803`4ea00000 PsLoadedModuleList = 0xfffff803`4f62a360
Debug session time: Tue Oct 24 08:25:26.192 2023 (UTC + 7:00)
System Uptime: 0 days 0:00:45.926
Break instruction exception - code 80000003 (first chance)
fffff803`5c813da8 cc int 3
7: kd> !analyze -v
Connected to Windows 10 19041 x64 target at (Tue Oct 24 08:27:04.278 2023 (UTC + 7:00)), ptr64 TRUE
Loading Kernel Symbols
.............................................................A timeout occurred. The timeout can be increased in the Debugging options page
..
.................................
Press ctrl-c (cdb, kd, ntsd) or ctrl-break (windbg) to abort symbol loads that take too long.
Run !sym noisy before .reload to track down problems loading symbols.
...............................
...........................................................
Loading User Symbols
Loading unloaded module list
......
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
Unknown bugcheck code (0)
Unknown bugcheck description
Arguments:
Arg1: 0000000000000000
Arg2: 0000000000000000
Arg3: 0000000000000000
Arg4: 0000000000000000
Debugging Details:
------------------
KEY_VALUES_STRING: 1
Key : Analysis.CPU.mSec
Value: 3530
Key : Analysis.DebugAnalysisManager
Value: Create
Key : Analysis.Elapsed.mSec
Value: 263033
Key : Analysis.Init.CPU.mSec
Value: 342
Key : Analysis.Init.Elapsed.mSec
Value: 19571
Key : Analysis.Memory.CommitPeak.Mb
Value: 86
Key : WER.OS.Branch
Value: vb_release
Key : WER.OS.Timestamp
Value: 2019-12-06T14:06:00Z
Key : WER.OS.Version
Value: 10.0.19041.1
BUGCHECK_CODE: 0
BUGCHECK_P1: 0
BUGCHECK_P2: 0
BUGCHECK_P3: 0
BUGCHECK_P4: 0
PROCESS_NAME: System
ERROR_CODE: (NTSTATUS) 0x80000003 - {EXCEPTION} Breakpoint A breakpoint has been reached.
EXCEPTION_CODE_STR: 80000003
EXCEPTION_PARAMETER1: 0000000000000000
STACK_TEXT:
ffffb980`68bbf1a0 fffff803`5c814052 : ffff8388`67752312 fffff803`5c813a23 00000000`00000000 00000000`00000000 : OpenZFS!zfs_refcount_remove_many+0xc8 [H:\dev\openzfs\module\zfs\refcount.c @ 176]
ffffb980`68bbf270 fffff803`5c7653c8 : ffffb980`68bbf460 00000000`00000000 00000000`00000000 fffff803`5c6c8223 : OpenZFS!zfs_refcount_remove+0x22 [H:\dev\openzfs\module\zfs\refcount.c @ 212]
ffffb980`68bbf2b0 fffff803`5c7659da : 00000000`00000000 ffff860f`afce0040 ffffb980`68bbf460 00000000`00000002 : OpenZFS!rrw_exit+0x118 [H:\dev\openzfs\module\zfs\rrwlock.c @ 264]
ffffb980`68bbf2f0 fffff803`5c9ad594 : ffff860f`c4040470 fffff803`5c6dd86b 00000000`00000000 ffff860f`c4040560 : OpenZFS!rrm_exit+0xaa [H:\dev\openzfs\module\zfs\rrwlock.c @ 386]
ffffb980`68bbf340 fffff803`5c9c4b0d : fffff803`5df8fb50 00000000`00000000 00000000`00000000 fffff803`4ec64dd4 : OpenZFS!zfs_exit+0x24 [H:\dev\openzfs\include\os\windows\zfs\sys\zfs_znode_impl.h @ 161]
ffffb980`68bbf380 fffff803`4ecb2798 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : OpenZFS!fastio_release_for_mod_write+0x14d [H:\dev\openzfs\module\os\windows\zfs\zfs_vnops_windows.c @ 8041]
ffffb980`68bbf410 fffff803`4ecb2eac : ffff90b3`ada4e270 00000000`00000000 00000000`00000000 00000000`00000000 : nt!FsRtlReleaseFileForModWrite+0x160
ffffb980`68bbf6f0 fffff803`4edb790b : fffff803`4f650d40 00000000`00000001 ffff860f`f79f1bc0 00000000`00000000 : nt!MiGatherMappedPages+0x2e8
ffffb980`68bbf7b0 fffff803`4ed0e6f5 : ffff860f`afce0040 ffff860f`afce0040 00000000`00000080 000f8067`b8bbbdff : nt!MiMappedPageWriter+0x18b
ffffb980`68bbfc10 fffff803`4ee06278 : ffff9780`dd000180 ffff860f`afce0040 fffff803`4ed0e6a0 00000000`00000000 : nt!PspSystemThreadStartup+0x55
ffffb980`68bbfc60 00000000`00000000 : ffffb980`68bc0000 ffffb980`68bba000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x28
FAULTING_SOURCE_LINE: H:\dev\openzfs\module\zfs\refcount.c
FAULTING_SOURCE_FILE: H:\dev\openzfs\module\zfs\refcount.c
FAULTING_SOURCE_LINE_NUMBER: 176
FAULTING_SOURCE_CODE:
172: int64_t count;
173:
174: if (likely(!rc->rc_tracked)) {
175: count = atomic_add_64_nv(&(rc)->rc_count, -number);
> 176: ASSERT3S(count, >=, 0);
177: return (count);
178: }
179:
180: s.ref_holder = holder;
181: s.ref_number = number;
SYMBOL_NAME: OpenZFS!zfs_refcount_remove_many+c8
MODULE_NAME: OpenZFS
IMAGE_NAME: OpenZFS.sys
STACK_COMMAND: .cxr; .ecxr ; kb
BUCKET_ID_FUNC_OFFSET: c8
FAILURE_BUCKET_ID: 0x0_OpenZFS!zfs_refcount_remove_many
OS_VERSION: 10.0.19041.1
BUILDLAB_STR: vb_release
OSPLATFORM_TYPE: x64
OSNAME: Windows 10
FAILURE_ID_HASH: {438d92bf-a7cc-a0ee-61eb-0ce61a562b51}
Followup: MachineOwner
---------
just saw this with b6d402c183ab042a821f9910964fe862d9f66c37
aha, confirms we have a leak, or double free. ok I will need to check through them all, starting with fastio_release_for_mod_write
Ah c2e9403 I was trying to debug while in the morning Zoom. Clearly did not go well
Looks like unmount works reliably now. I did an unload to check for leaks and there is some memory leaked but not much. I'll let the test run longer and see.
I noticed that if I create a fresh pool at the moment the issue happens at around 2.2TB transferred, memory usage just sharply rises.
Putting a conditional breakpoint on > 13GB allocated in osif_malloc somehow is not triggering despite other conditional breakpoints working :\ Setting a breakpoint in osif_malloc on allocation failure might already be too late, the copy somehow stalled before that, though I know it often does happen.
Now I have the copy stalled, memory almost full and cbuf contains only vmem_freelist_insert_sort_by_time lines and the -EB- marker. I'm dumping stacks, after that I'll see if I can get kstat output and the driver to unload.
CPU seems idle though:
Yeah kmem just stalls, and I am unsure why - it's not even taking CPU. My current plan is to take the latest kmem from macOS and move it over. We did fix a bunch of issues over there and it would be nice to be up to date.
I think this also starts happening just when the system starts swapping. Maybe this would be easier to reproduce with lower memory?
It's pretty instant on my VM with your example, on a 2G pool at that, not full. So something is quirky, but the old kmem had a lot of "slow down allocs from XNU" throttling, which we have now removed - so the throttling logic might be bugging out, and it thinks it needs to throttle forever (no signal from XNU in Windows).
This did also start somewhere around fd8bf0d2b92d18b818505413bb7dd8e75fc8decd. Does it make sense figuring out in which commit exactly?
if you have the time, it might mean a quicker fix if it's a small problem.
OK re-did the kmem/vmem files, it isn't so much changed, just the local changes to detect pressure. Gave me a chance to clean it up a bit, and now future commits from macOS kmem/vmem should apply more easily.
Worth noting it does behave in the same manner. I have confirmed the reaper runs, and it reacts to pressure ok, but the last thing I noticed is that arc_shrink was not doing the full thing. spl_free_pages has garbage values in it.
@lundman I'd be happy to test it, but I can't find the change. Did you forget to push it?
I will in a few, just checking if spl_free being 0xffffffe932843 is something obvious
When testing #281 I noticed that when copying a 5TB dataset using rclone it always ends in an allocation failure:
so ExAllocatePoolWithTag failed
memory.txt
This seems like a new issue because I was still able to copy the full dataset not that long ago.
I'll try to get some kstat information. Is it possible to get kstat info from the debugger when the failure has already happened? I could also try logging it periodically to a file.