I also just noticed from the time in the screenshot that this happened shortly after I started the copy.
With kstat, you can just run it after some use, say 2-3TB of data. Since nothing should "grow forever" inside kmem, leaks tend to stand out quite a bit.
If it's not an internal leak, it could be we aren't releasing something between us and Windows, like when closing files or whatnot.
From the screenshot it looks like it takes about 2-3 minutes for the memory usage to rise enough to cause the out-of-memory condition, so I'd have to watch it constantly to catch it. I've now made a script that logs kstat every 10 seconds so I can see what the last file is.
I ran CrystalDisk.exe and then grep inuse kstat.txt, but it's pretty inconclusive. The biggest is of course the "total allocated from windows".
From what I could find, CrystalDiskMark is a wrapper around DiskSpd:
https://github.com/ayavilevich/DiskSpdAuto
You might want to look into this if you're planning on making some sort of test.
I am not, just checking if I could make enough IO to show any leaks :)
Of course when I try to reproduce the issue it doesn't happen :\ It ran all night, although a bit slow.
OK, so this:
Source File: H:\dev\openzfs\module\os\windows\spl\spl-seg_kmem.c, line 134
isn't necessarily a problem; I have an ASSERT in there because I wanted to see when it happens. Generally, all zfs/kmem allocations come into this function, where we allocate a large chunk of memory; kmem then carves it up into magazines etc. and dishes it out internally to ZFS.
We are supposed to detect memory pressure, and if we have pressure, the kmem reaper will walk around and release memory back to Windows.
So if the pressure detection is poor, or perhaps a bit slow, then we can be too slow in reaping memory back and get NULL allocs. kmem can handle NULL allocs, so eventually the ASSERT is to be removed.
You can see the same path for macOS here:
https://github.com/openzfsonosx/openzfs-fork/blob/macOS/module/os/macos/spl/spl-seg_kmem.c#L207
That we occasionally get NULL isn't indicative of a problem on its own, unless it happens a lot or quickly. If it keeps happening, and reaping never releases enough memory, then we probably do have a leak.
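Roughly, the bottom of that funnel looks like this. This is a simplified sketch, not the verbatim spl-seg_kmem.c source; the pool tag and the exact counter name are assumptions:

// Every kmem/vmem span allocation funnels down to this point.
extern volatile uint64_t segkmem_total_mem_allocated; // "memusage" in the logs

void *
osif_malloc(uint64_t size)
{
	void *buf = ExAllocatePoolWithTag(NonPagedPoolNx, size, 'SPL0');

	if (buf == NULL) {
		// This is where the (temporary) line-134 ASSERT fires;
		// kmem is expected to cope with NULL, reap, and retry.
		return (NULL);
	}
	atomic_add_64(&segkmem_total_mem_allocated, size);
	return (buf);
}

When pressure is detected, the reaper hands whole spans back to Windows through the matching free path.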
The Windows memory pressure work is here:
You can also issue pressure events with kstat; I've not tried it on Windows, but the code hasn't changed. This is all code that needs to be tweaked to a good default once things are stable enough for it to matter :) Which I guess we are close to now.
Ok, I reproduced the condition again, and I managed to stop in the debugger just before running out of memory. The copying actually stalled completely. I'll try to get what I can out of the debugger and then restart the machine to get the kstat logs.
What makes me think this is an issue is that it is definitely new behaviour. I never encountered this before when doing a lot of copy runs; now I encounter it almost every time.
There are a bunch of stacks like this:
2a88.001b78 ffff953882193080 0000001 Blocked nt!KiSwapContext+0x76
nt!KiSwapThread+0x500
nt!KiCommitThreadWait+0x14f
nt!KeWaitForMultipleObjects+0x2be
OpenZFS!cv_timedwait_hires+0x1a0
OpenZFS!vmem_bucket_alloc+0x703
OpenZFS!vmem_xalloc+0xe51
OpenZFS!vmem_alloc_impl+0x3cf
OpenZFS!vmem_xalloc+0xe51
OpenZFS!vmem_alloc_impl+0x3cf
OpenZFS!vmem_xalloc+0xe51
OpenZFS!vmem_alloc_impl+0x3cf
OpenZFS!kmem_slab_create+0x12d
OpenZFS!kmem_slab_alloc+0xae
OpenZFS!kmem_cache_alloc+0x437
OpenZFS!vmem_alloc_impl+0x10b
OpenZFS!vmem_xalloc+0xe51
OpenZFS!vmem_alloc_impl+0x3cf
OpenZFS!kmem_slab_create+0x12d
OpenZFS!kmem_slab_alloc+0xae
OpenZFS!kmem_cache_alloc+0x437
OpenZFS!zio_data_buf_alloc+0xa6
OpenZFS!zil_itx_create+0x3f
OpenZFS!zfs_log_write+0x32c
OpenZFS!zfs_write+0x16fb
OpenZFS!zfs_write_wrap+0xefb
OpenZFS!fs_write_impl+0x364
OpenZFS!fs_write+0x47e
OpenZFS!fsDispatcher+0x187b
OpenZFS!dispatcher+0x292
nt!IofCallDriver+0x55
+0xfffff8061125710f
+0x2
!locks for this thread:
Resource @ 0xffff8a8e139f4c40 Exclusively owned
Threads: ffff953882193080-01<*>
Resource @ 0xffff8a8e139f4ca8 Shared 1 owning threads
Threads: ffff953882193080-01<*>
KD: Scanning for held locks..
Resource @ 0xffff8a8e139f3a08 Exclusively owned
Threads: ffff95387fbc8080-01<*>
Resource @ 0xffff8a8e139f3a70 Shared 1 owning threads
Threads: ffff95387fbc8080-01<*>
KD: Scanning for held locks.
Resource @ 0xffff8a8e139f31c8 Exclusively owned
Threads: ffff9538823e1080-01<*>
Resource @ 0xffff8a8e139f3230 Shared 1 owning threads
Threads: ffff9538823e1080-01<*>
KD: Scanning for held locks.
Resource @ 0xffff8a8e139f2988 Exclusively owned
Threads: ffff9538823ce080-01<*>
Resource @ 0xffff8a8e139f29f0 Shared 1 owning threads
Threads: ffff9538823ce080-01<*>
109192 total locks, 8 locks currently held
Above comment was about the ASSERT itself, which I should not have left in there :)
Yeah, it is not doing well there. kmem seems quite wedged, which is shown in the stack. cbuf is most peculiar. More aggressive reaction to memory pressure would be nice I think.
Can I get any more info from this state? It is very close to dying; I saw the memory usage get very close to the maximum. I think it will be difficult to catch it there again without some automation. In any case, I captured a memory dump.
I let it run for a few more seconds; memory usage just keeps rapidly increasing, and rclone is completely stalled, I can't even terminate the process. I think I'll kill this now and get the kstat logs. I'll try to make a tool I can use to automatically break in if free memory gets low.
Edit: I also checked cbuf again, just more of the same.
The last kstat: output_195013.log
kstat logs taken every 10 seconds: kstat_logs.zip
Edit: Next time I need to do this I should log these to a network drive.
# grep inuse output_195013.log | sort -k2n
mem_inuse 772014080
mem_inuse 4348411904
mem_inuse 4849979392
mem_inuse 4850667520
mem_inuse 5224484864
mem_inuse 5224484864
So the biggest arenas are:
name: spl_default_arena class: vmem
mem_inuse 772014080
name: bucket_32768 class: vmem
mem_inuse 4348411904
name: bucket_heap class: vmem
mem_inuse 5224484864
name: heap class: vmem
mem_inuse 5224484864
name: kmem_metadata class: vmem
mem_inuse 363937792
name: kmem_va class: vmem
mem_inuse 4850667520
name: kmem_default class: vmem
mem_inuse 4849979392
Most of those are as expected, but bucket_32768 is weirdly larger than all the others at 4GB; the next largest is bucket_1048576 at 280MB.
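For context, allocations are routed into these bucket arenas by rounding the request up to the next power of two. A rough sketch of the routing, inferred from the bucket_4096 ... bucket_16777216 arena names in the kstat output (not the verbatim spl-vmem.c logic):

static int
bucket_index_for(size_t size)
{
	// Buckets run from 4 KiB (2^12) to 16 MiB (2^24); round the
	// request up to the smallest bucket that can hold it.
	size_t bucket = 4096;
	int idx = 0;

	while (bucket < size && bucket < 16777216) {
		bucket <<= 1;
		idx++;
	}
	return (idx);	// e.g. a 32 KiB request lands in bucket_32768
}

So a 4GB bucket_32768 would mean a very large number of ~32KiB allocations outstanding, which would be consistent with the log-write buffers in the zil_itx_create stack earlier.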
That bucket also shows up in cbuf a lot:
FFFF9538823E1080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_32768)(steps: 0) p->vs_start, end == 0, 65536
FFFF9538823E1080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_32768)(steps: 0) p->vs_start, end == 0, 32768
FFFF9538823E1080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_heap)(steps: 0) p->vs_start, end == 0, 32768
FFFF95387FBC8080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_heap)(steps: 0) p->vs_start, end == 0, 32768
FFFF9538823CE080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_32768)(steps: 0) p->vs_start, end == 0, 65536
FFFF9538823CE080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_32768)(steps: 0) p->vs_start, end == 0, 32768
FFFF9538823CE080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_heap)(steps: 0) p->vs_start, end == 0, 32768
FFFF9538823E1080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_heap)(steps: 0) p->vs_start, end == 0, 32768
FFFF95387FBC8080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_32768)(steps: 0) p->vs_start, end == 0, 65536
FFFF95387FBC8080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_32768)(steps: 0) p->vs_start, end == 0, 32768
FFFF95387FBC8080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_heap)(steps: 0) p->vs_start, end == 0, 32768
FFFF953882193080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_heap)(steps: 0) p->vs_start, end == 0, 32768
FFFF9538823E1080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_32768)(steps: 0) p->vs_start, end == 0, 65536
FFFF9538823E1080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_32768)(steps: 0) p->vs_start, end == 0, 32768
FFFF9538823E1080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_heap)(steps: 0) p->vs_start, end == 0, 32768
FFFF9538823CE080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_heap)(steps: 0) p->vs_start, end == 0, 32768
FFFF953882193080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_32768)(steps: 0) p->vs_start, end == 0, 65536
FFFF953882193080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_32768)(steps: 0) p->vs_start, end == 0, 32768
FFFF953882193080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_heap)(steps: 0) p->vs_start, end == 0, 32768
FFFF9538823E1080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_heap)(steps: 0) p->vs_start, end == 0, 32768
FFFF9538823CE080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_32768)(steps: 0) p->vs_start, end == 0, 65536
FFFF9538823CE080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_32768)(steps: 0) p->vs_start, end == 0, 32768
FFFF9538823CE080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_heap)(steps: 0) p->vs_start, end == 0, 32768
FFFF95387FBC8080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_heap)(steps: 0) p->vs_start, end == 0, 32768
FFFF95387FBC8080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_32768)(steps: 0) p->vs_start, end == 0, 65536
FFFF95387FBC8080: SPL: vmem_freelist_insert_sort_by_time: at marker (bucket_32768)(steps: 0) p->vs_start, end == 0, 32768
What I don't understand is that memory usage is normally very stable: the VM has 16GB of memory, and there is normally around 4GB available during the test. But at some point the memory usage rises above that, and by then I already know it will crash.
It just happened again, this time very fast. The horizontal axis is a minute in total, so after staying stable for 54 minutes, usage suddenly rose smoothly by 4 gigabytes in under half a minute, and then it crashed.
I tried with the recent change that removes the ASSERT; now it just sits there for a while, seemingly unable to recover, and then the debugger breaks like this:
Without a debugger attached, the following bugcheck would have occurred.
eb 0000000000011C94 0000000000000000 1 c0000054
Without a debugger attached, the following bugcheck would have occurred.
eb 0000000000011C94 0000000000000000 1 c0000054
Without a debugger attached, the following bugcheck would have occurred.
eb 0000000000011C94 0000000000000000 1 c0000054
Without a debugger attached, the following bugcheck would have occurred.
eb 0000000000011C94 0000000000000000 1 c0000054
Without a debugger attached, the following bugcheck would have occurred.
eb 0000000000011C94 0000000000000000 1 c0000054
Without a debugger attached, the following bugcheck would have occurred.
eb 0000000000011C94 0000000000000000 1 c0000054
Without a debugger attached, the following bugcheck would have occurred.
eb 0000000000011C94 0000000000000000 1 c0000054
Without a debugger attached, the following bugcheck would have occurred.
eb 0000000000011C94 0000000000000000 1 c0000054
Without a debugger attached, the following bugcheck would have occurred.
eb 0000000000011C94 0000000000000000 1 c0000054
Without a debugger attached, the following bugcheck would have occurred.
eb 0000000000011C94 0000000000000000 1 c0000054
Without a debugger attached, the following bugcheck would have occurred.
eb 0000000000011C94 0000000000000000 1 c0000054
Break instruction exception - code 80000003 (first chance)
nt!MiNoPagesLastChance+0x1b8:
fffff807`76350b70 cc int 3
4: kd> k
# Child-SP RetAddr Call Site
00 ffff9303`e82756c0 fffff807`7635c575 nt!MiNoPagesLastChance+0x1b8
01 ffff9303`e8275790 fffff807`7622d213 nt!MiWaitForFreePage+0x189
02 ffff9303`e8275890 fffff807`7620d1d8 nt!MmAccessFault+0x1fcee3
03 ffff9303`e8275a30 fffff807`76133590 nt!KiPageFault+0x358
04 ffff9303`e8275bc8 fffff807`761128f0 nt!RtlDecompressBufferXpressLz+0x50
05 ffff9303`e8275be0 fffff807`760e55c4 nt!RtlDecompressBufferEx+0x60
06 ffff9303`e8275c30 fffff807`760e5451 nt!ST_STORE<SM_TRAITS>::StDmSinglePageCopy+0x150
07 ffff9303`e8275cf0 fffff807`760e5d7c nt!ST_STORE<SM_TRAITS>::StDmSinglePageTransfer+0xa5
08 ffff9303`e8275d40 fffff807`7611412c nt!ST_STORE<SM_TRAITS>::StDmpSinglePageRetrieve+0x180
09 ffff9303`e8275de0 fffff807`76113f79 nt!ST_STORE<SM_TRAITS>::StDmPageRetrieve+0xc8
0a ffff9303`e8275e90 fffff807`76113e31 nt!SMKM_STORE<SM_TRAITS>::SmStDirectReadIssue+0x85
0b ffff9303`e8275f10 fffff807`76072228 nt!SMKM_STORE<SM_TRAITS>::SmStDirectReadCallout+0x21
0c ffff9303`e8275f40 fffff807`76115307 nt!KeExpandKernelStackAndCalloutInternal+0x78
0d ffff9303`e8275fb0 fffff807`760e0c7c nt!SMKM_STORE<SM_TRAITS>::SmStDirectRead+0xc7
0e ffff9303`e8276080 fffff807`760e06b0 nt!SMKM_STORE<SM_TRAITS>::SmStWorkItemQueue+0x1ac
0f ffff9303`e82760d0 fffff807`76114567 nt!SMKM_STORE_MGR<SM_TRAITS>::SmIoCtxQueueWork+0xc0
10 ffff9303`e8276160 fffff807`76157d9f nt!SMKM_STORE_MGR<SM_TRAITS>::SmPageRead+0x167
11 ffff9303`e82761d0 fffff807`76095900 nt!SmPageRead+0x33
12 ffff9303`e8276220 fffff807`76094b1d nt!MiIssueHardFaultIo+0x10c
13 ffff9303`e8276270 fffff807`76030798 nt!MiIssueHardFault+0x29d
14 ffff9303`e8276330 fffff807`7620d1d8 nt!MmAccessFault+0x468
15 ffff9303`e82764d0 fffff807`73e84b8f nt!KiPageFault+0x358
16 ffff9303`e8276660 fffff807`76a50d40 0xfffff807`73e84b8f
17 ffff9303`e8276668 00007ffb`c93e89f0 nt!MiSystemPartition
18 ffff9303`e8276670 00000000`00000000 0x00007ffb`c93e89f0
FFFFCD814BA6D080: osif_malloc:134: ExAllocatePoolWithTag failed (memusage: 12818907136)
FFFFCD814BA6D080: SPL: vmem_xalloc: vmem waiting for 32768 sized alloc for spl_default_arena, waiting threads 1, total threads waiting = 1
FFFFCD8106295040: dprintf: zfs_vnops_windows.c:7961:fastio_acquire_for_mod_write(): fastio_acquire_for_mod_write:
FFFFCD8106295040: dprintf: zfs_vnops_windows.c:7992:fastio_acquire_for_mod_write(): fastio_acquire_for_mod_write: returning STATUS_CANT_WAIT
FFFFCD8106295040: dprintf: zfs_vnops_windows.c:8006:fastio_acquire_for_mod_write(): fastio_acquire_for_mod_write: returning STATUS_SUCCESS
FFFFCD8106295040: dprintf: zfs_vnops_windows.c:7961:fastio_acquire_for_mod_write(): fastio_acquire_for_mod_write:
FFFFCD8106295040: dprintf: zfs_vnops_windows.c:7992:fastio_acquire_for_mod_write(): fastio_acquire_for_mod_write: returning STATUS_CANT_WAIT
FFFFCD8106295040: dprintf: zfs_vnops_windows.c:8006:fastio_acquire_for_mod_write(): fastio_acquire_for_mod_write: returning STATUS_SUCCESS
FFFFCD8106295040: dprintf: zfs_vnops_windows.c:7961:fastio_acquire_for_mod_write(): fastio_acquire_for_mod_write:
FFFFCD8106295040: dprintf: zfs_vnops_windows.c:7992:fastio_acquire_for_mod_write(): fastio_acquire_for_mod_write: returning STATUS_CANT_WAIT
FFFFCD8106295040: dprintf: zfs_vnops_windows.c:8006:fastio_acquire_for_mod_write(): fastio_acquire_for_mod_write: returning STATUS_SUCCESS
FFFFCD8106295040: dprintf: zfs_vnops_windows.c:7961:fastio_acquire_for_mod_write(): fastio_acquire_for_mod_write:
FFFFCD8106295040: dprintf: zfs_vnops_windows.c:7992:fastio_acquire_for_mod_write(): fastio_acquire_for_mod_write: returning STATUS_CANT_WAIT
FFFFCD8106295040: dprintf: zfs_vnops_windows.c:8006:fastio_acquire_for_mod_write(): fastio_acquire_for_mod_write: returning STATUS_SUCCESS
FFFFCD8106295040: dprintf: zfs_vnops_windows.c:7961:fastio_acquire_for_mod_write(): fastio_acquire_for_mod_write:
FFFFCD8106295040: dprintf: zfs_vnops_windows.c:7992:fastio_acquire_for_mod_write(): fastio_acquire_for_mod_write: returning STATUS_CANT_WAIT
FFFFCD8106295040: dprintf: zfs_vnops_windows.c:8006:fastio_acquire_for_mod_write(): fastio_acquire_for_mod_write: returning STATUS_SUCCESS
FFFFCD8106295040: dprintf: zfs_vnops_windows.c:7961:fastio_acquire_for_mod_write(): fastio_acquire_for_mod_write:
FFFFCD8106295040: osif_malloc:134: ExAllocatePoolWithTag failed (memusage: 12818907136)
FFFFCD8106295040: SPL: vmem_xalloc: vmem waiting for 32768 sized alloc for spl_default_arena, waiting threads 2, total threads waiting = 2
FFFFCD8106295040: SPL: vmem_xalloc: pressure 65536 targeted, 65536 delivered
FFFFCD814BA6D080: SPL: vmem_xalloc: pressure 32768 targeted, 65536 delivered
FFFFCD81066E3040: dprintf: arc_os.c:472:arc_reclaim_thread(): ZFS: arc growtime expired
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFCD81066E3040: dprintf: arc_os.c:472:arc_reclaim_thread(): ZFS: arc growtime expired
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
FFFFD833E0978040: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 12818907136)
OK the low memory events are firing, so that is something. We could trigger another when alloc gets NULL.
Potentially, though, it does seem like a leak, since we can't reap memory that has leaked.
We actually dump leaked memory on module unload. That could be worth checking as well.
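For context, the LOWMEMORY EVENT lines in the log come from a watcher thread parked on the Windows low-memory notification event. A minimal sketch of that pattern, assuming the standard \KernelObjects\LowMemoryCondition kernel event; spl_signal_pressure() is a hypothetical stand-in for whatever kicks the reaper:

#include <ntddk.h>

extern void spl_signal_pressure(void);	// hypothetical reaper hook

static VOID
low_memory_watcher(PVOID context)
{
	UNICODE_STRING name = RTL_CONSTANT_STRING(
	    L"\\KernelObjects\\LowMemoryCondition");
	OBJECT_ATTRIBUTES oa;
	HANDLE h;

	InitializeObjectAttributes(&oa, &name, OBJ_KERNEL_HANDLE, NULL, NULL);
	if (!NT_SUCCESS(ZwOpenEvent(&h, SYNCHRONIZE, &oa)))
		return;

	for (;;) {
		// Blocks until the kernel signals that memory is low.
		ZwWaitForSingleObject(h, FALSE, NULL);
		spl_signal_pressure();
	}
}

Triggering the same hook when an allocation returns NULL would cover the case where the event fires too late.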
I think I should test with the last releases, and when I have a good commit I'll do a bisect to find where this started occurring. How can I get the module to unload?
I believe this has some information about how to unload the driver: https://github.com/openzfsonwindows/openzfs/discussions/287
zpool export -a
zfsinstaller.exe uninstall path/to/OpenZFS.inf
If all goes well:
ahh hmm ok that is amusing
$ out/build/x64-Debug/cmd/zpool/zpool.exe export -a
zunmount(BOOM,E:\ ) running
zunmount(BOOM,E:\ ) returns 0
$ out/build/x64-Debug/cmd/os/windows/zfsinstaller/zfsinstaller.exe uninstall /c/DriverTest/Drivers/OpenZFS.inf
DefaultUninstall 128 C:/DriverTest/Drivers/OpenZFS.inf
SPL: Released 17 slabs
Break instruction exception - code 80000003 (first chance)
> dt cbuf
OpenZFS!cbuf
0xffff800e`1fc10000 "--- memory read error at address 0xffff800e`1fc10000 ---"
Ooh yeah, I unloaded it...
OK, so I need to put a break in before cbuf is torn down:
diff --git a/module/os/windows/driver.c b/module/os/windows/driver.c
index 930049e1a2..af905e8e2d 100644
--- a/module/os/windows/driver.c
+++ b/module/os/windows/driver.c
@@ -91,6 +91,7 @@ OpenZFS_Fini(PDRIVER_OBJECT DriverObject)
sysctl_os_fini();
spl_stop();
+ DbgBreakPoint();
finiDbgCircularBuffer();
if (STOR_wzvolDriverInfo.zvContextArray) {
The cbuf has:
FFFF800E215F52C0: taskq_delay_dispatcher_thread: exit
FFFF800E1DA8E040: SPL: stopping spl_event_thread
FFFF800E20054080: spl_event_thread: LOWMEMORY EVENT *** 0x0 (memusage: 536866816)
FFFF800E20054080: SPL: spl_event_thread thread_exit
FFFF800E1DA8E040: SPL: stopped spl_event_thread
FFFF800E20813080: SPL: spl_free_thread_exit set to FALSE and exiting: cv_broadcasting
FFFF800E20813080: SPL: spl_free_thread thread_exit
FFFF800E1DA8E040: SPL: tsd unloading 0
FFFF800E1DA8E040: SPL: vmem_fini: stopped vmem_update. Creating list and walking arenas.
FFFF800E1DA8E040: SPL: vmem_fini destroying heap
FFFF800E1DA8E040: SPL: vmem_fini: walking spl_heap_arena, aka bucket_heap (pass 1)
FFFF800E1DA8E040: SPL: vmem_fini: calling vmem_xfree(spl_default_arena, ptr, 268435456);
FFFF800E1DA8E040: SPL: vmem_fini: walking spl_heap_arena, aka bucket_heap (pass 2)
FFFF800E1DA8E040: SPL: vmem_fini: walking bucket arenas...
FFFF800E1DA8E040: SPL: vmem_fini destroying spl_bucket_arenas..
FFFF800E1DA8E040: 4096
FFFF800E1DA8E040: 8192
FFFF800E1DA8E040: 16384
FFFF800E1DA8E040: 32768
FFFF800E1DA8E040: 65536
FFFF800E1DA8E040: 131072
FFFF800E1DA8E040: 262144
FFFF800E1DA8E040: 524288
FFFF800E1DA8E040: 1048576
FFFF800E1DA8E040: 2097152
FFFF800E1DA8E040: 4194304
FFFF800E1DA8E040: 8388608
FFFF800E1DA8E040: 16777216
FFFF800E1DA8E040:
FFFF800E1DA8E040: SPL: vmem_fini: walking vmem metadata-related arenas...
FFFF800E1DA8E040: SPL: vmem_fini walking the root arena (spl_default_arena)...
FFFF800E1DA8E040: SPL: vmem_fini destroying bucket heap
FFFF800E1DA8E040: SPL: vmem_fini destroying vmem_seg_arena
FFFF800E1DA8E040: SPL: vmem_destroy('vmem_seg'): leaked 376832 bytes
FFFF800E1DA8E040: SPL: vmem_fini destroying vmem_hash_arena
FFFF800E1DA8E040: SPL: vmem_destroy('vmem_hash'): leaked 2048 bytes
FFFF800E1DA8E040: SPL: vmem_fini destroying vmem_metadata_arena
FFFF800E1DA8E040: SPL: vmem_destroy('vmem_metadata'): leaked 389120 bytes
FFFF800E1DA8E040: SPL: vmem_fini destroying spl_default_arena
FFFF800E1DA8E040: SPL: vmem_destroy('spl_default_arena'): leaked 425984 bytes
FFFF800E1DA8E040: SPL: vmem_fini destroying spl_default_arena_parant
FFFF800E1DA8E040: SPL: vmem_fini destroying vmem_vmem_arena
FFFF800E1DA8E040: SPL: vmem_destroy('vmem_vmem'): leaked 15408 bytes
FFFF800E1DA8E040: SPL: vmem_destroy('vmem_vmem'): STILL 15408 bytes at kstat_delete() time
FFFF800E1DA8E040: SPL: arenas removed, now try destroying mutexes...
FFFF800E1DA8E040: vmem_xnu_alloc_lock
FFFF800E1DA8E040: vmem_panic_lock
FFFF800E1DA8E040: vmem_pushpage_lock
FFFF800E1DA8E040: vmem_nosleep_lock
FFFF800E1DA8E040: vmem_sleep_lock
FFFF800E1DA8E040: vmem_segfree_lock
FFFF800E1DA8E040: vmem_list_lock
FFFF800E1DA8E040: SPL: vmem_fini: walking list of live slabs at time of call to vmem_fini
FFFF800E1DA8E040: SPL: WOULD HAVE released 0 bytes (0 spans) from arenas
FFFF800E1DA8E040: SPL: vmem_fini: Brief delay for readability...
FFFF800E1DA8E040: SPL: vmem_fini: done!
FFFF800E1DA8E040: SPL: Unloaded module v0.2.3-7-gc1b4a00 (os_mem_alloc: 0)
FFFF800E1DA8E040: timer object is not loaded.
FFFF800E1DA8E040: timer object is not loaded.
ok yeah, there are some things that need addressing even in my create->export example.
Whilst cleaning up the filename issue, I noticed:
int
zfs_build_path(znode_t *start_zp, znode_t *start_parent, char **fullpath,
    uint32_t *returnsize, uint32_t *start_zp_offset)
{
...
+	// Free existing, if any
+	if (*fullpath != NULL)
+		kmem_free(*fullpath, *returnsize);
+
+	// Allocate new
	*returnsize = size;
	ASSERT(size != 0);
So that is one leak: multiple calls to zfs_build_path() failed to free the old name. Think renames...
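To illustrate the pattern (hypothetical caller; zp and dzp are placeholder znodes, and the real trigger is the rename path): without the free above, every call that passes in an already-populated *fullpath drops the previous buffer.

char *fullpath = NULL;
uint32_t size = 0, offset = 0;

zfs_build_path(zp, dzp, &fullpath, &size, &offset);	// allocates a buffer
// ... the file is renamed, so the cached path must be rebuilt ...
zfs_build_path(zp, dzp, &fullpath, &size, &offset);	// pre-fix: allocated a
							// new buffer without
							// freeing the old one,
							// so one leak per rename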
When checking for the issue in #298, the system ran out of memory again, this time with the zfs_build_path change applied.
memory.txt
rclone ground to a halt; the console cursor is still blinking, but otherwise the system is unresponsive. A lot of threads show a call stack like this:
When you've already hit the limit, there isn't much you can do, so it would be more interesting to see you do "half" of what you usually do before it hangs, then do the unload, dump cbuf, and see if it bitches about particular leaks.
Can't unload if you can't export :\ Lots of this in cbuf:
FFFFC4B54FB25580: Dropping 0 references 2
FFFFC4B54FB25580: vnode_drain_delayclose: freeing DEAD vp FFFFBA02267DB2F8
Also the non-paged pool usage seems to not decrease.
I let it continue running for a few seconds and now cbuf is full of zfs_AcquireForLazyWrite: fo FFFFBF60D414F3B0 already freed zfsvfs
Can't see any reads/writes in iostat.
Yeah, export then unload was the idea. If those two can't be done, we certainly have an issue there too.
Two stacks are in zfs_AcquireForReadAhead -> vfs_busy for WPsettings.dat.
And zfs_vfs_unmount is waiting for zfsvfs_teardown.
The other zfs_vfs_unmount is waiting for vfs_busy.
Note that I have 2 unmounts hanging at this point, one I started later.
I'm starting to dislike WPsettings.dat.
I recall there is some way to disable it; it would be interesting to check if my problems go away without it. But I know it can also happen with files other than WPsettings.dat.
4.001ccc ffffc4b54fa8d040 0004d63 Blocked nt!KiSwapContext+0x76
nt!KiSwapThread+0x500
nt!KiCommitThreadWait+0x14f
nt!KeWaitForSingleObject+0x233
OpenZFS!spl_mutex_enter+0x11f
OpenZFS!vfs_busy+0x19
OpenZFS!zfs_AcquireForReadAhead+0x121
nt!CcPerformReadAhead+0x124
nt!CcWorkerThread+0x2cb
nt!ExpWorkerThread+0x105
nt!PspSystemThreadStartup+0x55
nt!KiStartSystemThread+0x28
# Child-SP RetAddr Call Site
00 ffff8202`2efbf3c0 fffff803`4a41bca0 nt!KiSwapContext+0x76
01 ffff8202`2efbf500 fffff803`4a41b1cf nt!KiSwapThread+0x500
02 ffff8202`2efbf5b0 fffff803`4a41aa73 nt!KiCommitThreadWait+0x14f
03 ffff8202`2efbf650 fffff803`540680cf nt!KeWaitForSingleObject+0x233
04 ffff8202`2efbf740 fffff803`5407b6e9 OpenZFS!spl_mutex_enter+0x11f [H:\dev\openzfs\module\os\windows\spl\spl-mutex.c @ 132]
05 ffff8202`2efbf7b0 fffff803`5434d641 OpenZFS!vfs_busy+0x19 [H:\dev\openzfs\module\os\windows\spl\spl-mount.c @ 59]
06 ffff8202`2efbf7f0 fffff803`4a4e7904 OpenZFS!zfs_AcquireForReadAhead+0x121 [H:\dev\openzfs\module\os\windows\zfs\zfs_vnops_windows.c @ 223]
07 ffff8202`2efbf870 fffff803`4a502b0b nt!CcPerformReadAhead+0x124
08 ffff8202`2efbfa40 fffff803`4a450545 nt!CcWorkerThread+0x2cb
09 ffff8202`2efbfb70 fffff803`4a50e6f5 nt!ExpWorkerThread+0x105
0a ffff8202`2efbfc10 fffff803`4a606278 nt!PspSystemThreadStartup+0x55
0b ffff8202`2efbfc60 00000000`00000000 nt!KiStartSystemThread+0x28
4.001cec ffffc4b54fb18040 0004d63 Blocked nt!KiSwapContext+0x76
nt!KiSwapThread+0x500
nt!KiCommitThreadWait+0x14f
nt!KeWaitForSingleObject+0x233
OpenZFS!spl_mutex_enter+0x11f
OpenZFS!vfs_busy+0x19
OpenZFS!zfs_AcquireForReadAhead+0x121
nt!CcPerformReadAhead+0x124
nt!CcWorkerThread+0x2cb
nt!ExpWorkerThread+0x105
nt!PspSystemThreadStartup+0x55
nt!KiStartSystemThread+0x28
# Child-SP RetAddr Call Site
00 ffff8202`2eff73c0 fffff803`4a41bca0 nt!KiSwapContext+0x76
01 ffff8202`2eff7500 fffff803`4a41b1cf nt!KiSwapThread+0x500
02 ffff8202`2eff75b0 fffff803`4a41aa73 nt!KiCommitThreadWait+0x14f
03 ffff8202`2eff7650 fffff803`540680cf nt!KeWaitForSingleObject+0x233
04 ffff8202`2eff7740 fffff803`5407b6e9 OpenZFS!spl_mutex_enter+0x11f [H:\dev\openzfs\module\os\windows\spl\spl-mutex.c @ 132]
05 ffff8202`2eff77b0 fffff803`5434d641 OpenZFS!vfs_busy+0x19 [H:\dev\openzfs\module\os\windows\spl\spl-mount.c @ 59]
06 ffff8202`2eff77f0 fffff803`4a4e7904 OpenZFS!zfs_AcquireForReadAhead+0x121 [H:\dev\openzfs\module\os\windows\zfs\zfs_vnops_windows.c @ 223]
07 ffff8202`2eff7870 fffff803`4a502b0b nt!CcPerformReadAhead+0x124
08 ffff8202`2eff7a40 fffff803`4a450545 nt!CcWorkerThread+0x2cb
09 ffff8202`2eff7b70 fffff803`4a50e6f5 nt!ExpWorkerThread+0x105
0a ffff8202`2eff7c10 fffff803`4a606278 nt!PspSystemThreadStartup+0x55
0b ffff8202`2eff7c60 00000000`00000000 nt!KiStartSystemThread+0x28
18bc.0021ac ffffc4b54fb25580 000446e Blocked nt!KiSwapContext+0x76
nt!KiSwapThread+0x500
nt!KiCommitThreadWait+0x14f
nt!KeWaitForMultipleObjects+0x2be
OpenZFS!spl_cv_wait+0xea
OpenZFS!rrw_enter_write+0xed
OpenZFS!rrm_enter_write+0x35
OpenZFS!rrm_enter+0x3b
OpenZFS!zfsvfs_teardown+0xcd
OpenZFS!zfs_vfs_unmount+0x209
OpenZFS!zfs_windows_unmount+0x41f
OpenZFS!zfs_ioc_unmount+0x55
OpenZFS!zfsdev_ioctl_common+0x816
OpenZFS!zfsdev_ioctl+0x2c5
OpenZFS!ioctlDispatcher+0x32d
OpenZFS!dispatcher+0x1e6
nt!IofCallDriver+0x55
nt!IopSynchronousServiceTail+0x34c
nt!IopXxxControlFile+0xc71
nt!NtDeviceIoControlFile+0x56
nt!KiSystemServiceCopyEnd+0x28
+0x7ffaf122d0c4
# Child-SP RetAddr Call Site
00 ffff8202`30ba9460 fffff803`4a41bca0 nt!KiSwapContext+0x76
01 ffff8202`30ba95a0 fffff803`4a41b1cf nt!KiSwapThread+0x500
02 ffff8202`30ba9650 fffff803`4a4f7eee nt!KiCommitThreadWait+0x14f
03 ffff8202`30ba96f0 fffff803`54068eba nt!KeWaitForMultipleObjects+0x2be
04 ffff8202`30ba9800 fffff803`5410504d OpenZFS!spl_cv_wait+0xea [H:\dev\openzfs\module\os\windows\spl\spl-condvar.c @ 120]
05 ffff8202`30ba9890 fffff803`54105745 OpenZFS!rrw_enter_write+0xed [H:\dev\openzfs\module\zfs\rrwlock.c @ 219]
06 ffff8202`30ba98d0 fffff803`541056ab OpenZFS!rrm_enter_write+0x35 [H:\dev\openzfs\module\zfs\rrwlock.c @ 371]
07 ffff8202`30ba9910 fffff803`54380cfd OpenZFS!rrm_enter+0x3b [H:\dev\openzfs\module\zfs\rrwlock.c @ 348]
08 ffff8202`30ba9950 fffff803`54380b19 OpenZFS!zfsvfs_teardown+0xcd [H:\dev\openzfs\module\os\windows\zfs\zfs_vfsops.c @ 1458]
09 ffff8202`30ba99b0 fffff803`543c868f OpenZFS!zfs_vfs_unmount+0x209 [H:\dev\openzfs\module\os\windows\zfs\zfs_vfsops.c @ 1653]
0a ffff8202`30ba9b40 fffff803`5437c575 OpenZFS!zfs_windows_unmount+0x41f [H:\dev\openzfs\module\os\windows\zfs\zfs_vnops_windows_mount.c @ 1581]
0b ffff8202`30baa430 fffff803`540838d6 OpenZFS!zfs_ioc_unmount+0x55 [H:\dev\openzfs\module\os\windows\zfs\zfs_ioctl_os.c @ 916]
0c ffff8202`30baa470 fffff803`5437c3a5 OpenZFS!zfsdev_ioctl_common+0x816 [H:\dev\openzfs\module\zfs\zfs_ioctl.c @ 7866]
0d ffff8202`30baa550 fffff803`5435f06d OpenZFS!zfsdev_ioctl+0x2c5 [H:\dev\openzfs\module\os\windows\zfs\zfs_ioctl_os.c @ 866]
0e ffff8202`30baa640 fffff803`5435e976 OpenZFS!ioctlDispatcher+0x32d [H:\dev\openzfs\module\os\windows\zfs\zfs_vnops_windows.c @ 6409]
0f ffff8202`30baa710 fffff803`4a410665 OpenZFS!dispatcher+0x1e6 [H:\dev\openzfs\module\os\windows\zfs\zfs_vnops_windows.c @ 7321]
10 ffff8202`30baa800 fffff803`4a80142c nt!IofCallDriver+0x55
11 ffff8202`30baa840 fffff803`4a801081 nt!IopSynchronousServiceTail+0x34c
12 ffff8202`30baa8e0 fffff803`4a8003f6 nt!IopXxxControlFile+0xc71
13 ffff8202`30baaa20 fffff803`4a610ef8 nt!NtDeviceIoControlFile+0x56
14 ffff8202`30baaa90 00007ffa`f122d0c4 nt!KiSystemServiceCopyEnd+0x28
15 000000fb`a09fcee8 00000000`00000000 0x00007ffa`f122d0c4
2d60.001c40 ffffba0257906080 00017cc Blocked nt!KiSwapContext+0x76
nt!KiSwapThread+0x500
nt!KiCommitThreadWait+0x14f
nt!KeWaitForSingleObject+0x233
OpenZFS!spl_mutex_enter+0x11f
OpenZFS!vfs_busy+0x19
OpenZFS!zfs_vfs_ref+0x73
OpenZFS!getzfsvfs_impl+0x85
OpenZFS!getzfsvfs+0x5e
OpenZFS!zfs_windows_unmount+0x41
OpenZFS!zfs_ioc_unmount+0x55
OpenZFS!zfsdev_ioctl_common+0x816
OpenZFS!zfsdev_ioctl+0x2c5
OpenZFS!ioctlDispatcher+0x32d
OpenZFS!dispatcher+0x1e6
nt!IofCallDriver+0x55
nt!IopSynchronousServiceTail+0x34c
nt!IopXxxControlFile+0xc71
nt!NtDeviceIoControlFile+0x56
nt!KiSystemServiceCopyEnd+0x28
+0x7ffaf122d0c4
# Child-SP RetAddr Call Site
00 ffff8202`2e7cd620 fffff803`4a41bca0 nt!KiSwapContext+0x76
01 ffff8202`2e7cd760 fffff803`4a41b1cf nt!KiSwapThread+0x500
02 ffff8202`2e7cd810 fffff803`4a41aa73 nt!KiCommitThreadWait+0x14f
03 ffff8202`2e7cd8b0 fffff803`540680cf nt!KeWaitForSingleObject+0x233
04 ffff8202`2e7cd9a0 fffff803`5407b6e9 OpenZFS!spl_mutex_enter+0x11f [H:\dev\openzfs\module\os\windows\spl\spl-mutex.c @ 132]
05 ffff8202`2e7cda10 fffff803`5437a633 OpenZFS!vfs_busy+0x19 [H:\dev\openzfs\module\os\windows\spl\spl-mount.c @ 59]
06 ffff8202`2e7cda50 fffff803`54080735 OpenZFS!zfs_vfs_ref+0x73 [H:\dev\openzfs\module\os\windows\zfs\zfs_ioctl_os.c @ 146]
07 ffff8202`2e7cda90 fffff803`540807be OpenZFS!getzfsvfs_impl+0x85 [H:\dev\openzfs\module\zfs\zfs_ioctl.c @ 1376]
08 ffff8202`2e7cdae0 fffff803`543c82b1 OpenZFS!getzfsvfs+0x5e [H:\dev\openzfs\module\zfs\zfs_ioctl.c @ 1391]
09 ffff8202`2e7cdb40 fffff803`5437c575 OpenZFS!zfs_windows_unmount+0x41 [H:\dev\openzfs\module\os\windows\zfs\zfs_vnops_windows_mount.c @ 1492]
0a ffff8202`2e7ce430 fffff803`540838d6 OpenZFS!zfs_ioc_unmount+0x55 [H:\dev\openzfs\module\os\windows\zfs\zfs_ioctl_os.c @ 916]
0b ffff8202`2e7ce470 fffff803`5437c3a5 OpenZFS!zfsdev_ioctl_common+0x816 [H:\dev\openzfs\module\zfs\zfs_ioctl.c @ 7866]
0c ffff8202`2e7ce550 fffff803`5435f06d OpenZFS!zfsdev_ioctl+0x2c5 [H:\dev\openzfs\module\os\windows\zfs\zfs_ioctl_os.c @ 866]
0d ffff8202`2e7ce640 fffff803`5435e976 OpenZFS!ioctlDispatcher+0x32d [H:\dev\openzfs\module\os\windows\zfs\zfs_vnops_windows.c @ 6409]
0e ffff8202`2e7ce710 fffff803`4a410665 OpenZFS!dispatcher+0x1e6 [H:\dev\openzfs\module\os\windows\zfs\zfs_vnops_windows.c @ 7321]
0f ffff8202`2e7ce800 fffff803`4a80142c nt!IofCallDriver+0x55
10 ffff8202`2e7ce840 fffff803`4a801081 nt!IopSynchronousServiceTail+0x34c
11 ffff8202`2e7ce8e0 fffff803`4a8003f6 nt!IopXxxControlFile+0xc71
12 ffff8202`2e7cea20 fffff803`4a610ef8 nt!NtDeviceIoControlFile+0x56
13 ffff8202`2e7cea90 00007ffa`f122d0c4 nt!KiSystemServiceCopyEnd+0x28
14 00000093`0933cea8 00000000`00000000 0x00007ffa`f122d0c4
I think these are the involved threads. I'll get !locks output for them too.
Yeah, the file itself doesn't do anything magical to break things, so there is a bug; it just shows up a lot.
OK, so everything is waiting on vfs_busy(), which means someone is holding it and then waiting somewhere else. Just dump vfs_main_lock, in particular the owner.
How can I do that? Like !locks OpenZFS!vfs_main_lock or something?
dt kmutex_t OpenZFS!vfs_main_lock might work.
23: kd> dt kmutex_t OpenZFS!vfs_main_lock
OpenZFS!kmutex_t
Cannot find specified field members.
23: kd> dt OpenZFS!vfs_main_lock
+0x000 m_lock : mutex_t
+0x018 m_owner : 0xffffc4b5`4fb25580 Void
+0x020 m_destroy_lock : 0
+0x028 m_initialised : 0x23456789
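For reference, the layout that output implies, with field types guessed to match the offsets (the interesting field is m_owner, the owning thread):

typedef struct kmutex {
	mutex_t		m_lock;		// +0x000: underlying lock object
	void		*m_owner;	// +0x018: owning thread, or NULL
	uint64_t	m_destroy_lock;	// +0x020
	uint32_t	m_initialised;	// +0x028: 0x23456789 when valid
} kmutex_t;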
Is the owner a thread?
Yeah, .thread 0xffffc4b54fb25580 to swap over to it (or look for it in the dump).
It is the thread in zfs_vfs_unmount.
When testing #281 I noticed that when copying a 5TB dataset using rclone it always ends in an allocation failure:
so ExAllocatePoolWithTag failed.
memory.txt
This seems like a new issue because I was still able to copy the full dataset not that long ago.
I'll try to get some kstat information. Is it possible to get kstat info from the debugger when the failure has already happened? I could also try logging it periodically to a file.