virtio-win / kvm-guest-drivers-windows

Windows paravirtualized drivers for QEMU\KVM
https://www.linux-kvm.org/page/WindowsGuestDrivers
BSD 3-Clause "New" or "Revised" License

[virtio-fs] Suspected memory leak #1004

Closed SimonFair closed 8 months ago

SimonFair commented 11 months ago

Describe the bug: This is a user report received on the Unraid Forums.

To Reproduce: Running a VM with virtiofs mappings.

Expected behavior: No memory leakage.

Screenshots: [two screenshots attached]

Host:

VM:

Additional context: Mmdi is found in pooltag.txt (it is a Windows memory-manager tag, so it does not point at a specific third-party driver), which means you actually have to use xperf and WPA for further debugging. Following that method, I captured a snapshot of the memory growth, opened it in WPA, loaded symbols, and expanded the Mmdi pool to find stack references to winfsp-x64.dll and virtiofs.exe. So there's the smoking gun: one of these drivers is the culprit.
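
For anyone who wants to repeat the capture, the commands look roughly like this; the exact flags and buffer sizes are illustrative, not necessarily the exact invocation used for this report:

rem Capture pool allocations with call stacks (illustrative flags/sizes):
xperf -on PROC_THREAD+LOADER+POOL -stackwalk PoolAlloc+PoolFree -BufferSize 1024 -MaxBuffers 1024
rem ...reproduce the virtiofs I/O that makes Mmdi grow, then stop and merge the trace:
xperf -d mmdi-pool.etl
rem Open mmdi-pool.etl in WPA, load symbols, and filter the pool graph to the Mmdi tag.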

I upgraded to the latest versions of WinFSP (2.0) and Virtio-win guest tools (0.1.240) and the leak is still active.

mackid1993 commented 11 months ago

I've experienced this exact non-paged pool leak under Windows 11 using the latest released Virtio and WinFSP drivers. I've also tested with the latest released rust virtiofsd. When transferring files, the non-paged pool grows and the memory is never released. Looking with Poolmon, it's always the Mmdi tag ballooning in memory usage.

christophocles commented 11 months ago

I have been running a Windows 10 guest under KVM for about a year now, and I have experienced this memory leak the entire time. I am passing through a GPU and a PCIe USB card, and I also mount a few host folders on the guest using VirtioFS. When there is heavy disk I/O on these VioFS folders, the RAM usage of the guest starts increasing rapidly until it eventually reaches 100%, runs out of swap, and then crashes. The rate at which the RAM depletes varies with the amount of disk I/O on the VioFS folders. In the worst case (when the backup program is running and scanning all files), the RAM usage increases by about 1 MB per second and the crash occurs in about 4 hours (16 GB of RAM allocated to the guest). In order for the backup to complete, I have to reboot the guest multiple times to avoid the system crashing.

I found multiple sources mentioning that KVM memory ballooning causes memory leaks when used in combination with GPU passthrough. I disabled ballooning in the XML config and disabled blnsvr.exe on the guest. This did not help.

I also tried disabling the virtio serial device. This also did nothing.

I followed Microsoft's guide to track down kernel memory leaks using poolmon. The memory is going to a Non-paged pool with the tag "Mmdi" and the description "MDLs for physical memory allocation".
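
For reference, the relevant commands look roughly like this (the pooltag.txt path assumes a default Debugging Tools for Windows install):

rem Watch the non-paged pool, sorted by bytes (-b) and filtered to the Mmdi tag (-iMmdi):
poolmon -b -iMmdi
rem Look up the tag description in pooltag.txt:
findstr /i "Mmdi" "C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\triage\pooltag.txt"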

I provided the debug info in the screenshot above, tracing the Mmdi pool growth to either virtiofs.exe or winfsp-x64.dll. I can assist with any further debug information required.

YanVugenfirer commented 11 months ago

@SimonFair Thank you for your report. Maybe I am missing something, but where is the growth? BTW, to be a leak, the growth needs to be consistent (not jumps, which might be related to temporary allocations).

Can you show "before" and "after" allocation counts? Also what's the amount of memory actually allocated?

In any case, it's worth investigating.

mackid1993 commented 11 months ago

@YanVugenfirer The growth is consistent while a transfer is occurring. It's in the non-paged pool. It can be observed with Poolmon by looking at the Mmdi tag, as @christophocles explained. When using backup software it grows extremely quickly and will keep growing until it runs out of memory. Stopping the transfer will not free any memory. Only rebooting the VM will.

christophocles commented 11 months ago

@YanVugenfirer here's a screenshot of poolmon showing the kernel memory pool usage after a few hours. The non-paged pool tag Mmdi grows continuously, unbounded, until the system crashes. The growth accelerates when there is a lot of disk read/write activity on the virtiofs shares. The list is sorted by bytes allocated, and Mmdi is the highest with 4.2 GB.

[screenshot: poolmon showing Mmdi pool usage]

And here is poolmon immediately after rebooting the guest. Mmdi is only 4.6 MB.

[screenshot: poolmon after a fresh guest boot]

Here is another capture of the Mmdi growth using xperf and wpa. This capture is 6 minutes, with 225MB of memory allocations.

[screenshot: WPA capture of Mmdi growth over 6 minutes]

I am not sure whether this bug report should go to this project or to WinFSP. Both seem to be involved with the Mmdi allocations.

YanVugenfirer commented 11 months ago

@christophocles Thanks a lot! We will take a look and investigate.

xiagao commented 11 months ago

I tried to reproduce this issue with the latest rust virtiofsd and virtio driver (242) on a Win11 guest, but could not reproduce it.

  1. Mounted one virtiofs shared dir.
  2. Ran fio in the shared dir: "C:\Program Files (x86)\fio\fio\fio.exe" --name=stress --filename=Z:/test_file --ioengine=windowsaio --rw=write --direct=1 --size=1G --iodepth=256 --numjobs=128 --runtime=180000 --thread --bs=64k
  3. Monitored with poolmon.exe, but there was no memory leak. [screenshot: poolmon during the fio run]

@SimonFair Could you share what IO operations you were running in your environment? Thanks in advance.

YanVugenfirer commented 11 months ago

@SimonFair are you using Rust virtiofsd?

mackid1993 commented 11 months ago

I've tested this on rust virtiofsd under Unraid and had the Mmdi leak. Perhaps @christophocles has more insight. I believe he used a different distro, per our conversation on the Unraid forums, and may be able to share what occurred on that platform.

mackid1993 commented 11 months ago

@xiagao The latest drivers we were able to get were .240. How do we test with .242? Can you provide a binary for us to test with?

christophocles commented 11 months ago

@SimonFair are you using Rust virtiofsd?

@YanVugenfirer The bug report originated from my system, and others on the Unraid forums have reported the same issue. Yes, I am using Rust virtiofsd 1.7.2 which is the version currently packaged on openSUSE Tumbleweed.

@xiagao I am also using virtio-win driver version 0.240, since that is the latest binary release. I have Visual Studio and the driver toolkits installed, so my environment is set up to compile newer drivers from source if needed for testing. Tonight I will spin up a fresh Win10 VM and try to reproduce the leak again myself, with the minimum required steps. It's possible that other features of my specific system are interacting to trigger the memory leak (e.g. PCIe passthrough?). If I am able to successfully reproduce the leak on a new VM, I will post detailed steps to reproduce.

kostyanf14 commented 11 months ago

@christophocles @SimonFair

@YanVugenfirer The bug report originated from my system, and others on the Unraid forums have reported the same issue. Yes, I am using Rust virtiofsd 1.7.2 which is the version currently packaged on openSUSE Tumbleweed.

The latest Rust virtiofsd is 1.8.0. Please try it.

mackid1993 commented 11 months ago

@kostyanf14 I ran virtiofsd 1.8.0 on Unraid and ran into the same memory leak.

mackid1993 commented 11 months ago

2. "C:\Program Files (x86)\fio\fio\fio.exe" --name=stress --filename=Z:/test_file --ioengine=windowsaio --rw=write --direct=1 --size=1G --iodepth=256 --numjobs=128 --runtime=180000 --thread --bs=64k

Where is fio.exe? I only have C:\Program Files\Virtio-Win\VioFS\virtiofs.exe

xiagao commented 11 months ago
  1. "C:\Program Files (x86)\fio\fio\fio.exe" --name=stress --filename=Z:/test_file --ioengine=windowsaio --rw=write --direct=1 --size=1G --iodepth=256 --numjobs=128 --runtime=180000 --thread --bs=64k

Where is fio.exe? I only have C:\Program Files\Virtio-Win\VioFS\virtiofs.exe

Hi, you can find the fio binary at https://fio.readthedocs.io/en/latest/fio_doc.html .

xiagao commented 11 months ago

@kostyanf14 I ran virtiofsd 1.8.0 on Unraid and ran into the same memory leak.

Could you share what IO test you ran on the shared folder? I will also try some other tools, such as iozone and IOmeter.

mackid1993 commented 11 months ago

What always does it for me is a free trial of Backblaze Personal Backup and letting it back up my large media library stored on a VirtioFS mount. That will cause Mmdi to grow very quickly.

mackid1993 commented 11 months ago

I should also add that I use this batch script to mount several Unraid shares as different drive letters:

"C:\Program Files (x86)\WinFsp\bin\launchctl-x64.exe" start virtiofs viofsJ Tag1 J:
"C:\Program Files (x86)\WinFsp\bin\launchctl-x64.exe" start virtiofs viofsl Tag2 l:
"C:\Program Files (x86)\WinFsp\bin\launchctl-x64.exe" start virtiofs viofsM Tag3 m:
"C:\Program Files (x86)\WinFsp\bin\launchctl-x64.exe" start virtiofs viofsS Tag4 s:
"C:\Program Files (x86)\WinFsp\bin\launchctl-x64.exe" start virtiofs viofsT Tag5 T:

I previously ran: "C:\Program Files (x86)\WinFsp\bin\fsreg.bat" virtiofs "C:\Program Files\Virtio-Win\VioFS\virtiofs.exe" "-t %%1 -m %%2"

mackid1993 commented 11 months ago

Has anyone been able to repro this?

xiagao commented 11 months ago

I reproduced this issue with multiple source dirs mapped from the host to a Win11 guest, using IOmeter to create a lot of disk read/write activity on the virtiofs shares. Here are some screenshots showing that the non-paged pool tag Mmdi grows continuously after starting the IO test, and that the memory isn't released after stopping it.

[screenshots: poolmon showing Mmdi growth during and after the IO test]

mackid1993 commented 11 months ago

@xiagao I'm glad it's not just us! Thank you for your effort. So hopefully this can eventually be fixed!

xiagao commented 11 months ago

@xiagao I'm glad it's not just us! Thank you for your effort. So hopefully this can eventually be fixed!

No problem. Thanks for reporting this issue.

mackid1993 commented 11 months ago

Thank you! Can't wait to finally use Virtiofs.

SimonFair commented 11 months ago

Is there a fix for the issue or has the root cause been found?

YanVugenfirer commented 11 months ago

@SimonFair Not yet. Due to the holidays, we have not yet gotten to debugging it.

starlit-rocketship commented 10 months ago

Just came to report my issue with the memory leaks too.

Running a Win11 VM for security cameras that writes about 50 Mbps constantly over virtiofs will chew up my 16 GB of allocated RAM in about 24 hours.

Hope your team had a good holiday period and will look back into this in coming weeks / months for any updates.

mackid1993 commented 10 months ago

It would be great to know whether any progress has been made on this bug.

YanVugenfirer commented 10 months ago

@mackid1993 No progress due to the holiday season

YanVugenfirer commented 10 months ago

@kostyanf14 and I found an issue that caused the memory leak (hopefully the only one). Soon CI will build the driver that can be tested if anyone is interested.

mackid1993 commented 10 months ago

@YanVugenfirer Can you please provide a link to the driver once it's been built? Thank you!

SimonFair commented 10 months ago

Thanks for the update.

mackid1993 commented 10 months ago

I loaded up the new driver from here: https://www.dropbox.com/scl/fo/ssn368eky8yykwwxpuagh/h?rlkey=de69gptabrqi8ihu3x3nakwhu&lst=&dl=0 in test mode and so far so good. I'm going to let some disk activity run overnight but I don't see a leak thus far.
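
For anyone else who wants to try the CI build, loading a test-signed package generally looks something like this; the INF name and certificate file below are placeholders, not necessarily what this package ships:

rem Allow test-signed drivers to load (takes effect after a reboot):
bcdedit /set testsigning on
rem If the build ships a test certificate, import it (file name is a placeholder):
certutil -addstore -f Root virtio-test-cert.cer
certutil -addstore -f TrustedPublisher virtio-test-cert.cer
rem Install the driver from the extracted package (INF name assumed):
pnputil /add-driver viofs.inf /install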

@YanVugenfirer will this fix make it into the next stable release?

mackid1993 commented 10 months ago

I let Backblaze Personal Backup go for a little bit and it seems like the non-paged pool is growing but only when there is virtiofs disk activity. I'm not sure which tag is causing it but here is a screenshot. It seems like there is a much slower leak now.

[screenshot: non-paged pool usage]

YanVugenfirer commented 10 months ago

@YanVugenfirer will this fix make it into the next stable release?

Yes.

kostyanf14 commented 10 months ago

I let Backblaze Personal Backup go for a little bit and it seems like the non-paged pool is growing but only when there is virtiofs disk activity. I'm not sure which tag is causing it but here is a screenshot. It seems like there is a much slower leak now.

I am not sure whether this is a leak or not. During disk activity, VirtioFS allocates a lot of resources, but all of them should be freed when the activity stops. We need to check whether the diff grows between start and stop.
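
One easy way to log that diff over time is the built-in typeperf counter logger; the counter name below is the stock Windows one, and the interval/sample count are arbitrary:

rem Sample non-paged pool bytes every 30 seconds, 120 samples, into a CSV,
rem while starting and stopping the backup:
typeperf "\Memory\Pool Nonpaged Bytes" -si 30 -sc 120 -o nonpaged.csv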

mackid1993 commented 10 months ago

@kostyanf14 It appeared to. When I stopped Backblaze, the non-paged pool stopped growing. When I started it again, it began to grow slowly. It's better, but I think there may still be a smaller leak.

willdrew commented 10 months ago

@YanVugenfirer, thank you for working on this and getting that driver built for testing, very much appreciated!

@mackid1993, all, I also have a Windows VM (Win11) running Backblaze, backing up a few TB of data from over half a million files. During my testing the VM would crash with OOM after about 6 hours w/o a pagefile, and in roughly a day w/ the default pagefile, but the system would eventually become unresponsive just the same. Note, the VM is allocated 32 GB of RAM; I will lower this back to 16 GB and test further later, on another day (maybe tomorrow).

I'm now using the recently built driver above in Windows Test Mode (still w/o a pagefile) and it's been backing up for over 13h now w/o issue. I see the free memory go up and down, with some of it being released from time to time, which appears to correspond roughly with the release of process handles (presumably Backblaze threads, though I am not running a kernel pool monitor at this time). That makes sense to me and builds confidence that this is expected behavior.

I did pause the backup and quit the Backblaze control panel, but it did not free memory. However, that could be due to the service presumably still being loaded and in a paused state ... which I believe is the case, but did not confirm. So, I am not sure if there is a smaller memory leak or not, or if that's just expected behavior from Backblaze with it holding on to some of the memory in a paused state.

I just set the threads from 8 (default) with automatic throttling, to 20, which has already started using more resources ... I am going to let it run overnight to see what happens, keeping fingers crossed; will provide a brief update tomorrow.

willdrew commented 10 months ago

I just set the threads from 8 (default) with automatic throttling, to 20, which has already started using more resources ... I am going to let it run overnight to see what happens, keeping fingers crossed; will provide a brief update tomorrow.

It's still running and memory usage looks good :tada: I'm going to let it run some more today. Afterwards, I'll lower RAM to 16GB later today to see if it runs okay over the weekend.

mackid1993 commented 10 months ago

I just set the threads from 8 (default) with automatic throttling, to 20, which has already started using more resources ... I am going to let it run overnight to see what happens, keeping fingers crossed; will provide a brief update tomorrow.

It's still running and memory usage looks good 🎉 I'm going to let it run some more today. Afterwards, I'll lower RAM to 16GB later today to see if it runs okay over the weekend.

Thanks for this. I'm going to test further.

mackid1993 commented 10 months ago

@willdrew How does your non-paged pool look?

willdrew commented 10 months ago

@mackid1993 [screenshot, 2024-01-12]

mackid1993 commented 10 months ago

I may have to retract my previous statement. This time Backblaze has been going on 100 threads for 2 hours and I'm sitting right here:

[screenshot]

I'm going to let it run through tomorrow and report back but I think we are actually looking pretty good.

willdrew commented 10 months ago

Thanks for the update. That's positive to hear and will keep my fingers crossed. I just bumped my threads to 100 as well, and will let it run for a few hours. I was already maxing out the bandwidth on the lower threads, but more threads should still push things a bit harder when it comes to memory usage; will report back.

mackid1993 commented 10 months ago

I have mine set like this:

[screenshot]

I'm also running 8 VirtioFS mounts right now.

willdrew commented 10 months ago

Also, I ended up installing PoolMonX, so here's a screenshot sorted by NPaged Usage. [screenshot: PoolMonX, 2024-01-12]

willdrew commented 10 months ago

Cool, I've set mine similarly. However, I only have a single VirtioFS mount, which is backed by a 20 TB ZFS RAID1 volume with a few zpools :crossed_fingers:

[screenshot, 2024-01-12]

mackid1993 commented 10 months ago

@willdrew If you're interested in running more mounts I wrote up a couple of batch scripts to simplify that. Here's a link to my post on the Unraid forums: https://forums.unraid.net/topic/129352-virtiofs-support-page/?do=findComment&comment=1301832
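
For reference, a wrapper along those lines can be as simple as the sketch below. This is a generic example rather than the exact script from that post, and it assumes the "virtiofs" class was registered with fsreg.bat as shown earlier in the thread:

@echo off
rem Generic sketch (not the script from the linked post): one launchctl call per share.
set "LAUNCHCTL=C:\Program Files (x86)\WinFsp\bin\launchctl-x64.exe"

call :mount viofsJ Tag1 J:
call :mount viofsL Tag2 L:
call :mount viofsM Tag3 M:
exit /b

:mount
rem %1 = WinFsp instance name, %2 = virtiofs tag, %3 = drive letter
"%LAUNCHCTL%" start virtiofs %1 %2 %3
exit /b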

mackid1993 commented 10 months ago

I can confirm pausing Backblaze and waiting 10-15 minutes causes non-paged pool memory to be released!

SimonFair commented 10 months ago

@YanVugenfirer, thank you for working on this and getting that driver built for testing, very much appreciated! […]

Thanks for your testing.

mackid1993 commented 10 months ago

I'm going to let Backblaze chew through 9TB of data over the long weekend and shoot for 48+ hours of uptime but I think this was it!