maharmstone / btrfs

WinBtrfs - an open-source btrfs driver for Windows
GNU Lesser General Public License v3.0

Memory leak causing massive memory usage. #395

Closed 8465231 closed 10 months ago

8465231 commented 3 years ago

So I was moving some BTRFS formatted drives from my chia plotter to my windows farmer.

I noticed that I was getting out of memory errors even though I have 32gb of ram.

After some investigation I have narrowed it down to the BTRFS formatted drives. I had ~15tb of plots being farmed and it would sit around ~28gb of usage when using BTRFS drives.

I converted a drive to NTFS and swapped it with a few BTRFS drives and my memory usage dropped to a few hundred mb as expected.

I plan on converting some more drives to NTFS but from initial tests it is 100% linked to the BTRFS formatted drives as the same amount of plots on NTFS formatted drives uses hardly any ram at all.

From searching it seems a memory leak was an issue some time back, was it ever fixed?

maharmstone commented 3 years ago

Chia plotter? Windows farmer? I've no idea what you're going on about...

There's no outstanding memory leaks that I'm aware of.

8465231 commented 3 years ago

> Chia plotter? Windows farmer? I've no idea what you're going on about...
>
> There's no outstanding memory leaks that I'm aware of.

lol, basically I have ~20 BTRFS drives connected to my windows system with a total of ~15tb of storage.

These drives sit idle 99% of the time but every now and then they will do some small reads. And during start up a scan of the drive is done to see how many files are there and if they are valid.

During this scan the memory usage goes from ~2gb to 25-28gb with BTRFS drives. After switching them to NTFS, the increase stays around the expected few hundred mb.

Ram usage will keep climbing over time as the small reads are done as well.

The only variance is the drive format between small memory usage and massive memory usage.

I am switching over most of my drives to NTFS and will know for sure tomorrow if the effects scale with more NTFS drives. Just keeping the raid arrays as BTRFS since that is simpler than trying to use windows software raid.

TheMadHau5 commented 3 years ago

> Chia plotter? Windows farmer? I've no idea what you're going on about...

They're likely talking about chiacoin, which uses storage space to farm (mine) new coins instead of cpu/gpu usage. It's heavily storage intensive, especially during the plotting process, but they say they did that on linux and are using windows (with btrfs) only to farm the plots (a.k.a. mine for new coins). Farming is not as intensive, so the high memory usage is unexpected.

oxygen commented 3 years ago

The harvester process (and all the other processes) doesn't keep file handles open to your plot files (checked using Process Explorer). I am clueless as to how Windows or this project works here, but if I were to guess: if there is some kind of cache eviction that is supposed to kick in once a file has no open handles, then it is either not being honored or not being triggered, depending on where the cache is (who is managing the cache).

To confirm this, you could monitor RAM increases as reads happen and match them to the total MBs read from disk by the harvester (you can use Process Explorer for that).
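A minimal sketch of that comparison, assuming you note down a few pairs of readings by hand (cumulative bytes read by the harvester, and RAM in use at the same moment). The function name and sample layout are made up for illustration; nothing here comes from WinBtrfs or Chia.

```python
from typing import Sequence, Tuple

# One reading: (cumulative bytes read from disk, bytes of RAM in use).
# Both numbers are assumed to be copied by hand from a monitoring tool.
Sample = Tuple[int, int]


def memory_growth_per_byte_read(samples: Sequence[Sample]) -> float:
    """Return how many bytes of RAM were gained per byte read from disk.

    Only the first and last readings are compared, so this is a rough
    average over the whole observation window.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    first_read, first_mem = samples[0]
    last_read, last_mem = samples[-1]
    read_delta = last_read - first_read
    if read_delta <= 0:
        raise ValueError("no reads happened between the first and last sample")
    return (last_mem - first_mem) / read_delta
```

A result close to 1.0 would suggest that roughly everything read stays resident in memory, which is what you would expect from a cache that is never trimmed. A result near 0 would point away from the reads themselves.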


To help develop a test which should catch this issue, here's what's happening on the OP's machine:

Many drives, each filled to the top with ~101 GiB files. Many small reads spread randomly across all of these files, with file handles closed immediately after each small read is done. This process runs forever.
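The access pattern described above could be approximated with something like the following. This is only a sketch: the function name, the 4096-byte read size and the read count are assumptions, not values taken from the harvester, and the directory is whatever volume you want to exercise.

```python
import random
from pathlib import Path
from typing import Optional


def random_small_reads(directory: str, reads: int, read_size: int = 4096,
                       seed: Optional[int] = None) -> int:
    """Do `reads` small reads at random offsets in random files.

    Each file is opened, read once, and closed again straight away, so no
    handle stays open between reads. Returns the total bytes read.
    """
    rng = random.Random(seed)
    files = [p for p in Path(directory).iterdir()
             if p.is_file() and p.stat().st_size > 0]
    if not files:
        raise ValueError("no non-empty files found")
    total = 0
    for _ in range(reads):
        path = rng.choice(files)
        size = path.stat().st_size
        # Pick an offset that leaves room for a full read when the file allows it.
        offset = rng.randrange(max(1, size - read_size + 1))
        with open(path, "rb") as handle:  # handle is closed on leaving the block
            handle.seek(offset)
            total += len(handle.read(read_size))
    return total
```

Running this in a loop against a directory of large files on a btrfs volume, while watching memory usage, would show whether the growth depends on the harvester at all.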

To reproduce using the actual software the OP is using:

The OP is NOT hitting this issue during the storage intensive operation of plotting, but during harvesting (endless, few and far between, random small reads in random files).

8465231 commented 3 years ago

Correct on both counts, plotting is done on a separate machine, the windows machine is only being used for farming the plots.

Technically I am using hpool farmer (hence why I have it on a sandboxed machine) which might do a more intense file check when it starts up but that can be mimicked on the official app by running a plot check I think.

During the check the drives' usage goes to 100% with a lot of random reads. It lasts ~30-90 seconds, then it drops to normal farming access levels of tiny reads every little bit.

I am in the process of converting most of my drives to NTFS and should be back up and running later today with the same setup as before, except with mostly NTFS drives, and will know for sure what the results are. Technically I will have a few more TB worth of plots/drives now.

8465231 commented 3 years ago

Ok, just got done converting all my drives over 500gb to NTFS, keeping the small drives in a BTRFS raid as it is much simpler in this case and I have enough memory to handle it right now.

I now have 4 blockchains running side by side (vs 1 before) and more drives / plots, and I am mining with nicehash on a RX 570 as well, but my total memory usage is ~7.5gb vs 28GB for a single blockchain and fewer drives / plots when using BTRFS.

Since the only thing that was changed that would reduce memory usage was the swap from BTRFS to NTFS, it is safe to say the memory leak was somehow connected to BTRFS. Every other change increases memory usage.

I am no dev/programmer so no idea on the technical side of things, just reporting my experiences.

Artofeel commented 3 years ago

I probably have the same issue, but I'm not using any farming software. I have a volume with 2.6 million files, about 700GB in total, compressed with zstd (level 2). I'm trying to archive them with zpaq, and the more files are read, the more memory is used in the Non-Paged Pool.

[screenshot: memory_usage]

I tried using poolmon. Is the tag MHBt related?

[screenshots: poolmon_1, poolmon_2]

guglovich commented 2 years ago

I think I have the same problem. I've been seeding torrents for 6 years, several TBs, 24/7, with no problems. A month ago I switched to BTRFS; at first there were uploads, but not at much speed. For the last few days my internet connection has been constantly saturated at high speed, and the RAM cache is constantly 100% full. I then restarted the system; the uploads immediately became active, and within half an hour the cache was already at 10GB.

guglovich commented 2 years ago

To further check, I ran defragmentation with compression and after 2 minutes the cache was 11GB.

olivatooo commented 1 year ago

I have a similar problem: if my laptop stays idle, btrfs uses almost my entire RAM

rautamiekka commented 1 year ago

It's by design that the OS uses all the RAM for caching cuz it maximizes performance, which is a totally different matter from a memory leak. The OS will surrender the cached area when a process asks for memory.

guglovich commented 1 year ago

That's right, almost. There is an option to disable the file cache, but BTRFS ignores it.

maharmstone commented 10 months ago

Closing old issues