trapexit / mergerfs

a featureful union filesystem
http://spawn.link

Need a bit of clarification: how does mergerfs handle large files when no single disk has enough space left on its own, but the combined mergerfs location does? #937

Closed jkoenig72 closed 3 years ago

jkoenig72 commented 3 years ago

Hi,

Assume the following: 3 disks, all mostly full, each with 50 GB of space left.

Using mergerfs to create one share, it shows 150 GB free. Now when I put one very large file on it, e.g. 110 GB, will it work? If so, how? Will the file be split up under some random name, and what will that look like if I check the individual disks?

Thanks! Joerg

jkoenig72 commented 3 years ago

Hmmm... I did a quick test, and it seems this does not work: putting a file on a mergerfs location needs at least one drive where the complete file fits.

It would be very good if this worked. My use case: think about all those Chia miners (using proof of space). Their plot files are around 103 GB in size. Normally we use all the disks, in /mnt/disks/2TB/disk1 ... /mnt/disks/2TB/diskxyz ... /mnt/disks/6TB/disk1 ... etc.

jkoenig72 commented 3 years ago

In general, those large 103 GB files leave some space unused on each disk (e.g. 17 GB on each 2 TB disk). It would be perfect if something like mergerfs let you really use the disks up to 100% by splitting the large file into chunks and placing them on different disks.

But I guess that's not easy to do... ;-)

Does anybody know a way to address this issue?
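The arithmetic behind this complaint can be sketched as follows. The 103 GB plot size and the 17 GB leftover come from the comment above; the usable capacity of 1871 GB is a hypothetical figure chosen only so that the numbers match, and the disk counts are made up for illustration.

```python
PLOT_GB = 103
USABLE_GB = 1871  # hypothetical usable capacity of a "2 TB" disk


def leftover(usable_gb, plot_gb):
    """Return (whole plots that fit, space left over) for one disk."""
    plots = usable_gb // plot_gb
    return plots, usable_gb - plots * plot_gb


def extra_plots_if_split(leftovers_gb, plot_gb):
    """Extra plots that would fit if leftovers could be combined."""
    return sum(leftovers_gb) // plot_gb


print(leftover(USABLE_GB, PLOT_GB))             # (18, 17)
print(extra_plots_if_split([17] * 7, PLOT_GB))  # 1
print(extra_plots_if_split([17] * 6, PLOT_GB))  # 0
```

Under these assumptions it takes seven such disks before the combined leftovers add up to a single extra plot.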

ryecoaaron commented 3 years ago

mergerfs does not split files.
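A minimal sketch of why the 110 GB file from the question does not fit, assuming (as the thread describes) that the share reports the sum of the free space while each file is written whole to one drive. The function names are made up for illustration and are not part of mergerfs.

```python
def pooled_free(free_per_branch):
    """Free space the share shows in the question: the sum over all drives."""
    return sum(free_per_branch)


def can_store(file_size, free_per_branch):
    """A file is never split, so one single drive must be able to hold it."""
    return any(free >= file_size for free in free_per_branch)


free_gb = [50, 50, 50]          # the three disks from the question
print(pooled_free(free_gb))     # 150
print(can_store(110, free_gb))  # False
print(can_store(45, free_gb))   # True
```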

jkoenig72 commented 3 years ago

Understood. Do you know of anything that does that?

ryecoaaron commented 3 years ago

raid

jkoenig72 commented 3 years ago

;) True, but the issue with this special use case is that it's expensive. If you want max storage for, say, a movie collection, raidz2 or raidz3 with e.g. ZFS is fine. But for those plot files you do not really care: you want max space, and if a disk fails, the plots on it are gone, but only on that disk. RAID is too expensive in this use case because it needs too much space, and simple JBOD is too dangerous: one gone, all gone.

Ideal would be some tool that takes all the remaining free disk space on the selected drives and creates a new share with no parity; if a file is too large, it splits it up using some extension on the disk, but on the share it’s the original file.

But maybe there is nothing like that…

ryecoaaron commented 3 years ago

I don't know of anything that will transparently split files and can handle a drive failure that doesn't use a parity drive(s).

jkoenig72 commented 3 years ago

ramdisk… maybe, I will have a look. But thanks!

trapexit commented 3 years ago

mergerfs is specifically intended not to do that kind of thing. The point of the project is to create a union that is optional, so that if a drive is removed or fails, it has minimal impact on the rest of the system. Yes, if mergerfs striped files you'd only lose a couple of extra files when a drive failed, but the work required to do that is far more than the return on investment. Huge files on smaller disks, or large gaps left on drives because only large files are stored, are not typical.

trapexit commented 3 years ago

For future reference... the docs are very thorough. If you don't see something it almost certainly doesn't exist.

trapexit commented 3 years ago

Ideal would be some tool that takes all the remaining free disk space on the selected drives and creates a new share with no parity; if a file is too large, it splits it up using some extension on the disk, but on the share it’s the original file.

You can. Create a file, use LVM to make a volume from it, do that on all the drives, combine them as raid0 or a concatenated volume, format, mount.
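One possible reading of those steps, written as a dry-run planner that only builds the command lines and executes nothing. Several things here are assumptions rather than anything stated in the thread: that each backing file is attached through a loop device, that the combined volume is a linear (concatenated) LV rather than raid0, that the filesystem is ext4, and every path, device, and volume name.

```python
import shlex


def plan_pool(backing_files, loop_devices, size,
              vg="leftover_vg", lv="leftover_lv", mountpoint="/mnt/leftover"):
    """Build (not run) the commands to pool one file per drive via LVM."""
    if len(backing_files) != len(loop_devices):
        raise ValueError("need exactly one loop device per backing file")
    cmds = []
    for path, dev in zip(backing_files, loop_devices):
        cmds.append(["truncate", "-s", size, path])  # create the file
        cmds.append(["losetup", dev, path])          # expose it as a block device
        cmds.append(["pvcreate", dev])               # make it an LVM physical volume
    cmds.append(["vgcreate", vg, *loop_devices])     # group all the drives' files
    cmds.append(["lvcreate", "-l", "100%FREE", "-n", lv, vg])  # linear LV by default
    cmds.append(["mkfs.ext4", f"/dev/{vg}/{lv}"])    # format
    cmds.append(["mount", f"/dev/{vg}/{lv}", mountpoint])  # mount
    return cmds


# Hypothetical paths and loop devices, for illustration only.
plan = plan_pool(["/mnt/disk1/pool.img", "/mnt/disk2/pool.img"],
                 ["/dev/loop10", "/dev/loop11"], "17G")
for cmd in plan:
    print(shlex.join(cmd))
```

Note that a volume built this way spans all the drives, so losing one drive would probably take the whole volume with it rather than only the files on that drive.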

wiryonolau commented 5 months ago

Hi, to clarify: if this happens, is it better to check the free space on each disk individually with other tools before copying (rsync)?
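One way to do the per-disk check this question describes is sketched below, using Python's standard `shutil.disk_usage`. The helper name and the example paths are hypothetical; the idea is to look at the underlying disks directly rather than at the combined mount.

```python
import shutil


def branches_with_room(branches, needed_bytes):
    """Return the paths whose own filesystem has at least needed_bytes free."""
    return [b for b in branches
            if shutil.disk_usage(b).free >= needed_bytes]


# Example call with hypothetical mount points (uncomment on a real system):
# candidates = branches_with_room(
#     ["/mnt/disks/2TB/disk1", "/mnt/disks/6TB/disk1"],
#     103 * 1000**3,  # one 103 GB plot file
# )
# print(candidates or "no single disk can hold the file")
```

An empty result means no single disk can hold the file, which matches the failed test earlier in the thread.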