markusressel / zfs-inplace-rebalancing

Simple bash script to rebalance pool data between all mirrors when adding vdevs to a pool.
Creative Commons Zero v1.0 Universal

Handle Hardlinks #42

Open nickdemise opened 3 months ago

nickdemise commented 3 months ago

Hi all,

I would like to discuss, and expand my knowledge, about potentially enhancing the script to handle hardlinks, rather than simply ignoring them.

In #22, johnpyp mentions that "(data) can't be trivially un-hardlinked after without knowledge of which path is in the "balance target" vdev." I don't quite understand this; could you please expand? Let's say we don't know the path of the balance target vdev and pick a path at random: in the end, would it not average out?

Let's say I have 2 vdevs that are 60% populated with files that each have one or more hardlinks, and data usage is balanced 50-50 between them. I now add a new, empty, equal-size vdev to the pool and wish to rebalance.

We look at file_1, which has two hardlinks (50%-50%-0). Could I:

1. Copy file_1 --> file_1_tmp
2. Delete the two hardlinks and file_1
3. Create two new hardlinks to file_1_tmp
4. Rename file_1_tmp back to file_1

What does the end data result look like? (15%-15%-70% or so?)
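The steps above could be sketched roughly as follows. This is only an illustration, not code from the script in this repository: the function name `relink_copy` and the `.balance_tmp` suffix are made up for the example, and it assumes GNU-style `find`, `cp` and `ln`, that every hardlink of the file lives under one search root on the same dataset, and that nothing else touches the files while it runs. Instead of deleting the old links first, it overwrites each old path with a link to the fresh copy, so no path is ever missing for long.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: rewrite one file and re-create all of its hardlinks.
# $1 = any one path of the hardlinked file
# $2 = directory under which ALL of its hardlinks are assumed to live
relink_copy() {
    local file="$1" search_root="$2"
    local tmp="${file}.balance_tmp"   # hypothetical temp suffix
    local -a links=()
    local link

    # Collect every path that shares the inode of "$file" (includes "$file").
    while IFS= read -r -d '' link; do
        links+=("$link")
    done < <(find "$search_root" -xdev -samefile "$file" -print0)

    # Write a fresh copy so the data lands in newly allocated blocks.
    # Assumption: on systems where cp may clone blocks instead of rewriting
    # them, GNU cp's --reflink=never would be needed here.
    cp -a -- "$file" "$tmp"

    # Point every old path at the new inode, then drop the temporary name.
    for link in "${links[@]}"; do
        ln -f -- "$tmp" "$link"
    done
    rm -- "$tmp"
}

# Example call (paths are placeholders):
#   relink_copy /tank/media/file_1 /tank
```

After the call, all former paths refer to one new inode whose data was written after the new vdev was added; how the blocks end up distributed is decided by the pool's allocator, not by this sketch.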

I understand I could be quite naive/crude in this approach, but I wish to understand and hope to resolve this, as I'm sure a lot of people with similar *arr media setups would appreciate such a feature.

Thanks, NickyD

Xaelias commented 1 month ago

I'm curious too, because that sounds like a solvable problem. I'm also a bit confused by "without knowledge of which path is in the 'balance target' vdev."

We know which path is which. My understanding is that this script keeps track of which files have been processed, and we have just created a copy of the file.

nickdemise commented 3 weeks ago

Did anyone have any thoughts/comments on whether this would be doable?

hbilbo commented 3 weeks ago

See #46

nickdemise commented 3 weeks ago

See #46

Cheers mate, I'll go through it in more detail, but just quickly: does this only work for files with precisely 2 hardlinks?

hbilbo commented 2 weeks ago

Yes, it specifically checks whether the file has 2 hardlinks and only processes those. I don't have a use case for anything more than that, but you could pretty easily modify the script to handle files with 3 hardlinks if that were a legitimate use case.
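For illustration, a link-count check of this kind might look like the sketch below. It is not the code from #46: the helper names `link_count` and `has_links` are hypothetical, and it assumes a `stat` that supports `-c '%h'` (as GNU coreutils does). Passing the wanted count as a parameter is one way the same check could cover 3 or more hardlinks.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the number of hardlinks of a path.
link_count() {
    stat -c '%h' -- "$1"
}

# Hypothetical filter: succeed only for a regular file with exactly
# the wanted number of hardlinks.
# $1 = path, $2 = wanted link count
has_links() {
    local file="$1" wanted="$2"
    [[ -f "$file" ]] && [[ "$(link_count "$file")" -eq "$wanted" ]]
}

# Example use (path is a placeholder):
#   if has_links /tank/media/file_1 2; then echo "process"; else echo "skip"; fi
```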