jimsalterjrs / sanoid

These are policy-driven snapshot management and replication tools which use OpenZFS for underlying next-gen storage. (Btrfs support plans are shelved unless and until btrfs becomes reliable.)
http://www.openoid.net/products/
GNU General Public License v3.0
3.14k stars 308 forks source link

Syncoid: Allow hostid as argument #959

Open rayzilt opened 1 month ago

rayzilt commented 1 month ago

When using an ZFS pool in a multihost configuration, the hostname can change based on where the pool is imported. Syncoid creates ephemeral snapshot that contains the hostname from where the command is started. When the pool is failover to another machine, the hostname changes and causes to left over ephemeral snapshots from Syncoid on both sides.

Using --no-sync-snap can be a workaround, but in this situation we like to have ephemeral snapshot of Syncoid.

This PR adds the --hostid argument to specifiy a custom hostname / clustername instead. If not specified it will use the hostname.

jimsalterjrs commented 1 month ago

What does the argument you add in your patch do that the existing --identifier argument does not?

--identifier=

Adds the given identifier to the snapshot and hold name after "syncoid_" prefix 
and before the hostname. This enables the use case of reliable replication to 
multiple targets from the same host.  The following chars are allowed: a-z, A-Z, 
0-9, _, -, : and . .
jimsalterjrs commented 1 month ago

To be clear, using --identifier you will still see the host name in the snapshot name change if, well, the hostname changes. But that won't matter, because the identifier tag is what syncoid will use to identify the sync snapshots, and it will destroy stale ones matching the identifier regardless of what hostname they show.

Personally, I'd consider being able to see which hostname was active during a given sync run as a feature, not a bug.

rayzilt commented 1 month ago

Thank you for looking into this.

I've used the --identifier but noticed that snapshots where still present from the other host after the pool failover. Looking at https://github.com/jimsalterjrs/sanoid/blob/master/syncoid#L1689 it seems that the destroy sub looks at the identifier and hostid.

I'm not syncing to multiple targets, but to two machines that maintain one pool in a multihost configuration.

Maybe I was running into a bug, and responded to soon with this new argument?

rayzilt commented 1 month ago

I'll try to investigate why the --identifier argument is not working on my side, as it should be the same functionality as you pointed out. Thanks :-)

jimsalterjrs commented 1 month ago

I'm not the one who wrote --identifier IIRC, so it's also possible that I misunderstood the way it is supposed to work.

If you can confirm that using --identifier doesn't help with your situation, the next step would be expending some skull sweat figuring out whether it makes more sense to implement your proposed --hostid argument in addition to --identifier, or whether it makes more sense to simply change how --identifier works.

Personally, I'm having some difficulty coming up with a good reason to separate the two functions out as separate arguments. It seems to me as though if you're already examining things to the degree that using --identifier implies, you should be aware enough of what you're doing to recognize that using the same --identifier in places where it might conflict is a problem to avoid.

@phreaker0 do you have any thoughts on this one?