bolthole / zrep

ZREP ZFS based replication and failover script from bolthole.com

Zrep takeover fails after setting master. Reports "bash: line 1: zrep: command not found". #208

Closed davenport1 closed 4 months ago

davenport1 commented 4 months ago

We have zrep managing replication and failover between two Debian machines with ZFS. Syncing, failover, forced failover, and forced takeover all work from both sides. However, zrep takeover fails after setting master on the machine it is run from.

root@samwise:~$ zrep takeover cloudpool/cloud-docs-data
Starting failover from remote side frodo
Setting readonly on local cloudpool/cloud-docs-data, then syncing
sending cloudpool/cloud-docs-data@zrep_00019e to samwise:cloudpool/cloud-docs-data
Reversing master properties for frodo:cloudpool/cloud-docs-data
Setting master on samwise:cloudpool/cloud-docs-data
bash: line 1: zrep: command not found

Primary machine: frodo
Secondary machine: samwise

This leaves both the primary and the secondary in a readonly state, with each machine pointing to the other as the source host. We can fail over normally from both sides.

Any help would be appreciated! Thanks.

davenport1 commented 4 months ago

Here is the status of both machines after the failed takeover, as well as the results of running a regular failover from each machine:

root@samwise:~$ zrep list -v
cloudpool/cloud-docs-data:
readonly        on
zrep:dest-host  samwise
zrep:src-fs     cloudpool/cloud-docs-data
zrep:dest-fs    cloudpool/cloud-docs-data
zrep:savecount  5
zrep:src-host   frodo
last snapshot synced: cloudpool/cloud-docs-data@zrep_00019e

root@frodo:~$ zrep list -v
cloudpool/cloud-docs-data:
readonly        on
zrep:src-fs     cloudpool/cloud-docs-data
zrep:dest-fs    cloudpool/cloud-docs-data
zrep:dest-host  frodo
zrep:src-host   samwise
zrep:savecount  5
last snapshot synced: cloudpool/cloud-docs-data@zrep_00019e
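
Both listings above show readonly set to on, and each host names the other as zrep:src-host. As a minimal sketch, the same check could be scripted by condensing `zrep list -v` style output to one line per filesystem. The function name `summarize_zrep_list` is hypothetical and not part of zrep, and the parsing assumes the output layout shown in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of zrep): read "zrep list -v" style output on
# stdin and print "<filesystem> readonly=<value> src-host=<value>" per block.
# Assumes each block starts with an unindented "<filesystem>:" line, as above.
summarize_zrep_list() {
    awk '
        # A line such as "cloudpool/cloud-docs-data:" opens a new block
        /^[^ \t].*:$/ { flush(); fs = substr($0, 1, length($0) - 1); ro = "?"; src = "?"; next }
        $1 == "readonly"      { ro = $2 }
        $1 == "zrep:src-host" { src = $2 }
        END { flush() }
        # Print the block collected so far, if any
        function flush() {
            if (fs != "") print fs, "readonly=" ro, "src-host=" src
            fs = ""
        }
    '
}

# Usage, run on each machine and compare the two lines:
#   zrep list -v | summarize_zrep_list
```

If both machines report readonly=on and each names the other as src-host, neither side currently considers itself the master, which matches the state described here.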

Failover:

root@samwise:~$ zrep failover cloudpool/cloud-docs-data
Setting readonly on local cloudpool/cloud-docs-data, then syncing
sending cloudpool/cloud-docs-data@zrep_0001a0 to frodo:cloudpool/cloud-docs-data
Reversing master properties for samwise:cloudpool/cloud-docs-data
Setting master on frodo:cloudpool/cloud-docs-data
Setting master properties for frodo:cloudpool/cloud-docs-data

root@frodo:~$ zrep failover cloudpool/cloud-docs-data
Setting readonly on local cloudpool/cloud-docs-data, then syncing
sending cloudpool/cloud-docs-data@zrep_00019f to samwise:cloudpool/cloud-docs-data
Reversing master properties for frodo:cloudpool/cloud-docs-data
Setting master on samwise:cloudpool/cloud-docs-data
Setting master properties for samwise:cloudpool/cloud-docs-data

ppbrown commented 4 months ago

The complaint about "bash: line 1: zrep: command not found" suggests that you have installed zrep in a non-standard location.

You need to either

a) add the location to system path
b) add symlink /usr/bin/zrep -> actual location
c) set the ZREP_PATH variable (going from memory.. you may have to poke through the script yourself for exact name)
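
To see which of these options applies, it can help to check whether the peer's shell can find zrep at all when a command is run over ssh. The following is a minimal sketch, assuming the failing call is a command executed on the other host through ssh (the "bash: line 1:" prefix suggests this, but the thread does not confirm it). The function name `resolve_in_path` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the first executable named $1 found in the
# colon-separated PATH string $2; return non-zero if none is found.
resolve_in_path() {
    cmd=$1
    old_ifs=$IFS
    IFS=:
    for dir in $2; do
        # Skip directories that happen to share the command name
        if [ -x "$dir/$cmd" ] && [ ! -d "$dir/$cmd" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$cmd"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Usage on the host being tested, e.g. invoked through ssh so that the PATH
# examined is the one a remote command actually sees:
#   resolve_in_path zrep "$PATH" || echo "zrep is not on this PATH"
```

If the function finds nothing under the PATH that ssh commands receive, any of the three options above should address it.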

davenport1 commented 4 months ago

Thank you for the help!

I ended up moving zrep to the expected location /usr/bin/zrep and setting the ZREP_PATH variable to /usr/bin/zrep/zrep on both machines, which seems to have solved the issue. What's strange is that I previously had PATH including /usr/lib/zrep and ZREP_PATH set to /usr/lib/zrep/zrep on both machines, yet only having zrep in the bin directory rather than lib solved the issue.
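
The thread does not establish why the earlier PATH setting had no effect. One possibility, stated here purely as an assumption, is that PATH was extended in a startup file that only login or interactive shells read, while the command run over ssh used a non-interactive shell. A minimal sketch for comparing the two PATH values follows; the function name `path_only_in` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print each directory that appears in the PATH string $1
# but not in the PATH string $2, one per line.
path_only_in() {
    old_ifs=$IFS
    IFS=:
    for dir in $1; do
        case ":$2:" in
            *":$dir:"*) ;;                  # present in both, nothing to report
            *) printf '%s\n' "$dir" ;;      # only in the first PATH
        esac
    done
    IFS=$old_ifs
}

# Usage (host name taken from this thread; both commands run from the peer):
#   login_path=$(ssh frodo 'bash -lc "echo \$PATH"')
#   plain_path=$(ssh frodo 'echo "$PATH"')
#   path_only_in "$login_path" "$plain_path"
```

Any directory this prints is visible to a login shell but not to a plain ssh command, which would be consistent with the symptom reported here.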