Closed mpicjohn closed 5 years ago
What is the shell output if you run the following?

`/opt/znapzend/bin/znapzend -d --runonce=ssd_pool`
Just recreated the configuration for the pool with znapzendzetup. It seems to work now: snapshots are created and sent to the remote server. Perhaps I had messed around with the recursive setting of some datasets in the past to fix a similar hang. Now I need to wait for the run to complete (the two pools hold some 190 TB).
BTW, what is the exact meaning of "recursive"? Does it only affect the creation of the config, or the actual processing of the snaps?
It means that all "sub" ZFS datasets below the dataset carrying the znapzend plan are snapshotted as well, with the same plan.
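In ZFS terms this maps onto recursive snapshot creation; a minimal sketch (the dataset name is taken from this thread, the timestamp is illustrative):

```shell
# A recursive snapshot creates a snapshot of the named dataset and of
# every descendant dataset, all with the same name, in one atomic step.
zfs snapshot -r ssd_pool@2018-11-01-080000

# Inspect what was created across the whole subtree.
zfs list -t snapshot -r ssd_pool
```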
... regardless of specific settings in one of the "subs", right?
OK, after deleting the configurations for each sub-zfs individually and recreating the configuration recursively from the root zfs, everything is back to normal. Sorry for the noise...
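For anyone landing here later, the cleanup described above can be sketched with znapzendzetup roughly as follows. The sub-dataset name is the one appearing in the logs in this thread, and the exact `create` argument syntax depends on the installed znapzend version, so treat this as an unverified sketch rather than copy-paste-ready commands:

```shell
# Remove the per-dataset plan(s) that shadowed the recursive one;
# repeat for every sub-zfs that had its own configuration.
znapzendzetup delete ssd_pool/reserved_space

# Recreate a single recursive plan on the pool root, mirroring the
# configuration listed later in this thread.
znapzendzetup create --recursive --mbuffer=/usr/bin/mbuffer --mbuffersize=4G \
    SRC '1day=>1hour,14days=>1day' ssd_pool \
    DST:1 '3weeks=>1day,3months=>1week' \
    root@zfs-backup-1-binf:data_pool/zfs-mirror/cluster-filer/ssd_pool
```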
I have a file server here with two pools (ssd_pool and fast_pool). Suddenly znapzend stopped creating snapshots (locally) for one of the pools (the other one works as expected). Znapzend runs as a service on OpenIndiana.
Starting command line:
`/opt/znapzend/bin/znapzend --daemonize --pidfile=/dev/null --autoCreation --connectTimeout=200 --logto=/var/log/znapzend.log --loglevel=debug --features=recvu`
Znapzend configuration for affected pool:
```
/opt/znapzend/bin/znapzendzetup list ssd_pool

backup plan: ssd_pool
dst_1         = root@zfs-backup-1-binf:data_pool/zfs-mirror/cluster-filer/ssd_pool
dst_1_plan    = 3weeks=>1day,3months=>1week
enabled       = on
mbuffer       = /usr/bin/mbuffer
mbuffer_size  = 4G
post_znap_cmd = off
pre_znap_cmd  = off
recursive     = on
src           = ssd_pool
src_plan      = 1day=>1hour,14days=>1day
tsformat      = %Y-%m-%d-%H%M%S
zend_delay    = 0
```
Relevant log entries:
```
[Thu Nov 1 08:00:00 2018] [debug] snapshot worker for slow_pool spawned (21495)
[Thu Nov 1 08:00:00 2018] [info] creating recursive snapshot on slow_pool
[Thu Nov 1 08:00:00 2018] [debug] snapshot worker for ssd_pool/reserved_space spawned (21497)
[Thu Nov 1 08:00:00 2018] [info] creating snapshot on ssd_pool/reserved_space
[Thu Nov 1 08:00:00 2018] [debug] snapshot worker for ssd_pool spawned (21499)
[Thu Nov 1 08:00:00 2018] [info] creating recursive snapshot on ssd_pool
[Thu Nov 1 08:00:00 2018] [debug] snapshot worker for ssd_pool/reserved_space done (21497)
[Thu Nov 1 08:00:00 2018] [debug] send/receive worker for ssd_pool/reserved_space spawned (21501)
[Thu Nov 1 08:00:00 2018] [info] starting work on backupSet ssd_pool/reserved_space
[Thu Nov 1 08:00:00 2018] [debug] sending snapshots from ssd_pool/reserved_space to root@zfs-backup-1-binf:data_pool/zfs-mirror/cluster-filer/ssd_pool
[Thu Nov 1 08:00:00 2018] [warn] ERROR: snapshot(s) exist on destination, but no common found on source and destination
clean up destination root@zfs-backup-1-binf:data_pool/zfs-mirror/cluster-filer/ssd_pool (i.e. destroy existing snapshots)
[Thu Nov 1 08:00:00 2018] [warn] ERROR: suspending cleanup source dataset because at least one send task failed
[Thu Nov 1 08:00:00 2018] [info] done with backupset ssd_pool/reserved_space in 0 seconds
[Thu Nov 1 08:00:00 2018] [debug] send/receive worker for ssd_pool/reserved_space done (21501)
[Thu Nov 1 08:00:00 2018] [warn] taking snapshot on ssd_pool failed: ERROR: cannot create snapshot ssd_pool@2018-11-01-080000
[Thu Nov 1 08:00:00 2018] [debug] snapshot worker for ssd_pool done (21499)
[Thu Nov 1 08:00:00 2018] [debug] send/receive worker for ssd_pool spawned (21505)
[Thu Nov 1 08:00:00 2018] [info] starting work on backupSet ssd_pool
```
Obviously the line `ERROR: cannot create snapshot ssd_pool@2018-11-01-080000` shows the problem.
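A plausible reading of the log, consistent with the fix reported earlier in this thread: a leftover per-dataset plan on ssd_pool/reserved_space took its snapshot first (worker 21497), and because ZFS recursive snapshots are atomic, the subsequent recursive snapshot of ssd_pool then failed when it tried to create a child snapshot name that already existed. A sketch of the race (names and timestamp from this thread; this is an interpretation, not a verified trace):

```shell
# Per-dataset plan fires first and snapshots the child on its own.
zfs snapshot ssd_pool/reserved_space@2018-11-01-080000

# The recursive plan on the pool root then tries to snapshot the whole
# subtree atomically with the same name; this fails because the child
# snapshot name is already taken.
zfs snapshot -r ssd_pool@2018-11-01-080000
```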
When running manually (--runonce) I get the following in the logfile:
```
taking snapshot on ssd_pool failed: ERROR: cannot create snapshot ssd_pool@2018-11-01-091559
[Thu Nov 1 09:16:00 2018] [debug] snapshot worker for ssd_pool done (12246)
```
and at the command line:
```
cannot open 'ssd_pool@2018-11-01-091559': dataset does not exist
```
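When chasing a "no common snapshot found" error like the one in the log above, comparing the snapshot lists on both ends usually exposes the mismatch. A sketch using the host and dataset names from this thread:

```shell
# Snapshots on the source pool, oldest first.
zfs list -t snapshot -o name,creation -s creation -r ssd_pool

# Snapshots on the destination, queried over ssh.
ssh root@zfs-backup-1-binf \
    zfs list -t snapshot -o name,creation -s creation -r \
    data_pool/zfs-mirror/cluster-filer/ssd_pool
```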
Something is terribly screwed up, resulting in a broken backup...
Any help is really appreciated.
thx
Carsten