jimsalterjrs / sanoid

These are policy-driven snapshot management and replication tools which use OpenZFS for underlying next-gen storage. (Btrfs support plans are shelved unless and until btrfs becomes reliable.)
http://www.openoid.net/products/
GNU General Public License v3.0

Use of uninitialized value $checkmutex #790

Open devZer0 opened 1 year ago

devZer0 commented 1 year ago

With sanoid version 2.1.0, I get the following warnings on manual invocation:


# sanoid --prune-snapshots --verbose
Use of uninitialized value $checkmutex in scalar chomp at /usr/sbin/sanoid line 1464.
Use of uninitialized value $checkmutex in string eq at /usr/sbin/sanoid line 1466.
INFO: cache expired - updating from zfs list.

phreaker0 commented 1 year ago

This may already be fixed in master. Can you try the latest commit?

devZer0 commented 1 year ago

It does not seem to be fixed there:


root@pve-bigiron:~/sanoid# git checkout master
Already on 'master'
Your branch is up to date with 'origin/master'.
root@pve-bigiron:~/sanoid# ./sanoid --prune-snapshots --verbose
INFO: pruning snapshots...
Use of uninitialized value $checkmutex in scalar chomp at ./sanoid line 1464.
Use of uninitialized value $checkmutex in string eq at ./sanoid line 1466.
INFO: pruning replipool/pve-cluster3/pve-pc5-ssdpool2-vms-files-zstd@autosnap_2022-12-27_22:00:00_hourly ...
INFO: pruning replipool/pve-cluster3/pve-pc5-ssdpool2-vms-files-zstd@autosnap_2022-12-27_23:00:00_hourly ...

phreaker0 commented 1 year ago

I can't reproduce it here. Can you post the contents of all files in /var/run/sanoid?

devZer0 commented 1 year ago

root@pve-bigiron:/var/run/sanoid# ls -la
total 8
drwxr-xr-x  2 root root   80 Jan 10 20:18 .
drwxr-xr-x 34 root root 1440 Jan 10 20:16 ..
-rw-r--r--  1 root root   58 Jan 10 20:18 sanoid_cacheupdate.lock
-rw-r--r--  1 root root   59 Jan 10 18:04 sanoid_pruning.lock

root@pve-bigiron:/var/run/sanoid# cat sanoid_cacheupdate.lock
913700
/usr/bin/perl ./sanoid --prune-snapshots --verbose

root@pve-bigiron:/var/run/sanoid# cat sanoid_pruning.lock
2969049
/usr/bin/perl ./sanoid --prune-snapshots --verbose

perl 5.32.1-4+deb11u2

phreaker0 commented 1 year ago

It seems the cache update lock wasn't removed after a sanoid run. Please post the output of

ps -p 913700 -o args=

and then remove both lockfiles and run sanoid twice to check if the error is gone.
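
The same check can be run over every lockfile at once. The following is only a minimal sketch, not part of sanoid: the function names `lock_status` and `check_locks` are made up for illustration, it assumes the first field of each lockfile is the PID (as in the files posted above), and it assumes a `ps` that accepts `-p` and `-o args=` like the command above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of sanoid) for spotting leftover lockfiles.

# Print one of: missing-or-empty, invalid, active, stale
lock_status() {
    lockfile=$1
    # -s is true only if the file exists and is not empty
    [ -s "$lockfile" ] || { echo "missing-or-empty"; return 0; }

    # Assumption: the first whitespace-separated field is the PID.
    pid=
    read -r pid rest < "$lockfile" || true
    case $pid in
        ''|*[!0-9]*) echo "invalid"; return 0 ;;
    esac

    # ps exits non-zero (and prints nothing) when no such process exists.
    if ps -p "$pid" -o args= >/dev/null 2>&1; then
        echo "active"
    else
        echo "stale"
    fi
}

# Report the status of every *.lock file in a run directory.
# Defaults to the directory mentioned in this thread.
check_locks() {
    run_dir=${1:-/var/run/sanoid}
    for f in "$run_dir"/*.lock; do
        [ -e "$f" ] || continue
        printf '%s: %s\n' "$f" "$(lock_status "$f")"
    done
}
```

Running `check_locks /var/run/sanoid` prints one line per lockfile. A file reported as `stale` names a PID that no longer exists, which is the situation in which `ps` returns nothing.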

devZer0 commented 1 year ago

there is no process 913700

Indeed, the message seems to be caused by the leftover file. After removing the contents of /var/run/sanoid, the warning is gone.

Still, when lockfiles are left over from a previous sanoid run, the message is a bit odd and misleading. Maybe there is room for improvement?

alberanid commented 4 months ago

I'm experiencing the same problem:

Use of uninitialized value $checkmutex in scalar chomp at /usr/sbin/sanoid line 1464.
Use of uninitialized value $checkmutex in string eq at /usr/sbin/sanoid line 1466.
ERROR: No valid lockfile found - Did a rogue process or user update or delete it?

My configuration is quite simple; the template is:

[template_production]
frequently = 0
hourly = 0
daily = 1
monthly = 4
yearly = 3
autosnap = yes
autoprune = yes

and is used by just 8 datasets, without much data in them.

The crontab is a simple:

* * * * * TZ=UTC /usr/sbin/sanoid --cron > /dev/null

I have no files in /var/run/sanoid/ or /run/sanoid/.

Versions: (Getopt::Long::GetOptions version 2.52; Perl version 5.36.0), running on Proxmox 8.2.2, zfs 2.2.3.

Does anyone know how I can debug the issue?

Thanks!

alberanid commented 4 months ago

I think I have found the cause of this problem.

I added an "every minute" cron job without noticing that on Proxmox (and maybe also on Debian / Ubuntu) there is already a periodic job that runs every 15 minutes through systemd or cron.

@devZer0 can you please confirm that you also have multiple cron jobs in your case?

The cron job is in /etc/cron.d/, or you can check the timer with systemctl status sanoid.timer
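
To look for duplicate schedules in one go, something like the following might help. It is a rough sketch under assumptions: the function names are hypothetical, `/etc/crontab` is included only because it is the conventional system crontab, and the search is a plain text match on "sanoid", so it can miss jobs that call sanoid through a wrapper script.

```shell
#!/bin/sh
# Hypothetical helpers (not part of sanoid) for finding every place
# that might schedule sanoid.

# Print non-comment lines mentioning sanoid in the given files/directories.
# With no arguments, search the conventional system cron locations.
sanoid_cron_entries() {
    [ "$#" -gt 0 ] || set -- /etc/cron.d /etc/crontab
    # -r: recurse into directories, -H: always print the file name,
    # -s: stay quiet about missing or unreadable paths.
    # The second grep drops lines whose content starts with '#'.
    grep -rHs -- 'sanoid' "$@" | grep -v '^[^:]*:[[:space:]]*#'
    return 0
}

# Print cron fragments, the current user's crontab, and systemd timers.
report_sanoid_schedules() {
    echo "== cron fragments =="
    sanoid_cron_entries
    echo "== crontab of $(id -un) =="
    crontab -l 2>/dev/null | grep -v '^[[:space:]]*#' | grep -- 'sanoid'
    echo "== systemd timers =="
    if command -v systemctl >/dev/null 2>&1; then
        systemctl list-timers --all --no-pager 2>/dev/null | grep -- 'sanoid'
    fi
    return 0
}
```

Run `report_sanoid_schedules` as root. If sanoid shows up under more than one heading, two schedulers are probably starting it independently.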