Open bsper2 opened 2 years ago
A simple solution might be to replace the "tar pipe" with "recursive rsync" at https://github.com/ncsa/xcat-tools/blob/e6afaca43a3d1e61fd9405ac4711d0c43db74134/cron_scripts/backup-node_configs.sh#L61
The resulting solution would likely involve 3 steps:
As an added benefit, this would remove the need for the "isstale" function as well as the "REFRESH*" and "MIN_BKUP_DATE" variables and checks. The resulting code should be shorter and cleaner ... thus easier to maintain. This is a win!
FORCE is probably still a useful cmdline option, but its logic would change (maybe just delete the local copy, or maybe add a flag to rsync to "skip all checks and copy files"; the latter is safer, since a backup isn't lost if the rsync copy fails).
SVCPLAN-2056 has been set up to track this.
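A rough sketch of what the rsync approach could look like, including the FORCE case. This is not the code in backup-node_configs.sh: the function names, the file-list argument, and the destination layout are all hypothetical. The rsync options themselves are real: --ignore-times tells rsync not to skip files whose size and mtime already match, and --recursive is given explicitly because --archive does not imply it when --files-from is used.

```shell
# Sketch only: rsync_opts and backup_node are hypothetical names,
# not functions from backup-node_configs.sh.

# Print the rsync options for a run.
# $1: "1" to force a full copy, anything else for a normal incremental run.
rsync_opts() {
    opts="--archive --recursive"
    if [ "${1:-0}" = "1" ]; then
        # Copy every file even if size and mtime match the backup copy.
        opts="$opts --ignore-times"
    fi
    printf '%s\n' "$opts"
}

# Copy the paths listed in a local file from a node into DEST/NODE/.
# $1: node name, $2: local file listing paths (one per line, relative to /),
# $3: backup root directory, $4: force flag (optional, default 0)
backup_node() {
    node="$1"; file_list="$2"; dest="$3"; force="${4:-0}"
    mkdir -p "$dest/$node" || return 1
    # Word splitting of the option string is intentional here.
    # --files-from preserves the listed paths under the destination.
    rsync $(rsync_opts "$force") --files-from="$file_list" "$node:/" "$dest/$node/"
}
```

With this shape, a forced run never deletes the existing backup first, so a failed transfer leaves the previous copy in place.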
Ran into an issue in SECURITY-1380 where stateless nodes lost their keytabs.
Currently backup-node_configs.sh backs up files every x days (7 by default). So if a file is modified (as keytabs are when they refresh on the 1st of the month), it can take up to 7 days before that file is backed up. This is a problem for stateless nodes, especially ones that reboot weekly.
As a workaround I set REFRESH_DAYS=1 on ngale. This isn't too bad on ngale since there aren't too many nodes, but having it copy all files for all nodes every day might be a bit heavy-handed and not ideal, especially for larger clusters.
Edit: the above doesn't actually fix this. The logic that decides whether to back up a file is based on the modify time of the backed-up copy ON the xcat node; it does not look at the modify time of the file as it is on the node being backed up:
The best way to force the backup of files every time is to set REFRESH_DAYS=0 or just add the -f flag when executing the script.
Just thinking of ideas, but maybe we set up is_stale so that for each file it compares the backup copy to the file on the node, and then does the backup only for files that are different. That way a daily run of backup-node_configs.sh would grab more up-to-date files, and since it only copies files that changed, it's not copying files needlessly.
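One way the "compare the backup copy to the file on the node" idea could be sketched is with checksums. Everything here is an assumption for illustration: the helper names are made up, the remote check assumes passwordless ssh and that sha256sum exists on the node, and paths are assumed to contain no whitespace.

```shell
# Sketch only: local_sum, remote_sum and needs_backup are hypothetical helpers.

# Print the SHA-256 checksum of a local file.
local_sum() {
    sha256sum "$1" | cut -d' ' -f1
}

# Print the SHA-256 checksum of a file on a node.
# $1: node name, $2: absolute path on that node
remote_sum() {
    ssh "$1" sha256sum "$2" | cut -d' ' -f1
}

# Succeed (return 0) when the file should be backed up.
# $1: checksum of the file as it is on the node
# $2: path of the backup copy on the xcat node
needs_backup() {
    # No backup copy yet: always back up.
    [ -f "$2" ] || return 0
    # Back up only when the contents differ.
    [ "$1" != "$(local_sum "$2")" ]
}
```

A caller would compute the node-side checksum with remote_sum and pass it to needs_backup along with the path of the existing backup copy, then copy the file only when needs_backup succeeds.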
Also might adjust the times that the keytabs get updated and when the backup runs. Right now the keytabs update a little after 8AM:
1 8 1 * * sleep $((RANDOM \% 15))m && k5srvutil change
and backup-node_configs.sh runs at midnight daily. So adjusting the timings so that a backup happens not too long after the keytab update would be good too.
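For illustration only, the timing could look something like the crontab fragment below. The keytab entry fires at 08:01 on the 1st and sleeps 0 to 14 minutes (the % is escaped because cron treats an unescaped % as a newline), so the rotation should have started by about 08:15. The 08:30 time and the script path are assumptions, not the real deployment; -f is the force flag mentioned above.

```
# On each node: rotate the keytab shortly after 08:00 on the 1st of the month.
1 8 1 * * sleep $((RANDOM \% 15))m && k5srvutil change

# On the xcat node (hypothetical path): forced backup once the rotation window has passed.
30 8 1 * * /path/to/backup-node_configs.sh -f
```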