adabru closed this issue 4 years ago
Thank you for submitting your idea. Such a cleanup could just as well be implemented as a separate script or Docker container. Would this be an option for you? I am hesitant to add more complexity to wordpress-backup. At the very least we would then need to add unit tests, and perhaps migrate to a more testable language (Python?). But I am not going to invest time into that right now.
> Such a cleanup could just as well be implemented as a separate script or Docker container. Would this be an option for you?
Yes, of course.
> I am hesitant to add more complexity to wordpress-backup.
This seems reasonable to me. If this issue gets many likes in the future, it can still be reconsidered. Thanks for your answer and for your Docker image.
For those interested, the script I'm using is the following `backup_schedule.py`:
```python
#!/usr/bin/python3
import sys, os, re, datetime

if len(sys.argv) < 3:
    print(
        'usage:\n \033[1mbackup_schedule.py\033[22m /path/to/backup/folder x.y.z\n\n' +
        'The scheme gives the approximate distances between the kept backups.\n' +
        'Files in the backup folder must be in the scheme *yyyymmdd*')
    exit()

# group files by the date embedded in their name
backups = {}
p = re.compile(r'[\d]{8}')
for file in os.listdir(sys.argv[1]):
    m = p.search(file)
    if m is not None:
        d = datetime.datetime.strptime(m.group(), "%Y%m%d").date()
        if d not in backups:
            backups[d] = {
                'files': [],
                'delete': True
            }
        backups[d]['files'].append(file)

periods = [int(x) for x in sys.argv[2].split('.')]

# keep today's backup
today = datetime.date.today()
if today in backups:
    backups[today]['delete'] = False

# find out which other backups to keep
cursor = today
for period in periods:
    cursor -= datetime.timedelta(days=period)
    # find the best suited backup, i.e. the oldest backup that is not older
    # than the cursor for this period
    best = min((k for k in backups if k >= cursor), default=None)
    if best is not None:
        backups[best]['delete'] = False

# delete obsolete backups
for b in backups:
    if backups[b]['delete']:
        for file in backups[b]['files']:
            os.remove(os.path.join(sys.argv[1], file))
```
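To make the cursor arithmetic concrete: for a schedule like `1.2.4.8.24`, the loop subtracts each period from the cursor in turn, so the cursors sit at the cumulative sums of the periods. A quick sketch of that math (not part of the script itself):

```python
import itertools

# hypothetical schedule: approximate distances between kept backups, in days
schedule = [1, 2, 4, 8, 24]

# each cursor is today minus the running total of the periods so far,
# so the kept backups end up roughly this many days old
offsets = list(itertools.accumulate(schedule))
print(offsets)  # → [1, 3, 7, 15, 39]
```

So with daily backups and this schedule, at most six backups are kept: today's plus one near each of those offsets.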
and I added the following script `backup_schedule`:
```bash
#!/bin/bash
chown -R 33:33 "/home/www-backup"
sudo -u "#33" python3 "<PWD>/backup_schedule.py" "/home/www-backup" 1.2.4.8.24
```
to `cron.daily`:

```bash
sed -e "s|<PWD>|$PWD|" backup_schedule > /tmp/backup_schedule
sudo cp /tmp/backup_schedule /etc/cron.daily/backup_schedule
sudo chmod +x /etc/cron.daily/backup_schedule
```
It seems to work fine for me.
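For anyone who wants to check the retention logic before pointing it at real backups, a dry run on dummy files is easy. This is a sketch that re-implements the selection logic (not the script itself) on hypothetical file names `backup-yyyymmdd.tar.gz`, assuming daily backups for the last 10 days and a short schedule `1.2.4`:

```python
import datetime
import os
import tempfile

# create dummy daily backups for the last 10 days (hypothetical names)
today = datetime.date.today()
folder = tempfile.mkdtemp()
for i in range(10):
    d = today - datetime.timedelta(days=i)
    open(os.path.join(folder, f"backup-{d:%Y%m%d}.tar.gz"), "w").close()

# replicate the selection logic for the schedule 1.2.4
dates = [today - datetime.timedelta(days=i) for i in range(10)]
keep = {today}
cursor = today
for period in [1, 2, 4]:
    cursor -= datetime.timedelta(days=period)
    # oldest backup that is not older than the cursor
    best = min((d for d in dates if d >= cursor), default=None)
    if best is not None:
        keep.add(best)

print(sorted((today - d).days for d in keep))  # → [0, 1, 3, 7]
```

With every daily backup present, the kept backups sit exactly at the cumulative offsets of the schedule, plus today's.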
On my server I have only limited space, so I'd like an alternative to the 'older than x days' cleanup. The alternative should be tiered and keep backups at increasing intervals. Before creating a pull request I'd like some feedback on my suggested implementation; maybe there is a better one from a usability point of view. What I had in mind:
A new CLI argument like `-CLEANUP_SCHEDULE=1.1.1.4.4.30.30`
The resulting backup plan would look like:
The code would look like the following:
The tiered backup schedule is borrowed from https://hub.docker.com/r/prodrigestivill/postgres-backup-local, which uses the CLI options BACKUP_KEEP_DAYS, BACKUP_KEEP_WEEKS, and BACKUP_KEEP_MONTHS. But that scheme creates some duplicate backups, which is not very convenient for my very limited storage situation.
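For the suggested `-CLEANUP_SCHEDULE=1.1.1.4.4.30.30`, the storage cost can be estimated up front. A sketch (`kept_count_and_span` is a hypothetical helper, and it assumes at most one backup survives per period tier):

```python
def kept_count_and_span(schedule: str):
    """Upper bound on the number of kept backups and the total span they cover, in days."""
    periods = [int(x) for x in schedule.split('.')]
    # today's backup plus at most one backup per period tier
    return len(periods) + 1, sum(periods)

print(kept_count_and_span('1.1.1.4.4.30.30'))  # → (8, 71)
```

So this schedule would keep at most 8 backups covering roughly the last 71 days, with no duplicates.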