vfedotovs / sslv_web_scraper

ss.lv web scraping app that helps automate scraping and filtering of classifieds, emails the results, and stores the scraped data in a database
GNU General Public License v3.0

FEAT(CICD): Improve DB backup workflow in EC2 instance #287

Open vfedotovs opened 1 month ago

vfedotovs commented 1 month ago

Current behavior: backup files do not contain hour_min_sec in the filename:

 122712 Aug 8 08:05 pg_backup_2024_08_08.sql
 122689 Aug 7 08:05 pg_backup_2024_08_07.sql

The backup job also has no logic to upload the last (most recent) backup file; a sketch addressing this follows the improved cron example below.

 crontab -l
5 6 * * * docker exec -t $(docker ps| grep db-1| awk '{print $NF}')  pg_dump -U  DB-USER -d DB-NAME > /tmp/pg_backup_$(date +\%Y_\%m_\%d).sql
7 6 * * * aws s3 cp /tmp/pg_backup_$(date +\%Y_\%m_\%d).sql  s3://bucket-name-pg-backups/pg_backup_$(date +\%Y_\%m_\%d).sql

Possible solution

Improved cron job version example:

 # note: % is special in crontab and must be escaped as \%
 # note: docker exec -t is dropped, since a pseudo-TTY merges stderr into stdout and can alter line endings in the dump
 5 6 * * * docker exec $(docker ps --filter "name=db-1" --format "{{.Names}}") pg_dump -U DB-USER -d DB-NAME 2>> /var/log/pg_backup_error.log | gzip > /tmp/pg_backup_$(date +\%Y_\%m_\%d_\%H_\%M_\%S).sql.gz && echo "$(date +\%Y_\%m_\%d_\%H:\%M:\%S) Backup successful" >> /var/log/pg_backup_success.log
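The improved job above still leaves the S3 upload from the current crontab tied to today's date pattern. One way to add the missing "upload last file" logic is a companion job that picks the newest backup by modification time. A minimal sketch, reusing the bucket name from the current crontab (the 10-minute offset and schedule are assumptions):

 # hypothetical companion job: upload the newest local backup to S3 shortly after the dump
 10 6 * * * aws s3 cp "$(ls -t /tmp/pg_backup_*.sql.gz | head -n 1)" s3://bucket-name-pg-backups/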

Notes

$(docker ps --filter "name=db-1" --format "{{.Names}}"): This refines the docker ps command to target the container whose name matches "db-1", which is more reliable than using grep.

gzip: The backup is compressed with gzip to save space.

/tmp/pg_backup_$(date +\%Y_\%m_\%d_\%H_\%M_\%S).sql.gz: The backup file is saved with a .sql.gz extension to indicate that it's compressed, and the timestamp now includes hour, minute, and second as requested.

Error Logging: 2>> /var/log/pg_backup_error.log redirects any errors to a specific log file.

Success Logging: echo "$(date +\%Y_\%m_\%d_\%H:\%M:\%S) Backup successful" >> /var/log/pg_backup_success.log logs a success message along with a timestamp if the backup completes successfully.

Backup Retention Policy: Consider setting up a job to remove old backups after a certain period (e.g., 30 days).

 0 0 * * * find /tmp -name 'pg_backup_*.sql.gz' -mtime +30 -exec rm {} \;
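Since -exec rm is destructive, it may be worth checking the match first with a dry run; note that this job only prunes the local copies in /tmp, so the S3 copies would need their own lifecycle rule or cleanup job if they should expire as well:

 # dry run: list what the retention job would delete, without deleting anything
 find /tmp -name 'pg_backup_*.sql.gz' -mtime +30 -print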