mediacloud / rss-fetcher

Intelligently fetch lists of URLs from a large collection of RSS Feeds as part of the Media Cloud Directory.
https://search.mediacloud.org/directory
Apache License 2.0

setup automated rss file backups to S3 #4

Closed rahulbot closed 2 years ago

rahulbot commented 2 years ago

We need to back up the RSS files and make them publicly available.

rahulbot commented 2 years ago

First I changed it so the RSS target dir is set from an env-var in 107f284. Then I made a volume on the server:

sudo mkdir /space/mediacloud/backup-rss-files/

And mounted it:

dokku storage:mount rss-fetcher /space/mediacloud/backup-rss-files:/app/storage/

And connected it to the app:

dokku config:set --no-restart rss-fetcher RSS_FILE_PATH=/app/storage/

Now generated rss files are saved to /space/mediacloud/backup-rss-files so they are outside the container.

rahulbot commented 2 years ago

Following the example of https://github.com/mediacloud/backend/issues/832, I set it up with the AWS CLI to sync to S3:

aws s3 sync /space/mediacloud/backup-rss-files/ s3://mediacloud-public/backup-daily-rss/

This should run once a night via cron.
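
For reference, a nightly crontab entry for the sync command above might look roughly like the sketch below. The 03:00 schedule and the log file path are assumptions for illustration and are not taken from the actual server setup. It also assumes the aws binary is on cron's PATH and that credentials are configured for the user who owns the crontab; cron runs with a minimal environment, so a full path to aws may be needed.

```shell
# Hypothetical crontab entry (edit with `crontab -e`).
# Fields: minute hour day-of-month month day-of-week command
# Runs every night at 03:00 (assumed time) and appends output to an assumed log file.
0 3 * * * aws s3 sync /space/mediacloud/backup-rss-files/ s3://mediacloud-public/backup-daily-rss/ >> /var/log/rss-backup.log 2>&1
```

Since aws s3 sync only copies new or changed files, re-running it every night should be cheap when few files have changed.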