datamade / scrapers-us-municipal

Scrapers for US municipal governments.
MIT License

Update cron to scrape into Metro upgrade DB #42

Closed: hancush closed 4 years ago

hancush commented 4 years ago

Description

This PR updates the cron to populate the Metro upgrade database, as well as the OCD API, following the pattern set in https://github.com/datamade/scrapers-us-municipal/pull/39.

There are many different Metro commands, and each job will involve at least four of them once there are both staging and production databases: one command to scrape, one to update the OCD API, one to update the staging database, and one to update the production database. I've therefore broken the Metro cron tasks into descriptively named shell scripts that the cron jobs run. This should make the crons easier to read and debug.
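As a rough illustration of the one-script-per-job pattern described above, here is a minimal sketch. The four function names and the `lametro_bills` job name are hypothetical placeholders, not the actual commands or scripts in this PR; each function just echoes so the sketch runs on its own.

```shell
#!/bin/bash
# Hypothetical wrapper script for a single Metro cron task.
# Each function stands in for a real command that is not shown here.

scrape()            { echo "scrape: $1"; }
update_ocd_api()    { echo "update OCD API: $1"; }
update_staging()    { echo "update staging DB: $1"; }
update_production() { echo "update production DB: $1"; }

# Run the four steps in order; && stops the chain at the first failure,
# so a broken scrape never triggers the database updates.
run_job() {
    local job="$1"
    scrape "$job" &&
    update_ocd_api "$job" &&
    update_staging "$job" &&
    update_production "$job"
}

run_job lametro_bills
```

With this layout the crontab entry only needs to name the script, which keeps each cron line short and leaves the multi-step logic in one readable place.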

Given the particularities of cron, I could use feedback on whether this will work as I expect it to, namely:

hancush commented 4 years ago

@fgregg Thanks for the reminder that I can do script-y things and the wonderful suggestions!

I added set -e and redirected STDERR to STDOUT in each script, so we capture the scrape progress as well as the result. I kept the redirection into a particular log file in the cron, so it's more obvious where the logs are and easy to change on a per-task basis, if we want to.
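To show how those two pieces fit together, here is a small self-contained sketch. It assumes the STDERR redirect is done with `exec 2>&1` at the top of the script (the actual scripts may redirect per command instead), and the task script, its messages, and the temp-file log are all made up for the demo. The outer `>>` plays the role of the redirect in the crontab line.

```shell
#!/bin/bash
# Build a hypothetical task script in a temp file so the demo is self-contained.
script="$(mktemp)"
log="$(mktemp)"
trap 'rm -f "$script" "$log"' EXIT

cat > "$script" <<'EOF'
#!/bin/bash
set -e       # abort as soon as any command fails
exec 2>&1    # from here on, STDERR goes wherever STDOUT goes

echo "scrape started"
echo "warning: slow response" >&2   # progress written to STDERR
false                               # simulated failing step
echo "never reached"
EOF

# The cron entry would own this redirect; it only redirects STDOUT,
# yet STDERR lands in the log too because of exec 2>&1 in the script.
status=0
bash "$script" >> "$log" || status=$?

echo "exit status: $status"
cat "$log"
```

The log ends up with both the STDOUT and STDERR lines, the line after the failing step never runs, and the script exits non-zero, so cron sees the failure.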

If these look good to you, I'd like to deploy and monitor/troubleshoot them tomorrow. LMK!

hancush commented 4 years ago

Yahoo! I'll merge and deploy in the morning, when I have more brain power to troubleshoot in case something doesn't work.