Full event and bill scraping starts later on Friday and continues through Saturday.
We want full event and bill scraping to occur during the second half of Friday, i.e., when Metro adds event data. The server uses UTC, which makes things a little complicated! Ceasing aggressive scraping at, say, 11:45 pm UTC would mean stopping at 4:45 pm PT (3:45 pm outside daylight saving time), which is often before Metro adds data. I thus adjusted the crons to continue with full scrapes through Saturday. Why?
It does not add even greater complexity to our crons (i.e., we do not need to split Saturday into two halves as well).
It is low risk, since Metro does not port data on Saturday. If Legistar complains, it likely would not affect the health of our data pipeline.
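To make the UTC offset concrete, here is a minimal Python sketch of the conversion. The dates are arbitrary Fridays chosen for illustration, and the helper names `utc_to_pacific` and `pacific_to_utc` are hypothetical, not part of this repo. It assumes Python 3.9+ with time zone data available on the host.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

PACIFIC = ZoneInfo("America/Los_Angeles")


def utc_to_pacific(dt_utc):
    """Convert a timezone-aware UTC datetime to Pacific time."""
    return dt_utc.astimezone(PACIFIC)


def pacific_to_utc(dt_pacific):
    """Convert a timezone-aware Pacific datetime to UTC."""
    return dt_pacific.astimezone(timezone.utc)


# An arbitrary summer Friday: 11:45 pm UTC is still mid-afternoon in LA.
summer_cutoff = datetime(2019, 6, 14, 23, 45, tzinfo=timezone.utc)
print(utc_to_pacific(summer_cutoff).strftime("%A %I:%M %p %Z"))

# An arbitrary winter Friday: the same UTC cutoff lands an hour earlier.
winter_cutoff = datetime(2019, 1, 11, 23, 45, tzinfo=timezone.utc)
print(utc_to_pacific(winter_cutoff).strftime("%A %I:%M %p %Z"))

# Friday 9:00 pm in LA is already Saturday on a UTC server.
friday_evening = datetime(2019, 6, 14, 21, 0, tzinfo=PACIFIC)
print(pacific_to_utc(friday_evening).strftime("%A %H:%M %Z"))
```

The last conversion is the reason a cron that stops at the end of Friday in UTC would miss Friday evening in Los Angeles: that evening falls on Saturday as far as the server is concerned.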
Full scrapes at the top of the hour; windowed scrapes follow.
I changed the cron schedule so that full scrapes occur at 0 and 5 minutes past the hour, and windowed scrapes follow: two for bills and two for events.
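As a rough illustration of that layout, the sketch below renders cron lines from a few variables. Everything specific in it is an assumption: which scrape runs at minute 0 versus minute 5, the windowed minute values, and the command placeholders are all hypothetical and are not taken from the actual crontab in this PR.

```python
# Hypothetical minute slots. Only "0 and 5 past the hour" for full scrapes
# comes from the PR description; the rest is assumed for illustration.
FULL_MINUTES = {"events": [0], "bills": [5]}
WINDOWED_MINUTES = {"events": [20, 40], "bills": [30, 50]}


def cron_line(minutes, hours, days, command):
    """Render one crontab line: minute hour day-of-month month day-of-week command."""
    minute_field = ",".join(str(m) for m in sorted(minutes))
    return f"{minute_field} {hours} * * {days} {command}"


def build_schedule(hours="*", days="6"):
    """Build the lines for one window; day-of-week 6 is Saturday in cron."""
    lines = []
    for kind in ("events", "bills"):
        lines.append(cron_line(FULL_MINUTES[kind], hours, days, f"<full {kind} scrape>"))
    for kind in ("events", "bills"):
        lines.append(cron_line(WINDOWED_MINUTES[kind], hours, days, f"<windowed {kind} scrape>"))
    return lines


def minutes_are_distinct():
    """Check that no two scrapes are scheduled for the same minute."""
    used = [
        minute
        for table in (FULL_MINUTES, WINDOWED_MINUTES)
        for minutes in table.values()
        for minute in minutes
    ]
    return len(used) == len(set(used))


for line in build_schedule():
    print(line)
```

Keeping the minute slots in named variables mirrors the readability change in this PR: the schedule can be read and adjusted in one place rather than across several raw cron expressions.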
Improved readability
Variables! @hancush - would it also be worth adding some robust comments? Or is that too didactic?
This PR adjusts the Metro cron per our conversation in https://github.com/datamade/la-metro-councilmatic/issues/419. The changes are summarized above.