I've added a new entry to attempt to run the reports every day of the month.
This will exit early if the reports do not exist.
It WILL, however, still run each day AFTER the reports have been run, and find no reports to run. A bit of a waste, but I'm not sure it's worth catering for that as we hope to migrate off this in https://github.com/HTTPArchive/data-pipeline/issues/177 ?
I've also added a line to run the reports on the 2nd. This is because the blink tables are updated on the 1st of the month by a BigQuery scheduled task. Traditionally we've just let them be picked up in next month's run, but we can get the data sooner for those reports with this new entry. Will make a note in https://github.com/HTTPArchive/data-pipeline/issues/177 to see if we can tackle this at the same time.
And then there are the CrUX reports, which are not available until the 2nd Tuesday of the month. The 15th, when this cron is run, is guaranteed to be after that.
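For anyone wanting to double-check the "15th is always after the 2nd Tuesday" claim, here is a minimal sketch. The `second_tuesday` helper is hypothetical and not part of this repo. The first Tuesday always falls on day 1 to 7, so the second falls on day 8 to 14, and the loop confirms this over a range of years.

```python
import calendar
from datetime import date


def second_tuesday(year: int, month: int) -> date:
    """Return the date of the second Tuesday of the given month."""
    first_weekday = date(year, month, 1).weekday()  # Monday == 0
    # Days from the 1st to the first Tuesday of the month (0..6).
    offset = (calendar.TUESDAY - first_weekday) % 7
    # First Tuesday is day 1 + offset; the second is one week later.
    return date(year, month, 1 + offset + 7)


# Check every month over a range of years: the second Tuesday is
# always between the 8th and the 14th, so the 15th is always later.
for year in range(2020, 2041):
    for month in range(1, 13):
        day = second_tuesday(year, month).day
        assert 8 <= day <= 14, (year, month, day)
```

So the latest the CrUX data could land is the 14th, which leaves the 15th as the earliest fixed day of the month that is always safe.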
We've commented out the old pipeline jobs, as they are now run from https://github.com/HTTPArchive/data-pipeline