HTTPArchive / bigquery

BigQuery import and processing pipelines

Update crontab to latest #181

Closed tunetheweb closed 1 year ago

tunetheweb commented 1 year ago

We've commented out the old pipeline jobs, as they now run from https://github.com/HTTPArchive/data-pipeline

I've added a new entry to attempt to run the reports every day of the month.

I've also added a line to run the reports on the 2nd. This is because the blink tables are updated on the 1st of the month by a BigQuery scheduled task. Traditionally we've just let them be picked up in the next month's run, but we can get the data sooner for those reports with this new entry. Will make a note in https://github.com/HTTPArchive/data-pipeline/issues/177 to see if we can tackle this at the same time.

And then there are the CrUX reports, which are not available until the 2nd Tuesday of the month. The 15th, when this cron job runs, is guaranteed to be after that.
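
As a quick sanity check on the "15th is always after the 2nd Tuesday" reasoning: the first Tuesday of any month falls on the 1st through 7th, so the second falls on the 8th through 14th. A minimal Python sketch of that arithmetic follows; the `second_tuesday` helper is hypothetical, written only for illustration, and is not part of this repo or the crontab.

```python
import calendar
from datetime import date


def second_tuesday(year: int, month: int) -> date:
    """Return the date of the second Tuesday of the given month.

    Hypothetical helper for illustration only; not part of the crontab.
    """
    first_weekday = date(year, month, 1).weekday()  # Monday == 0
    # Days from the 1st to the first Tuesday (0 through 6).
    offset = (calendar.TUESDAY - first_weekday) % 7
    # First Tuesday is day 1 + offset; the second is one week later.
    return date(year, month, 1 + offset + 7)


# Collect the day-of-month of every second Tuesday over a range of years.
days = {
    second_tuesday(year, month).day
    for year in range(2020, 2031)
    for month in range(1, 13)
}

# The second Tuesday never falls before the 8th or after the 14th,
# so a job scheduled on the 15th always runs after it.
assert min(days) >= 8 and max(days) <= 14
```

This only confirms the date arithmetic; it does not check whether the CrUX data has actually landed by the time the job runs.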