MobilityData / mobility-database-catalogs

The Catalogs of Sources of the Mobility Database.
Apache License 2.0

Add GitHub actions: archives, cronjobs #26

Closed · emmambd closed this issue 2 years ago

emmambd commented 2 years ago

What problem are we trying to solve? Users need to get the latest dataset of a source, which means modifying the current data pipeline to extract the latest dataset and its bounding box.

How will we know when this is done? There is a latest dataset URL available for each source. A cronjob runs daily to check for the most recent update.

Constraints

maximearmstrong commented 2 years ago

After discussion, here are the three initial workflows we will implement:

  1. Store the latest dataset on approval (see the readability-check sketch after this list):

    • For each added or modified file under catalogs/sources/gtfs/schedule:
      • If the auto-discovery URL points to a readable dataset, download and store the dataset in the bucket using the source filename. Otherwise, raise an error to prevent merging the PR.
      • Use the latest URL to test downloading the latest dataset.
      • If the dataset downloaded from the latest URL is not readable, raise an error to prevent merging the PR.
  2. Store the latest dataset using a daily cronjob:

    • For each file under catalogs/sources/gtfs/schedule:
      • Download the dataset using the auto-discovery URL.
      • If the downloaded dataset is readable, store it in the bucket using the source filename. Otherwise, don't update the latest URL in the bucket and add a problem to the cronjob report.
      • If the dataset was updated, use the latest URL to test downloading the latest dataset.
      • If the dataset downloaded from the latest URL is not readable, add a problem to the cronjob report.
  3. Detect deleted and renamed files on PR (nice-to-have; see the diff-check sketch below):

    • If at least one file has been deleted or renamed under catalogs/sources/gtfs/schedule, raise an error to prevent merging the PR.
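For illustration, here is a minimal sketch of the readability check shared by workflows 1 and 2. The function names are hypothetical, and the real pipeline may rely on a full GTFS validator; this version only verifies that the URL serves a ZIP archive containing the files a GTFS Schedule feed requires.

```python
import io
import zipfile

import requests

# Files every GTFS Schedule feed must contain. In addition, at least one of
# calendar.txt or calendar_dates.txt must be present.
REQUIRED_FILES = {"agency.txt", "stops.txt", "routes.txt", "trips.txt", "stop_times.txt"}


def download_dataset(url: str, timeout: int = 120) -> bytes:
    """Download the dataset behind an auto-discovery or latest URL."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.content


def is_readable_gtfs(data: bytes) -> bool:
    """Return True if the bytes look like a readable GTFS Schedule dataset."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            names = set(archive.namelist())
    except zipfile.BadZipFile:
        return False
    if not REQUIRED_FILES.issubset(names):
        return False
    # At least one service-definition file must be present.
    return "calendar.txt" in names or "calendar_dates.txt" in names


if __name__ == "__main__":
    # Hypothetical usage inside the PR workflow: exit non-zero so the
    # GitHub Action fails and blocks the merge.
    url = "https://example.com/gtfs.zip"  # placeholder auto-discovery URL
    if not is_readable_gtfs(download_dataset(url)):
        raise SystemExit(f"{url} is not a readable GTFS dataset")
```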
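And a sketch of the nice-to-have check in workflow 3, assuming the action runs on a pull request and the base branch is exposed through an environment variable (BASE_REF here is hypothetical). Git's `--diff-filter=DR` selects deleted and renamed files.

```python
import os
import subprocess

WATCHED_PATH = "catalogs/sources/gtfs/schedule"


def deleted_or_renamed_files(base_ref: str) -> list[str]:
    """List files deleted (D) or renamed (R) relative to the PR base branch."""
    output = subprocess.check_output(
        [
            "git", "diff", "--name-status", "--diff-filter=DR",
            f"{base_ref}...HEAD", "--", WATCHED_PATH,
        ],
        text=True,
    )
    return [line for line in output.splitlines() if line.strip()]


if __name__ == "__main__":
    base = os.environ.get("BASE_REF", "origin/main")  # hypothetical variable
    offenders = deleted_or_renamed_files(base)
    if offenders:
        raise SystemExit(
            "Deleted or renamed source files detected:\n" + "\n".join(offenders)
        )
```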

For the first workflow (1), the datasets uploaded to the bucket will overwrite the previous ones because we use the source filename to identify them. This is acceptable since we will ensure that adding a new source cannot overwrite an existing source in the catalogs.
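As a sketch of what that overwrite-by-filename behavior could look like with Google Cloud Storage (the bucket name and object layout here are hypothetical): uploading to the same blob name simply replaces the previous object, so each source keeps exactly one "latest" dataset.

```python
from google.cloud import storage


def store_latest_dataset(bucket_name: str, source_filename: str, local_path: str) -> str:
    """Upload a dataset, overwriting any previous object with the same name.

    Because the blob name is derived from the source filename, re-uploading
    for the same source replaces the prior "latest" dataset in place.
    """
    client = storage.Client()
    bucket = client.bucket(bucket_name)
    blob = bucket.blob(f"latest/{source_filename}.zip")  # hypothetical layout
    blob.upload_from_filename(local_path)
    return f"https://storage.googleapis.com/{bucket_name}/{blob.name}"


# Example: store_latest_dataset("mobility-datasets", "ca-quebec-stm-gtfs", "stm.zip")
```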