Open gfr10598 opened 3 years ago
Based on info in README.md, added create-cluster.sh in new branch, which has all the gcloud commands to set up the network, subnet, firewall rules, cluster, and node-pools.
Manually added cloud build trigger. Note that gcloud beta builds now supports creating triggers, too.
gcloud beta builds triggers create github \
    --repo-name=[REPO_NAME] \
    --repo-owner=[REPO_OWNER] \
    --branch-pattern=".*" \
    --build-config=[BUILD_CONFIG_FILE]
bq --project=mlab-oti mk tmp_ndt
bq --project=mlab-oti mk raw_ndt
Need to add the table creation and schema updates to etl-schema.
CREATE OR REPLACE TABLE `mlab-oti.raw_ndt.ndt7`
PARTITION BY date CLUSTER BY metro
AS
SELECT date, REGEXP_EXTRACT(parser.ArchiveURL, ".*-mlab[1-4]-([a-z]{3})[0-9]{2}.*") AS metro, id, * EXCEPT(date, id)
FROM `mlab-sandbox.tmp_ndt.ndt7`
-- Matches no rows (assuming no future-dated data), so only the schema is copied.
WHERE date > CURRENT_DATE();
CREATE OR REPLACE TABLE `mlab-oti.raw_ndt.annotation`
PARTITION BY date CLUSTER BY metro
AS
SELECT date, REGEXP_EXTRACT(parser.ArchiveURL, ".*-mlab[1-4]-([a-z]{3})[0-9]{2}.*") AS metro, id, * EXCEPT(date, id)
FROM `mlab-sandbox.tmp_ndt.annotation`
-- Matches no rows (assuming no future-dated data), so only the schema is copied.
WHERE date > CURRENT_DATE();
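To sanity-check what the metro column will contain, here is a minimal bash sketch that applies the core of the same pattern to an archive URL. The function name metro_from_url and the sample URL are hypothetical, made up for illustration; real ArchiveURL values may look different.

```shell
#!/usr/bin/env bash
# Sketch: extract the three-letter metro code from an archive URL,
# using the core of the pattern from the REGEXP_EXTRACT calls above.
set -euo pipefail

# metro_from_url is a hypothetical helper, not part of any M-Lab tooling.
metro_from_url() {
  local url="$1"
  # "-mlabN-" followed by a 3-letter metro and a 2-digit site number.
  local re='-mlab[1-4]-([a-z]{3})[0-9]{2}'
  if [[ "${url}" =~ ${re} ]]; then
    echo "${BASH_REMATCH[1]}"
  else
    return 1
  fi
}

# Hypothetical archive URL, for illustration only.
metro_from_url "gs://example-bucket/ndt/ndt7/2020/03/01/20200301T000000.000000Z-ndt7-mlab1-lga03-ndt.tgz"
```

For the sample URL this prints lga. A URL without an mlabN-xxxNN component makes the function return non-zero, which roughly mirrors REGEXP_EXTRACT returning NULL.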
NOTE: BigQuery does not store data in us-central. Does this mean we will incur network egress charges for the BQ loads?
Probably should specify the BQ dataset data_location=US to make it multi-regional. See https://cloud.google.com/bigquery/docs/locations#multi-regional-locations
The documentation is not crystal clear, so we should probably just look for these charges in billing.
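A possible way to apply the data_location suggestion when creating the two datasets is sketched below. It assumes the bq mk --data_location flag behaves as described in the bq documentation, and it defaults to a dry run that only prints the commands. The make_dataset helper and the PROJECT, LOCATION, and DRY_RUN variables are names invented for this sketch.

```shell
#!/usr/bin/env bash
# Sketch: create the tmp_ndt and raw_ndt datasets with an explicit location.
# Defaults to a dry run that only prints the bq commands.
set -euo pipefail

PROJECT="${PROJECT:-mlab-oti}"
LOCATION="${LOCATION:-US}"   # multi-regional location, per the linked docs
DRY_RUN="${DRY_RUN:-1}"      # set DRY_RUN=0 to actually run bq

# make_dataset is a hypothetical helper for this sketch.
make_dataset() {
  local dataset="$1"
  local cmd=(bq --project="${PROJECT}" mk --data_location="${LOCATION}" "${dataset}")
  if [[ "${DRY_RUN}" == "1" ]]; then
    echo "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}

make_dataset tmp_ndt
make_dataset raw_ndt
```

Running it with DRY_RUN=0 would execute the commands against the project, so it is worth reviewing the printed output first. A dataset's location cannot be changed after creation, so the flag has to be set at this step.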
Looks like prod mostly runs in us-central rather than an east region, so the new k8s cluster should probably be there too.
There is some documentation in the README.md file from January.
Steps: