Closed by stephen-soltesz 2 years ago
Manual repair using bq
The sandbox table was recreated with the correct schema. Export that schema and update the staging and prod tables to match.
# Guarantee that the sandbox table has been recreated by first removing it and allowing gardener to recreate it.
bq rm mlab-sandbox:ndt.ndt7
# After the table exists again, use it as a reference for updating later tables.
bq show --format=prettyjson mlab-sandbox:ndt.ndt7 | jq .schema.fields > ndt7.schema
bq update mlab-staging:ndt.ndt7 ndt7.schema
bq update mlab-oti:ndt.ndt7 ndt7.schema
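Before or after running `bq update`, it may help to confirm which fields a target table lacks relative to the reference. The following is a minimal sketch, not part of gardener or etl, that compares two exported schemas in the JSON shape written to ndt7.schema above. The `Field` type, the function names, and the sample field names are all hypothetical, and the schemas are inlined so the example runs on its own.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// Field mirrors the subset of a BigQuery schema field found in the JSON
// produced by `bq show --format=prettyjson ... | jq .schema.fields`.
type Field struct {
	Name   string  `json:"name"`
	Type   string  `json:"type"`
	Mode   string  `json:"mode"`
	Fields []Field `json:"fields"`
}

// ParseSchema decodes a JSON array of schema fields.
func ParseSchema(data []byte) ([]Field, error) {
	var fields []Field
	err := json.Unmarshal(data, &fields)
	return fields, err
}

// flatten records the dotted path of every field, descending into nested
// RECORD fields.
func flatten(prefix string, fields []Field, out map[string]bool) {
	for _, f := range fields {
		path := f.Name
		if prefix != "" {
			path = prefix + "." + f.Name
		}
		out[path] = true
		flatten(path, f.Fields, out)
	}
}

// MissingFields returns the sorted field paths present in ref but absent
// from target.
func MissingFields(ref, target []Field) []string {
	refPaths, targetPaths := map[string]bool{}, map[string]bool{}
	flatten("", ref, refPaths)
	flatten("", target, targetPaths)
	missing := []string{}
	for p := range refPaths {
		if !targetPaths[p] {
			missing = append(missing, p)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	// Hypothetical, abbreviated schemas standing in for the sandbox
	// (reference) and staging (target) exports.
	ref, err := ParseSchema([]byte(`[
	  {"name": "id", "type": "STRING"},
	  {"name": "raw", "type": "RECORD", "fields": [
	    {"name": "a", "type": "INTEGER"},
	    {"name": "b", "type": "INTEGER"}]}]`))
	if err != nil {
		panic(err)
	}
	target, err := ParseSchema([]byte(`[
	  {"name": "id", "type": "STRING"},
	  {"name": "raw", "type": "RECORD", "fields": [
	    {"name": "a", "type": "INTEGER"}]}]`))
	if err != nil {
		panic(err)
	}
	fmt.Println(MissingFields(ref, target)) // prints [raw.b]
}
```

In practice the two byte slices could come from `os.ReadFile` on schema files exported from each project. An empty result would suggest the target already contains every field in the reference.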
In progress: create alerts that compare metrics such as:
bq_daily_archive_count{datatype=~"ndt7|annotation"}
and,
increase(gcs_archive_files_total{bucket="archive-measurement-lab", experiment="ndt", datatype=~"ndt7|annotation"}[1d] offset 2d)
An additional QueryConfig option in https://github.com/m-lab/etl-gardener/blob/d9582f1131b5d978adbe64ffd341bfd47fda8718/cloud/bq/ops.go#L260-L274 can automatically allow field addition:
SchemaUpdateOptions: []string{"ALLOW_FIELD_ADDITION", "ALLOW_FIELD_RELAXATION"},
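To make concrete which schema differences these two options would tolerate, here is a simplified model in plain Go. It is a sketch under stated assumptions, not BigQuery's actual validation and not gardener code: the `Column` type and `Uncovered` function are hypothetical, only top-level columns are considered, and columns that exist only in the destination are not examined.

```go
package main

import "fmt"

// Column is a simplified, hypothetical model of a top-level table column.
type Column struct {
	Name string
	Type string
	Mode string // "NULLABLE", "REQUIRED", or "REPEATED"; empty means NULLABLE
}

// mode returns the column mode, treating an empty mode as NULLABLE.
func mode(c Column) string {
	if c.Mode == "" {
		return "NULLABLE"
	}
	return c.Mode
}

// Uncovered lists differences between the destination schema (dst) and the
// query result schema (src) that neither ALLOW_FIELD_ADDITION nor
// ALLOW_FIELD_RELAXATION would cover in this simplified model.
func Uncovered(dst, src []Column) []string {
	dstByName := make(map[string]Column, len(dst))
	for _, c := range dst {
		dstByName[c.Name] = c
	}
	problems := []string{}
	for _, s := range src {
		d, exists := dstByName[s.Name]
		switch {
		case !exists && mode(s) == "REQUIRED":
			problems = append(problems, s.Name+": added column is REQUIRED")
		case !exists:
			// New non-required column: covered by ALLOW_FIELD_ADDITION.
		case d.Type != s.Type:
			problems = append(problems, s.Name+": type changed")
		case mode(d) == "REQUIRED" && mode(s) == "NULLABLE":
			// REQUIRED to NULLABLE: covered by ALLOW_FIELD_RELAXATION.
		case mode(d) != mode(s):
			problems = append(problems, s.Name+": mode changed")
		}
	}
	return problems
}

func main() {
	// Hypothetical column names, chosen only for illustration.
	dst := []Column{{"id", "STRING", "REQUIRED"}, {"count", "INTEGER", ""}}
	src := []Column{
		{"id", "STRING", "NULLABLE"},   // relaxation: tolerated
		{"count", "STRING", ""},        // type change: not tolerated
		{"extra", "FLOAT", "NULLABLE"}, // addition: tolerated
	}
	fmt.Println(Uncovered(dst, src)) // prints [count: type changed]
}
```

As far as the BigQuery documentation describes them, these options apply to jobs that append to the destination table, or that truncate a single partition, so whether they fit the join query depends on the write disposition it uses.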
A failure that went undetected in staging since XXX and in production since 01/25 was caused by additional fields that were added to tcpinfo but were missing from the materialized ndt7 tables. The net result was a failure to join the annotation and raw_ndt.ndt7 data because the destination table schemas did not match the source schemas.
Example error log:
This is a fundamental problem. The etl/cmd/update-schema command operates on the raw tables for primary datatypes, but the joined tables have derived schemas that combine the schemas of their input tables.