mozilla / bigquery-etl

Bigquery ETL
https://mozilla.github.io/bigquery-etl
Mozilla Public License 2.0

Recreating materialized views may cause occasional dry run failures on main #5817

Open BenWu opened 1 week ago

BenWu commented 1 week ago

With https://github.com/mozilla/bigquery-etl/pull/5640, we are now correctly recreating materialized views in bqetl_artifact_deployment, but the first deploy after that PR merged seems to have caused a dry run failure on main for this commit: https://github.com/mozilla/bigquery-etl/commit/19c3f02e1bd445eee978b8c01f5e567791fb9e5f

The error was "Not found: Table moz-fx-data-shared-prod:org_*******_ios_focus_derived.experiment_search_events_live_v1 was not found in location US". That table is a materialized view that was being deployed at the same time as the dry run: https://workflow.telemetry.mozilla.org/dags/bqetl_artifact_deployment/grid?dag_run_id=manual__2024-06-20T17%3A14%3A47.878027%2B00%3A00&task_id=publish_new_tables&tab=logs

The failure happened within an hour of the change, so I would expect this to occur somewhat regularly and result in flaky CI.

I'm not sure if materialized views are only recreated when they've changed, but doing that could be one way to reduce failures. A lot of the views needed to be recreated this time because they weren't getting updated before.
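
To illustrate the "only recreate if changed" idea, here is a minimal sketch. It assumes the deployment step can read back the currently deployed view query as a string; `normalize_sql` and `needs_recreate` are hypothetical helpers, not existing bqetl functions, and the whitespace-only normalization is deliberately naive (it would also collapse whitespace inside string literals).

```python
import re
from typing import Optional


def normalize_sql(sql: str) -> str:
    """Collapse runs of whitespace and drop a trailing semicolon.

    Hypothetical helper: a naive normalization so that formatting-only
    differences do not trigger a recreate.
    """
    collapsed = re.sub(r"\s+", " ", sql).strip()
    return collapsed.rstrip(";").rstrip()


def needs_recreate(deployed_sql: Optional[str], new_sql: str) -> bool:
    """Return True if the materialized view should be (re)created.

    deployed_sql is None when the view does not exist yet (assumption:
    the caller looks this up before deciding what to deploy).
    """
    if deployed_sql is None:
        return True
    return normalize_sql(deployed_sql) != normalize_sql(new_sql)
```

With a check like this, unchanged views would be skipped entirely, so only views whose definition actually changed would go through the window where they might not exist.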


sean-rose commented 1 week ago

I'd bet that's due to materialized views being deleted in a separate step before being recreated, so for a period of time the materialized view wouldn't exist.

If we want materialized views to be recreated in this fashion, maybe we should use CREATE OR REPLACE MATERIALIZED VIEW rather than CREATE MATERIALIZED VIEW IF NOT EXISTS (or use Jinja logic to conditionally generate the appropriate statement based on is_init()).
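
As a rough sketch of that conditional logic, written in Python rather than Jinja so it can be run standalone: `materialized_view_ddl` is a hypothetical function, the view ID in the usage note is made up, and which statement goes with which value of `is_init` is an assumption that would need to be checked against how bqetl actually uses is_init().

```python
def materialized_view_ddl(view_id: str, query: str, is_init: bool) -> str:
    """Build the DDL statement for a materialized view.

    Assumption: on initialization we only create the view if it is
    missing; otherwise we replace it in a single statement instead of
    deleting it in a separate step first.
    """
    if is_init:
        prefix = "CREATE MATERIALIZED VIEW IF NOT EXISTS"
    else:
        prefix = "CREATE OR REPLACE MATERIALIZED VIEW"
    # Backticks quote the fully qualified view name in BigQuery SQL.
    return f"{prefix} `{view_id}` AS\n{query}"
```

For example, `materialized_view_ddl("my-project.my_dataset.my_view_v1", "SELECT 1 AS x", is_init=False)` yields a statement starting with CREATE OR REPLACE MATERIALIZED VIEW. This removes the separate delete step; whether it fully closes the window in which a concurrent dry run can miss the view would still need to be verified.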