Closed jklukas closed 3 years ago
See the change I'm attempting in https://github.com/mozilla/bigquery-etl/compare/background_update?expand=1
Did you try undoing all changes in clients_daily_v6/schema.yaml
and then run the command again? What happens is that if clients_daily_v6/schema.yaml
got updated earlier, and then the update
command gets executed again it appears as if no new changes have been made to the schema and downstream dependencies won't get analyzed.
Did you try undoing all changes in
clients_daily_v6/schema.yaml
and then run the command again? What happens is that ifclients_daily_v6/schema.yaml
got updated earlier, and then theupdate
command gets executed again it appears as if no new changes have been made to the schema and downstream dependencies won't get analyzed.
You were correct on this. I undid the schema changes, and it determined dependencies correctly, but looks to be hitting other errors now:
Update downstream dependency schema for sql/moz-fx-data-shared-prod/internet_outages/global_outages_v1/query.sql
Update downstream dependency schema for sql/moz-fx-data-shared-prod/messaging_system/onboarding_retention_events_amplitude/view.sql
Update downstream dependency schema for sql/moz-fx-data-shared-prod/search_derived/search_clients_daily_v8/query.sql
Update downstream dependency schema for sql/moz-fx-data-shared-prod/telemetry/clients_daily/view.sql
Update downstream dependency schema for sql/moz-fx-data-shared-prod/telemetry/clients_daily_v6/view.sql
Update downstream dependency schema for sql/moz-fx-data-shared-prod/telemetry/windows_10_aggregate/view.sql
Update downstream dependency schema for sql/moz-fx-data-shared-prod/telemetry/windows_10_build_distribution/view.sql
Update downstream dependency schema for sql/moz-fx-data-shared-prod/telemetry/windows_10_patch_adoption/view.sql
Update downstream dependency schema for sql/moz-fx-data-shared-prod/telemetry_derived/clients_daily_joined_v1/query.sql
Update downstream dependency schema for sql/moz-fx-data-shared-prod/telemetry_derived/clients_first_seen_v1/query.sql
Update downstream dependency schema for sql/moz-fx-data-shared-prod/telemetry_derived/clients_last_seen_v1/query.sql
Update downstream dependency schema for sql/moz-fx-data-shared-prod/telemetry_derived/devtools_panel_usage_v1/query.sql
Update downstream dependency schema for sql/moz-fx-data-shared-prod/telemetry_derived/experiments_daily_active_clients_v1/query.sql
Update downstream dependency schema for sql/moz-fx-data-shared-prod/telemetry_derived/italy_covid19_outage_v1/query.sql
sql/moz-fx-data-shared-prod/internet_outages/global_outages_v1/query.sql OK
moz-fx-data-shared-prod/internet_outages/global_outages_v1/query.sql OK
Schema sql/moz-fx-data-shared-prod/internet_outages/global_outages_v1/schema.yaml updated.
No metadata defined for sql/moz-fx-data-shared-prod/messaging_system/onboarding_retention_events_amplitude/view.sql
sql/moz-fx-data-shared-prod/messaging_system/onboarding_retention_events_amplitude/view.sql OK
moz-fx-data-shared-prod/messaging_system/onboarding_retention_events_amplitude/query.sql ERROR
[{'code': 400, 'errors': [{'message': "Cannot query over table 'moz-fx-data-shared-prod.messaging_system_derived.onboarding_users_last_seen_v1' without a filter over column(s) 'submission_date' that can be used for partition elimination", 'domain': 'global', 'reason': 'invalidQuery', 'location': 'q', 'locationType': 'parameter'}], 'response': {'headers': {'alt-svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-T051=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"', 'cache-control': 'private', 'content-encoding': 'gzip', 'content-type': 'application/json; charset=UTF-8', 'date': 'Fri, 20 Aug 2021 19:13:34 GMT', 'server': 'ESF', 'transfer-encoding': 'chunked', 'vary': 'Origin, X-Origin, Referer', 'x-content-type-options': 'nosniff', 'x-frame-options': 'SAMEORIGIN', 'x-xss-protection': '0'}}, 'message': "Cannot query over table 'moz-fx-data-shared-prod.messaging_system_derived.onboarding_users_last_seen_v1' without a filter over column(s) 'submission_date' that can be used for partition elimination"}]
Cannot get schema for moz-fx-data-shared-prod.messaging_system.onboarding_retention_events_amplitude: Error when dry running SQL file moz-fx-data-shared-prod/messaging_system/onboarding_retention_events_amplitude/query.sql
Traceback (most recent call last):
File "/Users/jeffklukas/Code/zbqetl/venv/bin/bqetl", line 33, in <module>
sys.exit(load_entry_point('mozilla-bigquery-etl', 'console_scripts', 'bqetl')())
File "/Users/jeffklukas/Code/zbqetl/bigquery_etl/cli/__init__.py", line 56, in cli
group(prog_name=prog_name)
File "/Users/jeffklukas/Code/zbqetl/venv/lib/python3.9/site-packages/click/core.py", line 1137, in __call__
return self.main(*args, **kwargs)
File "/Users/jeffklukas/Code/zbqetl/venv/lib/python3.9/site-packages/click/core.py", line 1062, in main
rv = self.invoke(ctx)
File "/Users/jeffklukas/Code/zbqetl/venv/lib/python3.9/site-packages/click/core.py", line 1668, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/Users/jeffklukas/Code/zbqetl/venv/lib/python3.9/site-packages/click/core.py", line 1668, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/Users/jeffklukas/Code/zbqetl/venv/lib/python3.9/site-packages/click/core.py", line 1668, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/Users/jeffklukas/Code/zbqetl/venv/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/Users/jeffklukas/Code/zbqetl/venv/lib/python3.9/site-packages/click/core.py", line 763, in invoke
return __callback(*args, **kwargs)
File "/Users/jeffklukas/Code/zbqetl/bigquery_etl/cli/query.py", line 791, in update
changed = _update_query_schema(
File "/Users/jeffklukas/Code/zbqetl/bigquery_etl/cli/query.py", line 1008, in _update_query_schema
query_schema.merge(table_schema)
File "/Users/jeffklukas/Code/zbqetl/bigquery_etl/schema/__init__.py", line 97, in merge
self.schema["fields"],
KeyError: 'fields'
I believe the just-merged #2275 may fix the "fields" error.
And I just filed https://github.com/mozilla/bigquery-etl/pull/2279 to remove unneeded Amplitude views, which should fix the onboarding_retention_events_amplitude/query.sql ERROR
.
I believe the just-merged #2275 may fix the "fields" error.
Looking at this more closely, it's not the glam queries causing this error. It's coming from query_schema.merge
, where the query_schema
I believe is empty because a dry run failed.
I think the update_downstream
logic is not supposed to consider view files, but it's processing views and getting confused with the missing schemas.
You are right. https://github.com/mozilla/bigquery-etl/pull/2280 will ensure that only query.sql
files are considered
I am attempting to add a new field to
clients_daily_v6
following the common workflows documentation.I've added the new field to the query, but running the schema update only updates
clients_daily
:I was expecting several downstream tables to get updated. I'm wondering if this is a regression related to changes in https://github.com/mozilla/bigquery-etl/pull/2244
cc @scholtzan