Invalid versions such as 999 and 1024 are part of the reason why graphs for some Glean metrics are hard to analyze. This PR ensures that the ETL won't process versions above the latest_version.
I'll manually remove data with invalid versions from the current production dataset.
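As a rough illustration of the filtering described above, here is a minimal Python sketch. It is not the code in this PR: the function name `filter_valid_versions`, the row layout, and the sample values are all hypothetical, and the actual ETL may apply the cutoff elsewhere (for example in SQL).

```python
def filter_valid_versions(rows, latest_version):
    """Keep only rows whose version does not exceed latest_version.

    `rows` is assumed to be a list of dicts with an integer "version" key;
    this layout is illustrative, not the real ETL schema.
    """
    return [row for row in rows if row["version"] <= latest_version]


# Hypothetical sample data: 999 and 1024 stand in for the invalid versions.
rows = [
    {"version": 120, "count": 10},
    {"version": 999, "count": 1},
    {"version": 121, "count": 7},
    {"version": 1024, "count": 2},
]

valid = filter_valid_versions(rows, latest_version=121)
valid_versions = [row["version"] for row in valid]
```

With the sample data above, only the rows for versions 120 and 121 remain after filtering.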
Checklist for reviewer:
[ ] Commits should reference a bug or github issue, if relevant (if a bug is referenced, the pull request should include the bug number in the title).
[ ] If the PR comes from a fork, trigger integration CI tests by running the Push to upstream workflow and provide the <username>:<branch> of the fork as parameter. The parameter will also show up in the logs of the manual-trigger-required-for-fork CI task together with more detailed instructions.
[ ] If adding a new field to a query, ensure that the schema and dependent downstream schemas have been updated.
[ ] When adding a new derived dataset, ensure that the data is not already available (fully or partially), and recommend extending an existing dataset rather than creating a new one. Data can be available in the bigquery-etl repository, looker-hub or in looker-spoke-default.
For modifications to schemas in restricted namespaces (see CODEOWNERS):