Closed sean-rose closed 1 week ago
`metrics` fields to select in Glean union views.
I was worried doing this might cause BigQuery to always scan all `metrics` columns since they're being restructured in the view query. However, in a minimal test I dry-ran queries that count the rows with a particular boolean metric set to true, and BigQuery reported the same number of bytes scanned both with and without this change. So it appears BigQuery can optimize through the restructuring logic and scan only the data that's needed (at least with simple `STRUCT()` calls like in this case).
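For readers who want to picture the kind of restructuring being discussed, here is a minimal sketch of generating an explicit `STRUCT()` expression for `metrics` from a list of field names. The helper name `build_metrics_struct` and the schema shape (metric type mapped to metric names) are assumptions for illustration only, not the generator's actual code.

```python
from typing import Dict, List


def build_metrics_struct(metrics_schema: Dict[str, List[str]]) -> str:
    """Build an explicit STRUCT(...) expression for a `metrics` column.

    `metrics_schema` is a hypothetical mapping of metric type
    (e.g. "boolean") to the metric names to select for that type.
    """
    type_structs = []
    for metric_type, names in metrics_schema.items():
        # Select each metric explicitly instead of the whole sub-struct.
        fields = ", ".join(
            f"metrics.`{metric_type}`.`{name}` AS `{name}`" for name in names
        )
        type_structs.append(f"STRUCT({fields}) AS `{metric_type}`")
    return "STRUCT(" + ", ".join(type_structs) + ") AS metrics"


if __name__ == "__main__":
    # Hypothetical metric names, for illustration only.
    print(build_metrics_struct({"boolean": ["example_flag"], "counter": ["example_count"]}))
```

Because every field is spelled out, the generated SQL grows with the number of metrics, which is where the length concern below comes from.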
One thing we'll need to watch out for is this can make the view SQL significantly longer. For example, with this change sql/moz-fx-data-shared-prod/fenix/metrics/view.sql would increase from ~99k characters to ~225k characters, which would be getting close to BigQuery's 256k character limit for view SQL (I'm going to ask Google whether that limit can be increased).
You may have already figured this out, but `use_counters` is even bigger than the `metrics` view (783k characters). whd pointed out that we should be able to get Jenkins to trigger artifact deployment soon, so this might stop being an issue, although it would still be a bit of ETL quirkiness: https://bugzilla.mozilla.org/show_bug.cgi?id=1883727#c3
Yeah, with this change `fenix.use_counters` ends up being 783k characters and `focus_android.use_counters` ends up being 469k characters. I missed those initially because I was only running the `glean_usage` generator locally without running the `stable_views` generator first.
I'm going to add a workaround for that, then ping you and @scholtzan for re-review, and y'all can weigh in on whether it's worth merging this, or if it would be better to just wait for the Jenkins-triggered artifact deployment.
`metrics` in Glean union views if the generated SQL exceeds the character limit.
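A length-based fallback like the one this commit title describes could be sketched roughly as follows. This is only an illustration under assumptions: `choose_view_sql` is a hypothetical helper, the limit is the "256k characters" figure from the discussion read as 256 * 1024, and what the fallback SQL actually selects is not shown on this page.

```python
# Assumption: "256k characters" from the discussion, read as 256 * 1024.
VIEW_SQL_CHAR_LIMIT = 256 * 1024


def choose_view_sql(
    explicit_sql: str,
    fallback_sql: str,
    limit: int = VIEW_SQL_CHAR_LIMIT,
) -> str:
    """Return the explicit-fields SQL if it fits within the limit, else the fallback."""
    if len(explicit_sql) <= limit:
        return explicit_sql
    return fallback_sql


if __name__ == "__main__":
    # Sizes quoted in the thread: ~225k fits under the assumed limit, 783k does not.
    short_sql = "x" * 225_000
    long_sql = "x" * 783_000
    print(choose_view_sql(short_sql, "fallback") is short_sql)
    print(choose_view_sql(long_sql, "fallback"))
```

Keeping the limit as a parameter means it can be raised in one place if the limit is ever increased.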
To try to avoid problems when the underlying table/view schemas change due to new metrics being added.
Checklist for reviewer:
`<username>:<branch>` of the fork as parameter. The parameter will also show up in the logs of the `manual-trigger-required-for-fork` CI task together with more detailed instructions.
For modifications to schemas in restricted namespaces (see `CODEOWNERS`):