mozilla / bigquery-etl

Bigquery ETL
https://mozilla.github.io/bigquery-etl
Mozilla Public License 2.0

GLAM rewrite JS UDFs in SQL #5697

Closed edugfilho closed 2 weeks ago

edugfilho commented 1 month ago

Fixes https://github.com/mozilla/glam/issues/2840. @BenWu's suggestion in that issue was to set up small tests in which we'd replace the JS UDFs with something that does very little, to see the best-case performance gain.

Although I agree with that approach, I thought it would be a good idea to make all the conversions at once, especially for functional_buckets, which will probably be used more once Glean stops sending 0-count buckets. Since the UDFs already have some tests, I took Ben's samples and converted the other GLAM UDFs as well. Only one UDF hasn't been converted to SQL, because it isn't trivial (here, in case someone wants to take a shot at it).
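To make the functional_buckets conversion easier to reason about, here is a minimal sketch that runs the original JS logic next to a step-by-step JavaScript mirror of the SQL rewrite from the diff. The names `jsFunctionalBuckets` and `sqlStyleFunctionalBuckets` are hypothetical and exist only for this illustration. The second function only imitates the SQL clauses in plain JavaScript, so it does not model BigQuery-specific behavior such as NULL handling or FLOAT64 output types.

```javascript
// Original JS UDF logic from the diff, with the UDF arguments passed
// explicitly. The loop bound is hoisted out of the loop; the result is
// unchanged because it does not depend on the loop variable.
function jsFunctionalBuckets(logBase, bucketsPerMagnitude, rangeMax) {
  const exponent = Math.pow(logBase, 1.0 / bucketsPerMagnitude);
  const maxIndex = Math.ceil(Math.log(rangeMax + 1) / Math.log(exponent));
  const buckets = new Set([0]);
  for (let index = 0; index < maxIndex; index++) {
    const bucket = Math.floor(Math.pow(logBase, index / bucketsPerMagnitude));
    // The index bound overshoots, so stop once the bucket exceeds the range.
    if (bucket > rangeMax) {
      break;
    }
    buckets.add(bucket);
  }
  return [...buckets];
}

// Hypothetical mirror of the SQL rewrite, one statement per SQL clause.
function sqlStyleFunctionalBuckets(logBase, bucketsPerMagnitude, rangeMax) {
  // GENERATE_ARRAY(0, CEIL(LOG(range_max + 1, log_base) * buckets_per_magnitude))
  // Note that GENERATE_ARRAY includes its end value.
  const lastIndex = Math.ceil(
    (Math.log(rangeMax + 1) / Math.log(logBase)) * bucketsPerMagnitude
  );
  const indexes = Array.from({ length: lastIndex + 1 }, (_, i) => i);

  // FLOOR(POW(log_base, idx / buckets_per_magnitude)) ... WHERE bucket <= range_max
  const kept = indexes
    .map((idx) => Math.floor(Math.pow(logBase, idx / bucketsPerMagnitude)))
    .filter((bucket) => bucket <= rangeMax);

  // ARRAY_CONCAT([0.0], ARRAY_AGG(DISTINCT bucket ORDER BY bucket))
  const distinctSorted = [...new Set(kept)].sort((a, b) => a - b);
  return [0, ...distinctSorted];
}

console.log(jsFunctionalBuckets(2, 1, 10)); // [ 0, 1, 2, 4, 8 ]
console.log(sqlStyleFunctionalBuckets(2, 1, 10)); // [ 0, 1, 2, 4, 8 ]
```

The SQL version generates one more index than the JS loop visits (the end of `GENERATE_ARRAY` is inclusive), and the `WHERE ... <= range_max` filter is what drops the extra bucket. The sketch only exercises positive `range_max` values; edge cases would still need to be checked against BigQuery itself.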

dataops-ci-bot commented 1 month ago

Integration report for "formatting"

sql.diff

```diff
Only in /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix_derived/retention_v1: backfill.yaml
Only in /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_ios_derived/retention_v1: backfill.yaml
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql
--- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql 2024-05-31 16:15:30.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql 2024-05-31 16:15:55.000000000 +0000
@@ -2,21 +2,24 @@
 CREATE OR REPLACE FUNCTION glam.histogram_cast_json(
   histogram ARRAY>
 )
-RETURNS STRING DETERMINISTIC
-LANGUAGE js
-AS
-  '''
-    let obj = {};
-    histogram.map(r => {
-        obj[r.key] = parseFloat(r.value.toFixed(4));
-    });
-    return JSON.stringify(obj);
-''';
+RETURNS STRING AS (
+  IF(
+    ARRAY_LENGTH(histogram) = 0,
+    "{}",
+    TO_JSON_STRING(
+      JSON_OBJECT(
+        ARRAY(SELECT key FROM UNNEST(histogram)),
+        ARRAY(SELECT ROUND(value, 4) FROM UNNEST(histogram))
+      )
+    )
+  )
+);

 SELECT
   assert.equals(
-    '{"0":0.1111,"1":0.6667,"2":0}',
+    '{"0":0.1111,"1":0.6667,"2":0.0}',
     glam.histogram_cast_json(
       ARRAY>[("0", 0.111111), ("1", 2.0 / 3), ("2", 0)]
     )
-  )
+  ),
+  assert.equals('{}', glam.histogram_cast_json(ARRAY>[])),
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql
--- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql 2024-05-31 16:15:30.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql 2024-05-31 16:15:55.000000000 +0000
@@ -4,34 +4,29 @@
   buckets_per_magnitude INT64,
   range_max INT64
 )
-RETURNS ARRAY DETERMINISTIC
-LANGUAGE js
-AS
-  '''
-  function sample_to_bucket_index(sample) {
-    // Get the index of the sample
-    // https://github.com/mozilla/glean/blob/main/glean-core/src/histogram/functional.rs
-    let exponent = Math.pow(log_base, 1.0/buckets_per_magnitude);
-    return Math.ceil(Math.log(sample + 1) / Math.log(exponent));
-  }
-
-  let buckets = new Set([0]);
-  for (let index = 0; index < sample_to_bucket_index(range_max); index++) {
-
-    // Avoid re-using the exponent due to floating point issues when carrying
-    // the `pow` operation e.g. `let exponent = ...; Math.pow(exponent, index)`.
-    let bucket = Math.floor(Math.pow(log_base, index/buckets_per_magnitude));
-
-    // NOTE: the sample_to_bucket_index implementation overshoots the true index,
-    // so we break out early if we hit the max bucket range.
-    if (bucket > range_max) {
-      break;
-    }
-    buckets.add(bucket);
-  }
-
-  return [...buckets]
-''';
+RETURNS ARRAY AS (
+  (
+    WITH bucket_indexes AS (
+      -- Generate all bucket indexes
+      -- https://github.com/mozilla/glean/blob/main/glean-core/src/histogram/functional.rs
+      SELECT
+        GENERATE_ARRAY(0, CEIL(LOG(range_max + 1, log_base) * buckets_per_magnitude)) AS indexes
+    ),
+    buckets AS (
+      SELECT
+        FLOOR(POW(log_base, (idx) / buckets_per_magnitude)) AS bucket
+      FROM
+        bucket_indexes,
+        UNNEST(indexes) AS idx
+      WHERE
+        FLOOR(POW(log_base, (idx) / buckets_per_magnitude)) <= range_max
+    )
+    SELECT
+      ARRAY_CONCAT([0.0], ARRAY_AGG(DISTINCT(bucket) ORDER BY bucket))
+    FROM
+      buckets
+  )
+);

 SELECT
   -- First 50 keys of a timing distribution
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql
--- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql 2024-05-31 16:15:30.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql 2024-05-31 16:15:55.000000000 +0000
@@ -4,17 +4,17 @@
   max FLOAT64,
   nBuckets FLOAT64
 )
-RETURNS ARRAY DETERMINISTIC
-LANGUAGE js
-AS
-  '''
-  let result = [0];
-  for (let i = 1; i < Math.min(nBuckets, max, 10000); i++) {
-    let linearRange = (min * (nBuckets - 1 - i) + max * (i - 1)) / (nBuckets - 2);
-    result.push(Math.round(linearRange));
-  }
-  return result;
-''';
+RETURNS ARRAY AS (
+  ARRAY_CONCAT(
+    [0.0],
+    ARRAY(
+      SELECT
+        ROUND((min * (nBuckets - 1 - i) + max * (i - 1)) / (nBuckets - 2))
+      FROM
+        UNNEST(GENERATE_ARRAY(1, LEAST(nBuckets - 1, max, 10000))) AS i
+    )
+  )
+);

 SELECT
   -- Buckets of CONTENT_FRAME_TIME_VSYNC
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql
--- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql 2024-05-31 16:15:30.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql 2024-05-31 16:15:55.000000000 +0000
@@ -4,17 +4,14 @@
   max_bucket FLOAT64,
   num_buckets INT64
 )
-RETURNS ARRAY DETERMINISTIC
-LANGUAGE js
-AS
-  '''
-  let bucket_size = (max_bucket - min_bucket) / num_buckets;
-  let buckets = new Set();
-  for (let bucket = min_bucket; bucket < max_bucket; bucket += bucket_size) {
-    buckets.add(Math.pow(2, bucket).toFixed(2));
-  }
-  return Array.from(buckets);
-''';
+RETURNS ARRAY AS (
+  ARRAY(
+    SELECT
+      ROUND(POW(2, (max_bucket - min_bucket) / num_buckets * val), 2)
+    FROM
+      UNNEST(GENERATE_ARRAY(0, num_buckets - 1)) AS val
+  )
+);

 SELECT
   assert.array_equals([1, 2, 4, 8], glam.histogram_generate_scalar_buckets(0, LOG(16, 2), 4)),
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/percentile/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/percentile/udf.sql
--- /tmp/workspace/main-generated-sql/sql/mozfun/glam/percentile/udf.sql 2024-05-31 16:15:30.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/mozfun/glam/percentile/udf.sql 2024-05-31 16:15:55.000000000 +0000
@@ -1,37 +1,41 @@
 -- udf_js.glean_percentile
 CREATE OR REPLACE FUNCTION glam.percentile(
-  percentile FLOAT64,
+  pct FLOAT64,
   histogram ARRAY>,
   type STRING
 )
-RETURNS FLOAT64 DETERMINISTIC
-LANGUAGE js
-AS
-  '''
-  if (percentile < 0 || percentile > 100) {
-    throw "percentile must be a value between 0 and 100";
-  }
-
-  let values = histogram.map(bucket => bucket.value);
-  let total = values.reduce((a, b) => a + b);
-  let normalized = values.map(value => value / total);
-
-  // Find the index into the cumulative distribution function that corresponds
-  // to the percentile. This undershoots the true value of the percentile.
-  let acc = 0;
-  let index = null;
-  for (let i = 0; i < normalized.length; i++) {
-    acc += normalized[i];
-    index = i;
-    if (acc >= percentile / 100) {
-      break;
-    }
-  }
-
-  // NOTE: we do not perform geometric or linear interpolation, but this would
-  // be the place to implement it.
-  return histogram[index].key;
-''';
+RETURNS FLOAT64 AS (
+  (
+    WITH check AS (
+      SELECT
+        IF(
+          pct >= 0
+          AND pct <= 100,
+          TRUE,
+          ERROR('percentile must be a value between 0 and 100')
+        ) pct_ok
+    ),
+    keyed_cum_sum AS (
+      SELECT
+        key,
+        SUM(value) OVER (ORDER BY key) / SUM(value) OVER () AS cum_sum
+      FROM
+        UNNEST(histogram)
+    )
+    SELECT
+      CAST(key AS FLOAT64)
+    FROM
+      keyed_cum_sum,
+      check
+    WHERE
+      check.pct_ok
+      AND cum_sum >= pct / 100
+    ORDER BY
+      cum_sum
+    LIMIT
+      1
+  )
+);

 SELECT
   assert.equals(
@@ -41,6 +45,22 @@
       ARRAY>[("0", 1), ("2", 2), ("3", 1)],
       "timing_distribution"
     )
+  ),
+  assert.equals(
+    3,
+    glam.percentile(
+      100.0,
+      ARRAY>[("0", 1), ("2", 2), ("3", 1)],
+      "timing_distribution"
+    )
+  ),
+  assert.equals(
+    0,
+    glam.percentile(
+      0.0,
+      ARRAY>[("0", 1), ("2", 2), ("3", 1)],
+      "timing_distribution"
+    )
   );

 #xfail
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/event_monitoring_live_v1/materialized_view.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/event_monitoring_live_v1/materialized_view.sql 2024-05-31 16:16:10.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/event_monitoring_live_v1/materialized_view.sql 2024-05-31 16:17:41.000000000 +0000
@@ -50,7 +50,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.bedrock_live.non_interaction_v1`
+      `moz-fx-data-shared-prod.bedrock_live.events_v1`
     UNION ALL
     SELECT
       submission_timestamp,
@@ -70,7 +70,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.bedrock_live.events_v1`
+      `moz-fx-data-shared-prod.bedrock_live.non_interaction_v1`
   )
 CROSS JOIN
   UNNEST(events) AS event,
diff -bur --no-dereference --new-file
/tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates/schema.yaml 2024-05-31 16:15:29.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates/schema.yaml 2024-05-31 16:23:48.000000000 +0000 @@ -1,49 +1,49 @@ fields: -- mode: NULLABLE - name: submission_date +- name: submission_date type: DATE -- mode: NULLABLE - name: source + mode: NULLABLE +- name: source type: STRING -- mode: NULLABLE - name: event_type + mode: NULLABLE +- name: event_type type: STRING -- mode: NULLABLE - name: form_factor + mode: NULLABLE +- name: form_factor type: STRING -- mode: NULLABLE - name: country + mode: NULLABLE +- name: country type: STRING -- mode: NULLABLE - name: subdivision1 + mode: NULLABLE +- name: subdivision1 type: STRING -- mode: NULLABLE - name: advertiser + mode: NULLABLE +- name: advertiser type: STRING -- mode: NULLABLE - name: release_channel + mode: NULLABLE +- name: release_channel type: STRING -- mode: NULLABLE - name: position + mode: NULLABLE +- name: position type: INTEGER -- mode: NULLABLE - name: provider + mode: NULLABLE +- name: provider type: STRING -- mode: NULLABLE - name: match_type + mode: NULLABLE +- name: match_type type: STRING -- mode: NULLABLE - name: normalized_os + mode: NULLABLE +- name: normalized_os type: STRING -- mode: NULLABLE - name: suggest_data_sharing_enabled + mode: NULLABLE +- name: suggest_data_sharing_enabled type: BOOLEAN -- mode: NULLABLE - name: event_count + mode: NULLABLE +- name: event_count type: INTEGER -- mode: NULLABLE - name: user_count + mode: NULLABLE +- name: user_count type: INTEGER -- mode: NULLABLE - name: query_type + mode: NULLABLE +- name: query_type type: STRING + mode: NULLABLE diff -bur 
--no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates_suggest/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates_suggest/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates_suggest/schema.yaml 2024-05-31 16:15:29.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates_suggest/schema.yaml 2024-05-31 16:23:48.000000000 +0000 @@ -1,40 +1,40 @@ fields: -- mode: NULLABLE - name: submission_date +- name: submission_date type: DATE -- mode: NULLABLE - name: form_factor + mode: NULLABLE +- name: form_factor type: STRING -- mode: NULLABLE - name: country + mode: NULLABLE +- name: country type: STRING -- mode: NULLABLE - name: advertiser + mode: NULLABLE +- name: advertiser type: STRING -- mode: NULLABLE - name: normalized_os + mode: NULLABLE +- name: normalized_os type: STRING -- mode: NULLABLE - name: release_channel + mode: NULLABLE +- name: release_channel type: STRING -- mode: NULLABLE - name: position + mode: NULLABLE +- name: position type: INTEGER -- mode: NULLABLE - name: provider + mode: NULLABLE +- name: provider type: STRING -- mode: NULLABLE - name: match_type + mode: NULLABLE +- name: match_type type: STRING -- mode: NULLABLE - name: suggest_data_sharing_enabled + mode: NULLABLE +- name: suggest_data_sharing_enabled type: BOOLEAN -- mode: NULLABLE - name: impression_count + mode: NULLABLE +- name: impression_count type: INTEGER -- mode: NULLABLE - name: click_count + mode: NULLABLE +- name: click_count type: INTEGER -- mode: NULLABLE - name: query_type + mode: NULLABLE +- name: query_type type: STRING + mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_clients/schema.yaml 
/tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_clients/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_clients/schema.yaml 2024-05-31 16:15:29.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_clients/schema.yaml 2024-05-31 16:23:00.000000000 +0000 @@ -26,6 +26,9 @@ - name: adjust_network type: STRING mode: NULLABLE +- name: install_source + type: STRING + mode: NULLABLE - name: retained_week_2 type: BOOLEAN mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_week_4/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_week_4/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_week_4/schema.yaml 2024-05-31 16:15:29.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_week_4/schema.yaml 2024-05-31 16:23:00.000000000 +0000 @@ -48,6 +48,10 @@ description: 'The type of source of a client installation. ' +- name: install_source + type: STRING + mode: NULLABLE + description: null - name: new_profiles type: INTEGER mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix_derived/retention_v1/backfill.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix_derived/retention_v1/backfill.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix_derived/retention_v1/backfill.yaml 2024-05-31 16:15:30.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix_derived/retention_v1/backfill.yaml 1970-01-01 00:00:00.000000000 +0000 @@ -1,7 +0,0 @@ -2024-05-31: - start_date: 2021-01-01 - end_date: 2024-05-31 - reason: The table is created, this is to populate it with data. 
- watchers: - - kik@mozilla.com - status: Initiate diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_tasks_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_tasks_derived/event_monitoring_live_v1/materialized_view.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_tasks_derived/event_monitoring_live_v1/materialized_view.sql 2024-05-31 16:16:09.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_tasks_derived/event_monitoring_live_v1/materialized_view.sql 2024-05-31 16:17:41.000000000 +0000 @@ -50,7 +50,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_background_tasks_live.background_tasks_v1` + `moz-fx-data-shared-prod.firefox_desktop_background_tasks_live.events_v1` UNION ALL SELECT submission_timestamp, @@ -60,7 +60,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_background_tasks_live.events_v1` + `moz-fx-data-shared-prod.firefox_desktop_background_tasks_live.background_tasks_v1` ) CROSS JOIN UNNEST(events) AS event, diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/event_monitoring_live_v1/materialized_view.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/event_monitoring_live_v1/materialized_view.sql 2024-05-31 16:16:10.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/event_monitoring_live_v1/materialized_view.sql 2024-05-31 16:17:42.000000000 +0000 @@ -50,7 +50,7 @@ 
client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_live.prototype_no_code_events_v1` + `moz-fx-data-shared-prod.firefox_desktop_live.newtab_v1` UNION ALL SELECT submission_timestamp, @@ -60,7 +60,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_live.urlbar_potential_exposure_v1` + `moz-fx-data-shared-prod.firefox_desktop_live.prototype_no_code_events_v1` UNION ALL SELECT submission_timestamp, @@ -70,7 +70,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_live.newtab_v1` + `moz-fx-data-shared-prod.firefox_desktop_live.events_v1` UNION ALL SELECT submission_timestamp, @@ -80,7 +80,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_live.events_v1` + `moz-fx-data-shared-prod.firefox_desktop_live.urlbar_potential_exposure_v1` ) CROSS JOIN UNNEST(events) AS event, diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_ios_derived/retention_v1/backfill.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_ios_derived/retention_v1/backfill.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_ios_derived/retention_v1/backfill.yaml 2024-05-31 16:15:30.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_ios_derived/retention_v1/backfill.yaml 1970-01-01 00:00:00.000000000 +0000 @@ -1,7 +0,0 @@ -2024-05-31: - start_date: 2021-01-01 - end_date: 2024-05-31 - reason: The table is created, this is to populate it with data. 
- watchers: - - kik@mozilla.com - status: Initiate diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/android_app_campaign_stats_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/android_app_campaign_stats_v1/query.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/android_app_campaign_stats_v1/query.sql 2024-05-31 16:15:30.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/android_app_campaign_stats_v1/query.sql 2024-05-31 16:15:55.000000000 +0000 @@ -78,11 +78,7 @@ SELECT date, campaigns_v2.campaign_name AS campaign, - CASE - WHEN LOWER(mozfun.map.get_key(campaigns_v2.campaign_segments, "region")) = "expansion" - THEN "Expansion" - ELSE UPPER(mozfun.map.get_key(campaigns_v2.campaign_segments, "region")) - END AS campaign_region, + UPPER(mozfun.map.get_key(campaigns_v2.campaign_segments, "region")) AS campaign_region, UPPER( mozfun.map.get_key(campaigns_v2.campaign_segments, "country_code") ) AS campaign_country_code, diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/internet_outages/global_outages_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/internet_outages/global_outages_v1/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/internet_outages/global_outages_v1/schema.yaml 2024-05-31 16:15:29.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/internet_outages/global_outages_v1/schema.yaml 2024-05-31 16:23:35.000000000 +0000 @@ -1,49 +1,49 @@ fields: -- mode: NULLABLE - name: country +- name: country type: STRING -- mode: NULLABLE - name: city + mode: NULLABLE +- name: city type: STRING -- mode: NULLABLE - name: datetime + mode: NULLABLE +- name: datetime type: TIMESTAMP -- mode: NULLABLE - name: proportion_undefined + mode: NULLABLE +- name: 
proportion_undefined type: FLOAT -- mode: NULLABLE - name: proportion_timeout + mode: NULLABLE +- name: proportion_timeout type: FLOAT -- mode: NULLABLE - name: proportion_abort + mode: NULLABLE +- name: proportion_abort type: FLOAT -- mode: NULLABLE - name: proportion_unreachable + mode: NULLABLE +- name: proportion_unreachable type: FLOAT -- mode: NULLABLE - name: proportion_terminated + mode: NULLABLE +- name: proportion_terminated type: FLOAT -- mode: NULLABLE - name: proportion_channel_open + mode: NULLABLE +- name: proportion_channel_open type: FLOAT -- mode: NULLABLE - name: avg_dns_success_time + mode: NULLABLE +- name: avg_dns_success_time type: FLOAT -- mode: NULLABLE - name: missing_dns_success + mode: NULLABLE +- name: missing_dns_success type: FLOAT -- mode: NULLABLE - name: avg_dns_failure_time + mode: NULLABLE +- name: avg_dns_failure_time type: FLOAT -- mode: NULLABLE - name: missing_dns_failure + mode: NULLABLE +- name: missing_dns_failure type: FLOAT -- mode: NULLABLE - name: count_dns_failure + mode: NULLABLE +- name: count_dns_failure type: FLOAT -- mode: NULLABLE - name: ssl_error_prop + mode: NULLABLE +- name: ssl_error_prop type: FLOAT -- mode: NULLABLE - name: avg_tls_handshake_time + mode: NULLABLE +- name: avg_tls_handshake_time type: FLOAT + mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/monitoring_derived/event_monitoring_aggregates_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/monitoring_derived/event_monitoring_aggregates_v1/query.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/monitoring_derived/event_monitoring_aggregates_v1/query.sql 2024-05-31 16:16:09.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/monitoring_derived/event_monitoring_aggregates_v1/query.sql 2024-05-31 16:19:18.000000000 +0000 @@ -45,7 +45,7 @@ client_info.app_display_version AS version, ping_info FROM - 
`moz-fx-data-shared-prod.firefox_desktop_stable.prototype_no_code_events_v1` + `moz-fx-data-shared-prod.firefox_desktop_stable.newtab_v1` UNION ALL SELECT submission_timestamp, @@ -55,7 +55,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_stable.urlbar_potential_exposure_v1` + `moz-fx-data-shared-prod.firefox_desktop_stable.prototype_no_code_events_v1` UNION ALL SELECT submission_timestamp, @@ -65,7 +65,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_stable.newtab_v1` + `moz-fx-data-shared-prod.firefox_desktop_stable.events_v1` UNION ALL SELECT submission_timestamp, @@ -75,7 +75,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_stable.events_v1` + `moz-fx-data-shared-prod.firefox_desktop_stable.urlbar_potential_exposure_v1` ) CROSS JOIN UNNEST(events) AS event, @@ -572,7 +572,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefox_stable.metrics_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefox_stable.events_v1` UNION ALL SELECT submission_timestamp, @@ -582,7 +582,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefox_stable.first_session_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefox_stable.metrics_v1` UNION ALL SELECT submission_timestamp, @@ -592,7 +592,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefox_stable.events_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefox_stable.first_session_v1` ) CROSS JOIN UNNEST(events) AS event, @@ -663,7 +663,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_stable.metrics_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_stable.events_v1` UNION ALL SELECT submission_timestamp, @@ 
-673,7 +673,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_stable.first_session_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_stable.metrics_v1` UNION ALL SELECT submission_timestamp, @@ -683,7 +683,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_stable.events_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_stable.first_session_v1` ) CROSS JOIN UNNEST(events) AS event, @@ -754,7 +754,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_fennec_stable.metrics_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_fennec_stable.events_v1` UNION ALL SELECT submission_timestamp, @@ -764,7 +764,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_fennec_stable.first_session_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_fennec_stable.metrics_v1` UNION ALL SELECT submission_timestamp, @@ -774,7 +774,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_fennec_stable.events_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_fennec_stable.first_session_v1` ) CROSS JOIN UNNEST(events) AS event, @@ -1565,7 +1565,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.mozillavpn_stable.vpnsession_v1` + `moz-fx-data-shared-prod.mozillavpn_stable.daemonsession_v1` UNION ALL SELECT submission_timestamp, @@ -1575,7 +1575,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.mozillavpn_stable.daemonsession_v1` + `moz-fx-data-shared-prod.mozillavpn_stable.vpnsession_v1` ) CROSS JOIN UNNEST(events) AS event, @@ -1656,7 +1656,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_firefox_vpn_stable.vpnsession_v1` + 
`moz-fx-data-shared-prod.org_mozilla_firefox_vpn_stable.daemonsession_v1` UNION ALL SELECT submission_timestamp, @@ -1666,7 +1666,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_firefox_vpn_stable.daemonsession_v1` + `moz-fx-data-shared-prod.org_mozilla_firefox_vpn_stable.vpnsession_v1` ) CROSS JOIN UNNEST(events) AS event, @@ -1747,7 +1747,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_stable.vpnsession_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_stable.daemonsession_v1` UNION ALL SELECT submission_timestamp, @@ -1757,7 +1757,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_stable.daemonsession_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_stable.vpnsession_v1` ) CROSS JOIN UNNEST(events) AS event, @@ -1838,7 +1838,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_network_extension_stable.vpnsession_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_network_extension_stable.daemonsession_v1` UNION ALL SELECT submission_timestamp, @@ -1848,7 +1848,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_network_extension_stable.daemonsession_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_network_extension_stable.vpnsession_v1` ) CROSS JOIN UNNEST(events) AS event, @@ -1990,7 +1990,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.bedrock_stable.non_interaction_v1` + `moz-fx-data-shared-prod.bedrock_stable.events_v1` UNION ALL SELECT submission_timestamp, @@ -2010,7 +2010,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.bedrock_stable.events_v1` + `moz-fx-data-shared-prod.bedrock_stable.non_interaction_v1` ) CROSS JOIN 
UNNEST(events) AS event, @@ -2162,7 +2162,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_background_tasks_stable.background_tasks_v1` + `moz-fx-data-shared-prod.firefox_desktop_background_tasks_stable.events_v1` UNION ALL SELECT submission_timestamp, @@ -2172,7 +2172,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_background_tasks_stable.events_v1` + `moz-fx-data-shared-prod.firefox_desktop_background_tasks_stable.background_tasks_v1` ) CROSS JOIN UNNEST(events) AS event, diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/mozillavpn_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/mozillavpn_derived/event_monitoring_live_v1/materialized_view.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/mozillavpn_derived/event_monitoring_live_v1/materialized_view.sql 2024-05-31 16:16:10.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/mozillavpn_derived/event_monitoring_live_v1/materialized_view.sql 2024-05-31 16:17:41.000000000 +0000 @@ -60,7 +60,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.mozillavpn_live.vpnsession_v1` + `moz-fx-data-shared-prod.mozillavpn_live.daemonsession_v1` UNION ALL SELECT submission_timestamp, @@ -70,7 +70,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.mozillavpn_live.daemonsession_v1` + `moz-fx-data-shared-prod.mozillavpn_live.vpnsession_v1` ) CROSS JOIN UNNEST(events) AS event, diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_fenix/geckoview_version/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_fenix/geckoview_version/schema.yaml --- 
/tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_fenix/geckoview_version/schema.yaml 2024-05-31 16:15:29.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_fenix/geckoview_version/schema.yaml 2024-05-31 16:24:13.000000000 +0000 @@ -1,7 +1,10 @@ fields: -- type: DATETIME - name: build_hour -- type: INTEGER - name: geckoview_major_version -- type: INTEGER - name: n_pings +- name: build_hour + type: DATETIME + mode: NULLABLE +- name: geckoview_major_version + type: INTEGER + mode: NULLABLE +- name: n_pings + type: INTEGER + mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_firefox_vpn_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_firefox_vpn_derived/event_monitoring_live_v1/materialized_view.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_firefox_vpn_derived/event_monitoring_live_v1/materialized_view.sql 2024-05-31 16:16:10.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_firefox_vpn_derived/event_monitoring_live_v1/materialized_view.sql 2024-05-31 16:17:42.000000000 +0000 @@ -60,7 +60,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_firefox_vpn_live.vpnsession_v1` + `moz-fx-data-shared-prod.org_mozilla_firefox_vpn_live.daemonsession_v1` UNION ALL SELECT submission_timestamp, @@ -70,7 +70,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_firefox_vpn_live.daemonsession_v1` + `moz-fx-data-shared-prod.org_mozilla_firefox_vpn_live.vpnsession_v1` ) CROSS JOIN UNNEST(events) AS event, diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_fennec_derived/event_monitoring_live_v1/materialized_view.sql 
/tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_fennec_derived/event_monitoring_live_v1/materialized_view.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_fennec_derived/event_monitoring_live_v1/materialized_view.sql 2024-05-31 16:16:10.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_fennec_derived/event_monitoring_live_v1/materialized_view.sql 2024-05-31 16:17:43.000000000 +0000 @@ -50,7 +50,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_fennec_live.metrics_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_fennec_live.events_v1` UNION ALL SELECT submission_timestamp, @@ -60,7 +60,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_fennec_live.first_session_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_fennec_live.metrics_v1` UNION ALL SELECT submission_timestamp, @@ -70,7 +70,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_fennec_live.events_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_fennec_live.first_session_v1` ) CROSS JOIN UNNEST(events) AS event, diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_firefoxbeta_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_firefoxbeta_derived/event_monitoring_live_v1/materialized_view.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_firefoxbeta_derived/event_monitoring_live_v1/materialized_view.sql 2024-05-31 16:16:10.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_firefoxbeta_derived/event_monitoring_live_v1/materialized_view.sql 2024-05-31 16:17:43.000000000 +0000 @@ -50,7 +50,7 @@ client_info.app_display_version AS version, 
ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_live.metrics_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_live.events_v1` UNION ALL SELECT submission_timestamp, @@ -60,7 +60,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_live.first_session_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_live.metrics_v1` UNION ALL SELECT submission_timestamp, @@ -70,7 +70,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_live.events_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_live.first_session_v1` ) CROSS JOIN UNNEST(events) AS event, diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_firefox_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_firefox_derived/event_monitoring_live_v1/materialized_view.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_firefox_derived/event_monitoring_live_v1/materialized_view.sql 2024-05-31 16:16:10.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_firefox_derived/event_monitoring_live_v1/materialized_view.sql 2024-05-31 16:17:43.000000000 +0000 @@ -50,7 +50,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefox_live.metrics_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefox_live.events_v1` UNION ALL SELECT submission_timestamp, @@ -60,7 +60,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefox_live.first_session_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefox_live.metrics_v1` UNION ALL SELECT submission_timestamp, @@ -70,7 +70,7 @@ client_info.app_display_version AS version, ping_info FROM - 
`moz-fx-data-shared-prod.org_mozilla_ios_firefox_live.events_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefox_live.first_session_v1` ) CROSS JOIN UNNEST(events) AS event, diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_firefoxvpn_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_firefoxvpn_derived/event_monitoring_live_v1/materialized_view.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_firefoxvpn_derived/event_monitoring_live_v1/materialized_view.sql 2024-05-31 16:16:10.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_firefoxvpn_derived/event_monitoring_live_v1/materialized_view.sql 2024-05-31 16:17:44.000000000 +0000 @@ -60,7 +60,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_live.vpnsession_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_live.daemonsession_v1` UNION ALL SELECT submission_timestamp, @@ -70,7 +70,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_live.daemonsession_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_live.vpnsession_v1` ) CROSS JOIN UNNEST(events) AS event, diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_firefoxvpn_network_extension_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_firefoxvpn_network_extension_derived/event_monitoring_live_v1/materialized_view.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_firefoxvpn_network_extension_derived/event_monitoring_live_v1/materialized_view.sql 2024-05-31 16:16:10.000000000 +0000 +++ 
/tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_firefoxvpn_network_extension_derived/event_monitoring_live_v1/materialized_view.sql 2024-05-31 16:17:43.000000000 +0000 @@ -60,7 +60,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_network_extension_live.vpnsession_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_network_extension_live.daemonsession_v1` UNION ALL SELECT submission_timestamp, @@ -70,7 +70,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_network_extension_live.daemonsession_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_network_extension_live.vpnsession_v1` ) CROSS JOIN UNNEST(events) AS event, diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/search_derived/mobile_search_clients_daily_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/search_derived/mobile_search_clients_daily_v1/query.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/search_derived/mobile_search_clients_daily_v1/query.sql 2024-05-31 16:15:29.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/search_derived/mobile_search_clients_daily_v1/query.sql 2024-05-31 16:17:40.000000000 +0000 @@ -1,6 +1,4 @@ --- Query generated by ./bqetl generate search --- This file doesn't get overwritten by the generator. The generator output needs --- to be written to this file manually. 
+-- Query generated by bigquery-etl/search/mobile_search_clients_daily.py -- -- Older versions separate source and engine with an underscore instead of period -- Return array of form [source, engine] if key is valid, empty array otherwise @@ -498,7 +496,7 @@ metrics.counter.browser_total_uri_count, client_info.locale, FROM - org_mozilla_ios_klar.metrics AS org_mozilla_klar_metrics + org_mozilla_ios_klar.metrics AS org_mozilla_ios_klar_metrics ), fenix_baseline AS ( SELECT @@ -722,12 +720,6 @@ SUBSTR(search.key, STRPOS(search.key, '.') + 1), search.search_type ) - WHEN search.search_type = 'search-with-ads' - THEN IF( - REGEXP_CONTAINS(search.key, '\\.'), - SUBSTR(search.key, STRPOS(search.key, '.') + 1), - search.search_type - ) ELSE search.search_type END AS source, search.value AS search_count, @@ -780,8 +772,6 @@ CASE WHEN search_type = 'ad-click' THEN IF(STARTS_WITH(source, 'in-content.organic'), 'ad-click-organic', search_type) - WHEN search_type = 'search-with-ads' - THEN IF(STARTS_WITH(source, 'in-content.organic'), 'search-with-ads-organic', search_type) WHEN STARTS_WITH(source, 'in-content.sap.') THEN 'tagged-sap' WHEN REGEXP_CONTAINS(source, '^in-content.*-follow-on') @@ -864,15 +854,6 @@ ) ) AS search_with_ads, SUM( - IF( - search_type != 'search-with-ads-organic' - OR engine IS NULL - OR search_count > 10000, - 0, - search_count - ) - ) AS search_with_ads_organic, - SUM( IF(search_type != 'unknown' OR engine IS NULL OR search_count > 10000, 0, search_count) ) AS unknown, udf.mode_last(ARRAY_AGG(country)) AS country, @@ -891,7 +872,6 @@ ANY_VALUE(sample_id) AS sample_id, udf.map_mode_last(ARRAY_CONCAT_AGG(experiments)) AS experiments, SUM(total_uri_count) AS total_uri_count, - CAST(NULL AS STRING) AS normalized_engine FROM combined_search_clients WHERE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/ca_postal_districts_v1/schema.yaml 
/tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/ca_postal_districts_v1/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/ca_postal_districts_v1/schema.yaml 2024-05-31 16:15:29.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/ca_postal_districts_v1/schema.yaml 2024-05-31 16:24:48.000000000 +0000 @@ -1,9 +1,7 @@ fields: - name: postal_district_code type: STRING - mode: REQUIRED - description: One-character Canadian postal district code. + mode: NULLABLE - name: province_code type: STRING mode: NULLABLE - description: Two-character Canadian province/territory code (if any). diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/country_codes_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/country_codes_v1/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/country_codes_v1/schema.yaml 2024-05-31 16:15:29.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/country_codes_v1/schema.yaml 2024-05-31 16:24:48.000000000 +0000 @@ -1,47 +1,28 @@ fields: - name: name - description: Official country name per ISO 3166 type: STRING - mode: REQUIRED + mode: NULLABLE - name: code - description: ISO 3166 alpha-2 country code type: STRING - mode: REQUIRED + mode: NULLABLE - name: code_3 - description: ISO 3166 alpha-3 country code type: STRING - mode: REQUIRED + mode: NULLABLE - name: region_name - description: Region name. These are based on the UN Statistics Division standard - country or area codes for statistical use (M49), but with the "Americas" region - split into "North America" and "South America". type: STRING - mode: REQUIRED + mode: NULLABLE - name: subregion_name - description: Sub-region name. 
These are based on UN Statistics Division standard - country or area codes for statistical use (M49), but with the "Latin America and the - Caribbean" and "Sub-Saharan Africa" sub-regions split into more specific - sub-regions. type: STRING - mode: REQUIRED + mode: NULLABLE - name: pocket_available_on_newtab - description: Whether Pocket is available on the newtab page in this country. Note - that Pocket might only be available in certain locales/languages within a country. - type: BOOL - mode: REQUIRED + type: BOOLEAN + mode: NULLABLE - name: mozilla_vpn_available - description: Whether Mozilla VPN is available in this country. - type: BOOL - mode: REQUIRED + type: BOOLEAN + mode: NULLABLE - name: sponsored_tiles_available_on_newtab - description: Whether sponsored tiles are available on the newtab page in this country. - Note that Pocket might only be available in certain locales/languages within a - country. - type: BOOL - mode: REQUIRED + type: BOOLEAN + mode: NULLABLE - name: ads_value_tier - description: Lowercase label detailing the monetary value tier that Mozilla Ads - assign to that region based on market size and our existing products, e.g., tier - 1, tier 2, etc. type: STRING - mode: REQUIRED + mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/country_names_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/country_names_v1/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/country_names_v1/schema.yaml 2024-05-31 16:15:29.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/country_names_v1/schema.yaml 2024-05-31 16:24:48.000000000 +0000 @@ -1,10 +1,7 @@ fields: - name: name - description: An alias for a country's name (including misspellings and alternate - encodings). 
type: STRING - mode: REQUIRED + mode: NULLABLE - name: code - description: ISO 3166 alpha-2 country code type: STRING - mode: REQUIRED + mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/data_incidents_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/data_incidents_v1/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/data_incidents_v1/schema.yaml 2024-05-31 16:15:29.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/data_incidents_v1/schema.yaml 2024-05-31 16:24:48.000000000 +0000 @@ -1,22 +1,22 @@ fields: -- mode: NULLABLE - name: start_date +- name: start_date type: DATE -- mode: NULLABLE - name: end_date + mode: NULLABLE +- name: end_date type: DATE -- mode: NULLABLE - name: incident + mode: NULLABLE +- name: incident type: STRING -- mode: NULLABLE - name: description + mode: NULLABLE +- name: description type: STRING -- mode: NULLABLE - name: bug + mode: NULLABLE +- name: bug type: STRING -- mode: NULLABLE - name: product + mode: NULLABLE +- name: product type: STRING -- mode: NULLABLE - name: version + mode: NULLABLE +- name: version type: STRING + mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/iana_tls_cipher_suites/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/iana_tls_cipher_suites/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/iana_tls_cipher_suites/schema.yaml 2024-05-31 16:15:29.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/iana_tls_cipher_suites/schema.yaml 2024-05-31 16:24:48.000000000 +0000 @@ -1,27 +1,16 @@ fields: -- mode: NULLABLE - description: Hex value assigned to the TLS cipher, in format like "0x00,0x84"; note - some values are ranges or contain wildcards - name: value +- name: value type: STRING -- 
mode: NULLABLE - description: Human-readable name of the TLS cipher - name: description + mode: NULLABLE +- name: description type: STRING -- mode: NULLABLE - description: Any TLS cipher suite that is specified for use with DTLS MUST define - limits on the use of the associated AEAD function that preserves margins for both - confidentiality and integrity, as specified in [RFC-ietf-tls-dtls13-43] - name: dtls_ok + mode: NULLABLE +- name: dtls_ok type: BOOLEAN -- mode: NULLABLE - description: Whether the TLS cipher is recommended by the IETF. If an item is not - marked as "recommended", it does not necessarily mean that it is flawed; rather, - it indicates that the item either has not been through the IETF consensus process, - has limited applicability, or is intended only for specific use cases - name: recommended + mode: NULLABLE +- name: recommended type: BOOLEAN -- mode: NULLABLE - description: RFCs or associated reference material for the TLS cipher - name: reference + mode: NULLABLE +- name: reference type: STRING + mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/language_codes_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/language_codes_v1/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/language_codes_v1/schema.yaml 2024-05-31 16:15:29.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/language_codes_v1/schema.yaml 2024-05-31 16:24:48.000000000 +0000 @@ -1,17 +1,13 @@ fields: - name: code_3 - description: ISO 639 alpha-3 language code. type: STRING - mode: REQUIRED + mode: NULLABLE - name: code_2 - description: ISO 639 alpha-2 language code (if any). type: STRING mode: NULLABLE - name: name - description: Language name. type: STRING - mode: REQUIRED + mode: NULLABLE - name: other_names - description: Other names for the language (if any). 
type: STRING mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/monitoring_distinct_docids_notes_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/monitoring_distinct_docids_notes_v1/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/monitoring_distinct_docids_notes_v1/schema.yaml 2024-05-31 16:15:29.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/monitoring_distinct_docids_notes_v1/schema.yaml 2024-05-31 16:24:48.000000000 +0000 @@ -1,19 +1,19 @@ fields: -- mode: NULLABLE - name: start_date +- name: start_date type: DATE -- mode: NULLABLE - name: end_date + mode: NULLABLE +- name: end_date type: DATE -- mode: NULLABLE - name: document_namespace + mode: NULLABLE +- name: document_namespace type: STRING -- mode: NULLABLE - name: document_type + mode: NULLABLE +- name: document_type type: STRING -- mode: NULLABLE - name: notes + mode: NULLABLE +- name: notes type: STRING -- mode: NULLABLE - name: bug + mode: NULLABLE +- name: bug type: STRING + mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/monitoring_missing_columns_notes_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/monitoring_missing_columns_notes_v1/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/monitoring_missing_columns_notes_v1/schema.yaml 2024-05-31 16:15:29.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/monitoring_missing_columns_notes_v1/schema.yaml 2024-05-31 16:24:48.000000000 +0000 @@ -1,25 +1,25 @@ fields: -- mode: NULLABLE - name: start_date +- name: start_date type: DATE -- mode: NULLABLE - name: end_date + mode: NULLABLE +- name: end_date type: DATE -- mode: NULLABLE - name: document_namespace + mode: NULLABLE +- name: document_namespace type: STRING -- 
mode: NULLABLE - name: document_type + mode: NULLABLE +- name: document_type type: STRING -- mode: NULLABLE - name: document_version + mode: NULLABLE +- name: document_version type: STRING -- mode: NULLABLE - name: path + mode: NULLABLE +- name: path type: STRING -- mode: NULLABLE - name: notes + mode: NULLABLE +- name: notes type: STRING -- mode: NULLABLE - name: bug + mode: NULLABLE +- name: bug type: STRING + mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/monitoring_missing_document_namespaces_notes_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/monitoring_missing_document_namespaces_notes_v1/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/monitoring_missing_document_namespaces_notes_v1/schema.yaml 2024-05-31 16:15:29.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/monitoring_missing_document_namespaces_notes_v1/schema.yaml 2024-05-31 16:24:48.000000000 +0000 @@ -1,22 +1,22 @@ fields: -- mode: NULLABLE - name: start_date +- name: start_date type: DATE -- mode: NULLABLE - name: end_date + mode: NULLABLE +- name: end_date type: DATE -- mode: NULLABLE - name: document_namespace + mode: NULLABLE +- name: document_namespace type: STRING -- mode: NULLABLE - name: document_type + mode: NULLABLE +- name: document_type type: STRING -- mode: NULLABLE - name: document_version + mode: NULLABLE +- name: document_version type: STRING -- mode: NULLABLE - name: notes + mode: NULLABLE +- name: notes type: STRING -- mode: NULLABLE - name: bug + mode: NULLABLE +- name: bug type: STRING + mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/monitoring_schema_errors_notes_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/monitoring_schema_errors_notes_v1/schema.yaml --- 
/tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/monitoring_schema_errors_notes_v1/schema.yaml 2024-05-31 16:15:29.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/monitoring_schema_errors_notes_v1/schema.yaml 2024-05-31 16:24:48.000000000 +0000 @@ -1,22 +1,22 @@ fields: -- mode: NULLABLE - name: start_date +- name: start_date type: DATE -- mode: NULLABLE - name: end_date + mode: NULLABLE +- name: end_date type: DATE -- mode: NULLABLE - name: document_namespace + mode: NULLABLE +- name: document_namespace type: STRING -- mode: NULLABLE - name: document_type + mode: NULLABLE +- name: document_type type: STRING -- mode: NULLABLE - name: path + mode: NULLABLE +- name: path type: STRING -- mode: NULLABLE - name: notes + mode: NULLABLE +- name: notes type: STRING -- mode: NULLABLE - name: bug + mode: NULLABLE +- name: bug type: STRING + mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/normal_distribution/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/normal_distribution/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/normal_distribution/schema.yaml 2024-05-31 16:15:29.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/normal_distribution/schema.yaml 2024-05-31 16:24:48.000000000 +0000 @@ -1,7 +1,7 @@ fields: - name: score type: NUMERIC - mode: REQUIRED + mode: NULLABLE - name: value type: NUMERIC - mode: REQUIRED + mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/us_zip_code_prefixes_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/us_zip_code_prefixes_v1/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/us_zip_code_prefixes_v1/schema.yaml 2024-05-31 16:15:29.000000000 +0000 +++ 
/tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/us_zip_code_prefixes_v1/schema.yaml 2024-05-31 16:24:48.000000000 +0000 @@ -1,9 +1,7 @@ fields: - name: zip_code_prefix type: STRING - mode: REQUIRED - description: Three-digit US ZIP code prefix. + mode: NULLABLE - name: state_code type: STRING mode: NULLABLE - description: Two-character US state/territory code (if any). diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/desktop_engagement_clients_v1/backfill.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/desktop_engagement_clients_v1/backfill.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/desktop_engagement_clients_v1/backfill.yaml 2024-05-31 16:15:30.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/telemetry_derived/desktop_engagement_clients_v1/backfill.yaml 2024-05-31 16:15:55.000000000 +0000 @@ -4,4 +4,4 @@ reason: Loading new table; related to DENG-3186 watchers: - kwindau@mozilla.com - status: Complete + status: Initiate ```
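Many of the `schema.yaml` hunks in the report above only reorder keys (for example `mode`/`name`/`type` becoming `name`/`type`/`mode`), while others change `mode` from `REQUIRED` to `NULLABLE` or drop `description`. A minimal Python sketch for telling the two cases apart during review, assuming each field has already been parsed into a plain dict; the helper name `classify_field_change` is hypothetical and not part of bigquery-etl:

```python
def classify_field_change(old: dict, new: dict) -> str:
    """Classify a schema field change as 'unchanged' or list the changed keys.

    Dict comparison ignores key order, so a hunk that only moves `mode`
    below `name` and `type` is reported as 'unchanged'.
    """
    if old == new:
        return "unchanged"
    changed = sorted(
        key for key in old.keys() | new.keys() if old.get(key) != new.get(key)
    )
    return "changed: " + ", ".join(changed)


# Field values copied from the data_incidents_v1 hunk (keys only reordered).
reordered_old = {"mode": "NULLABLE", "name": "start_date", "type": "DATE"}
reordered_new = {"name": "start_date", "type": "DATE", "mode": "NULLABLE"}

# Field values copied from the us_zip_code_prefixes_v1 hunk.
zip_old = {
    "name": "zip_code_prefix",
    "type": "STRING",
    "mode": "REQUIRED",
    "description": "Three-digit US ZIP code prefix.",
}
zip_new = {"name": "zip_code_prefix", "type": "STRING", "mode": "NULLABLE"}

print(classify_field_change(reordered_old, reordered_new))  # unchanged
print(classify_field_change(zip_old, zip_new))  # changed: description, mode
```

Under these assumptions, the `data_incidents_v1` hunk is a key reordering only, whereas the `us_zip_code_prefixes_v1` hunk changes `mode` and loses its `description`.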

edugfilho commented 1 month ago

> Good to know that the other ones aren't too hard to convert. Looks like the sql tests are failing at least partially due to the ones I wrote. Have you tested these on the real data to see if the output looks ok and there are improvements? Getting them wrong and needing a backfill could be expensive and the unit tests aren't the most comprehensive.

I haven't done extensive testing yet. I'll test it, fix the code, and add more tests.
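One way to check the converted UDFs against the JS versions on real data is to compare their JSON output by value rather than by text, since the `histogram_cast_json` test in this PR already shows the SQL version emitting `0.0` where the JS version emitted `0`. A minimal Python sketch, assuming the two output strings have already been exported from BigQuery; the helper `same_histogram_json` and its tolerance are hypothetical:

```python
import json
import math


def same_histogram_json(js_output: str, sql_output: str, tol: float = 1e-9) -> bool:
    """Return True when both JSON objects hold the same keys and numeric values."""
    js_hist = json.loads(js_output)
    sql_hist = json.loads(sql_output)
    if js_hist.keys() != sql_hist.keys():
        return False
    return all(
        math.isclose(js_hist[key], sql_hist[key], abs_tol=tol) for key in js_hist
    )


# Expected strings from the old (JS) and new (SQL) histogram_cast_json unit tests.
js_expected = '{"0":0.1111,"1":0.6667,"2":0}'
sql_expected = '{"0":0.1111,"1":0.6667,"2":0.0}'

print(js_expected == sql_expected)  # False: the text differs ("0" vs "0.0")
print(same_histogram_json(js_expected, sql_expected))  # True: the values match
```

This only shows that the values agree. Whether downstream consumers depend on the exact string form (`0` vs `0.0`) would still need to be checked separately.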

dataops-ci-bot commented 1 month ago

Integration report for "fix percentile udf"

sql.diff

Click to expand! ```diff diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/dags/bqetl_analytics_aggregations.py /tmp/workspace/generated-sql/dags/bqetl_analytics_aggregations.py --- /tmp/workspace/main-generated-sql/dags/bqetl_analytics_aggregations.py 2024-06-06 15:53:37.000000000 +0000 +++ /tmp/workspace/generated-sql/dags/bqetl_analytics_aggregations.py 2024-06-06 16:07:50.000000000 +0000 @@ -322,30 +322,6 @@ pool="DATA_ENG_EXTERNALTASKSENSOR", ) - wait_for_checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1 = ExternalTaskSensor( - task_id="wait_for_checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1", - external_dag_id="bqetl_glean_usage", - external_task_id="klar_android.checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1", - execution_delta=datetime.timedelta(seconds=8100), - check_existence=True, - mode="reschedule", - allowed_states=ALLOWED_STATES, - failed_states=FAILED_STATES, - pool="DATA_ENG_EXTERNALTASKSENSOR", - ) - - wait_for_klar_android_derived__metrics_clients_last_seen__v1 = ExternalTaskSensor( - task_id="wait_for_klar_android_derived__metrics_clients_last_seen__v1", - external_dag_id="bqetl_glean_usage", - external_task_id="klar_android.klar_android_derived__metrics_clients_last_seen__v1", - execution_delta=datetime.timedelta(seconds=8100), - check_existence=True, - mode="reschedule", - allowed_states=ALLOWED_STATES, - failed_states=FAILED_STATES, - pool="DATA_ENG_EXTERNALTASKSENSOR", - ) - wait_for_checks__fail_org_mozilla_ios_klar_derived__baseline_clients_last_seen__v1 = ExternalTaskSensor( task_id="wait_for_checks__fail_org_mozilla_ios_klar_derived__baseline_clients_last_seen__v1", external_dag_id="bqetl_glean_usage", @@ -562,37 +538,6 @@ checks__fail_focus_ios_derived__active_users_aggregates__v3 ) - checks__fail_klar_android_derived__active_users_aggregates__v3 = bigquery_dq_check( - task_id="checks__fail_klar_android_derived__active_users_aggregates__v3", - 
source_table='active_users_aggregates_v3${{ macros.ds_format(macros.ds_add(ds, -1), "%Y-%m-%d", "%Y%m%d") }}', - dataset_id="klar_android_derived", - project_id="moz-fx-data-shared-prod", - is_dq_check_fail=True, - owner="lvargas@mozilla.com", - email=[ - "gkaberere@mozilla.com", - "lvargas@mozilla.com", - "telemetry-alerts@mozilla.com", - ], - depends_on_past=False, - parameters=["submission_date:DATE:{{macros.ds_add(ds, -1)}}"], - retries=0, - ) - - with TaskGroup( - "checks__fail_klar_android_derived__active_users_aggregates__v3_external", - ) as checks__fail_klar_android_derived__active_users_aggregates__v3_external: - ExternalTaskMarker( - task_id="bqetl_search_dashboard__wait_for_checks__fail_klar_android_derived__active_users_aggregates__v3", - external_dag_id="bqetl_search_dashboard", - external_task_id="wait_for_checks__fail_klar_android_derived__active_users_aggregates__v3", - execution_date="{{ (execution_date - macros.timedelta(days=-1, seconds=85500)).isoformat() }}", - ) - - checks__fail_klar_android_derived__active_users_aggregates__v3_external.set_upstream( - checks__fail_klar_android_derived__active_users_aggregates__v3 - ) - checks__fail_klar_ios_derived__active_users_aggregates__v3 = bigquery_dq_check( task_id="checks__fail_klar_ios_derived__active_users_aggregates__v3", source_table='active_users_aggregates_v3${{ macros.ds_format(macros.ds_add(ds, -1), "%Y-%m-%d", "%Y%m%d") }}', @@ -709,23 +654,6 @@ retries=0, ) - checks__warn_klar_android_derived__active_users_aggregates__v3 = bigquery_dq_check( - task_id="checks__warn_klar_android_derived__active_users_aggregates__v3", - source_table='active_users_aggregates_v3${{ macros.ds_format(macros.ds_add(ds, -1), "%Y-%m-%d", "%Y%m%d") }}', - dataset_id="klar_android_derived", - project_id="moz-fx-data-shared-prod", - is_dq_check_fail=False, - owner="lvargas@mozilla.com", - email=[ - "gkaberere@mozilla.com", - "lvargas@mozilla.com", - "telemetry-alerts@mozilla.com", - ], - depends_on_past=False, - 
parameters=["submission_date:DATE:{{macros.ds_add(ds, -1)}}"], - retries=0, - ) - checks__warn_klar_ios_derived__active_users_aggregates__v3 = bigquery_dq_check( task_id="checks__warn_klar_ios_derived__active_users_aggregates__v3", source_table='active_users_aggregates_v3${{ macros.ds_format(macros.ds_add(ds, -1), "%Y-%m-%d", "%Y%m%d") }}', @@ -837,22 +765,6 @@ parameters=["submission_date:DATE:{{macros.ds_add(ds, -1)}}"], ) - klar_android_active_users_aggregates = bigquery_etl_query( - task_id="klar_android_active_users_aggregates", - destination_table='active_users_aggregates_v3${{ macros.ds_format(macros.ds_add(ds, -1), "%Y-%m-%d", "%Y%m%d") }}', - dataset_id="klar_android_derived", - project_id="moz-fx-data-shared-prod", - owner="lvargas@mozilla.com", - email=[ - "gkaberere@mozilla.com", - "lvargas@mozilla.com", - "telemetry-alerts@mozilla.com", - ], - date_partition_parameter=None, - depends_on_past=False, - parameters=["submission_date:DATE:{{macros.ds_add(ds, -1)}}"], - ) - klar_ios_active_users_aggregates = bigquery_etl_query( task_id="klar_ios_active_users_aggregates", destination_table='active_users_aggregates_v3${{ macros.ds_format(macros.ds_add(ds, -1), "%Y-%m-%d", "%Y%m%d") }}', @@ -945,10 +857,6 @@ focus_ios_active_users_aggregates ) - checks__fail_klar_android_derived__active_users_aggregates__v3.set_upstream( - klar_android_active_users_aggregates - ) - checks__fail_klar_ios_derived__active_users_aggregates__v3.set_upstream( klar_ios_active_users_aggregates ) @@ -997,10 +905,6 @@ focus_ios_active_users_aggregates ) - checks__warn_klar_android_derived__active_users_aggregates__v3.set_upstream( - klar_android_active_users_aggregates - ) - checks__warn_klar_ios_derived__active_users_aggregates__v3.set_upstream( klar_ios_active_users_aggregates ) @@ -1089,14 +993,6 @@ wait_for_focus_ios_derived__metrics_clients_last_seen__v1 ) - klar_android_active_users_aggregates.set_upstream( - 
wait_for_checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1 - ) - - klar_android_active_users_aggregates.set_upstream( - wait_for_klar_android_derived__metrics_clients_last_seen__v1 - ) - klar_ios_active_users_aggregates.set_upstream( wait_for_checks__fail_org_mozilla_ios_klar_derived__baseline_clients_last_seen__v1 ) diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/dags/bqetl_desktop_conv_evnt_categorization.py /tmp/workspace/generated-sql/dags/bqetl_desktop_conv_evnt_categorization.py --- /tmp/workspace/main-generated-sql/dags/bqetl_desktop_conv_evnt_categorization.py 2024-06-06 15:53:37.000000000 +0000 +++ /tmp/workspace/generated-sql/dags/bqetl_desktop_conv_evnt_categorization.py 2024-06-06 16:07:52.000000000 +0000 @@ -76,18 +76,6 @@ pool="DATA_ENG_EXTERNALTASKSENSOR", ) - wait_for_telemetry_derived__clients_first_seen__v1 = ExternalTaskSensor( - task_id="wait_for_telemetry_derived__clients_first_seen__v1", - external_dag_id="bqetl_main_summary", - external_task_id="telemetry_derived__clients_first_seen__v1", - execution_delta=datetime.timedelta(seconds=36000), - check_existence=True, - mode="reschedule", - allowed_states=ALLOWED_STATES, - failed_states=FAILED_STATES, - pool="DATA_ENG_EXTERNALTASKSENSOR", - ) - checks__warn_google_ads_derived__conversion_event_categorization__v1 = bigquery_dq_check( task_id="checks__warn_google_ads_derived__conversion_event_categorization__v1", source_table='conversion_event_categorization_v1${{ macros.ds_format(macros.ds_add(ds, -14), "%Y-%m-%d", "%Y%m%d") }}', @@ -126,7 +114,3 @@ google_ads_derived__conversion_event_categorization__v1.set_upstream( wait_for_checks__fail_telemetry_derived__clients_last_seen__v2 ) - - google_ads_derived__conversion_event_categorization__v1.set_upstream( - wait_for_telemetry_derived__clients_first_seen__v1 - ) diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/dags/bqetl_glean_usage.py 
/tmp/workspace/generated-sql/dags/bqetl_glean_usage.py --- /tmp/workspace/main-generated-sql/dags/bqetl_glean_usage.py 2024-06-06 15:53:37.000000000 +0000 +++ /tmp/workspace/generated-sql/dags/bqetl_glean_usage.py 2024-06-06 16:07:54.000000000 +0000 @@ -1191,13 +1191,6 @@ parent_group=task_group_klar_android, ) as checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1_external: ExternalTaskMarker( - task_id="bqetl_analytics_aggregations__wait_for_checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1", - external_dag_id="bqetl_analytics_aggregations", - external_task_id="wait_for_checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1", - execution_date="{{ (execution_date - macros.timedelta(days=-1, seconds=78300)).isoformat() }}", - ) - - ExternalTaskMarker( task_id="bqetl_mobile_kpi_metrics__wait_for_checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1", external_dag_id="bqetl_mobile_kpi_metrics", external_task_id="wait_for_checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1", @@ -2504,21 +2497,6 @@ task_group=task_group_klar_android, ) - with TaskGroup( - "klar_android_derived__metrics_clients_last_seen__v1_external", - parent_group=task_group_klar_android, - ) as klar_android_derived__metrics_clients_last_seen__v1_external: - ExternalTaskMarker( - task_id="bqetl_analytics_aggregations__wait_for_klar_android_derived__metrics_clients_last_seen__v1", - external_dag_id="bqetl_analytics_aggregations", - external_task_id="wait_for_klar_android_derived__metrics_clients_last_seen__v1", - execution_date="{{ (execution_date - macros.timedelta(days=-1, seconds=78300)).isoformat() }}", - ) - - klar_android_derived__metrics_clients_last_seen__v1_external.set_upstream( - klar_android_derived__metrics_clients_last_seen__v1 - ) - klar_ios_derived__clients_last_seen_joined__v1 = bigquery_etl_query( task_id="klar_ios_derived__clients_last_seen_joined__v1", 
destination_table="clients_last_seen_joined_v1", diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/dags/bqetl_main_summary.py /tmp/workspace/generated-sql/dags/bqetl_main_summary.py --- /tmp/workspace/main-generated-sql/dags/bqetl_main_summary.py 2024-06-06 15:53:37.000000000 +0000 +++ /tmp/workspace/generated-sql/dags/bqetl_main_summary.py 2024-06-06 16:07:48.000000000 +0000 @@ -498,20 +498,6 @@ priority_weight=80, ) - with TaskGroup( - "telemetry_derived__clients_first_seen__v1_external", - ) as telemetry_derived__clients_first_seen__v1_external: - ExternalTaskMarker( - task_id="bqetl_desktop_conv_evnt_categorization__wait_for_telemetry_derived__clients_first_seen__v1", - external_dag_id="bqetl_desktop_conv_evnt_categorization", - external_task_id="wait_for_telemetry_derived__clients_first_seen__v1", - execution_date="{{ (execution_date - macros.timedelta(days=-1, seconds=50400)).isoformat() }}", - ) - - telemetry_derived__clients_first_seen__v1_external.set_upstream( - telemetry_derived__clients_first_seen__v1 - ) - telemetry_derived__clients_last_seen__v1 = bigquery_etl_query( task_id="telemetry_derived__clients_last_seen__v1", destination_table="clients_last_seen_v1", diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/dags/bqetl_search_dashboard.py /tmp/workspace/generated-sql/dags/bqetl_search_dashboard.py --- /tmp/workspace/main-generated-sql/dags/bqetl_search_dashboard.py 2024-06-06 15:53:37.000000000 +0000 +++ /tmp/workspace/generated-sql/dags/bqetl_search_dashboard.py 2024-06-06 16:07:48.000000000 +0000 @@ -121,18 +121,6 @@ pool="DATA_ENG_EXTERNALTASKSENSOR", ) - wait_for_checks__fail_klar_android_derived__active_users_aggregates__v3 = ExternalTaskSensor( - task_id="wait_for_checks__fail_klar_android_derived__active_users_aggregates__v3", - external_dag_id="bqetl_analytics_aggregations", - external_task_id="checks__fail_klar_android_derived__active_users_aggregates__v3", - 
execution_delta=datetime.timedelta(seconds=900), - check_existence=True, - mode="reschedule", - allowed_states=ALLOWED_STATES, - failed_states=FAILED_STATES, - pool="DATA_ENG_EXTERNALTASKSENSOR", - ) - wait_for_checks__fail_klar_ios_derived__active_users_aggregates__v3 = ExternalTaskSensor( task_id="wait_for_checks__fail_klar_ios_derived__active_users_aggregates__v3", external_dag_id="bqetl_analytics_aggregations", @@ -259,10 +247,6 @@ ) search_derived__search_revenue_levers_daily__v1.set_upstream( - wait_for_checks__fail_klar_android_derived__active_users_aggregates__v3 - ) - - search_derived__search_revenue_levers_daily__v1.set_upstream( wait_for_checks__fail_klar_ios_derived__active_users_aggregates__v3 ) Only in /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android: active_users_aggregates Only in /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived: active_users_aggregates_v3 diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql --- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql 2024-06-06 15:49:20.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql 2024-06-06 15:50:10.000000000 +0000 @@ -2,21 +2,24 @@ CREATE OR REPLACE FUNCTION glam.histogram_cast_json( histogram ARRAY> ) -RETURNS STRING DETERMINISTIC -LANGUAGE js -AS - ''' - let obj = {}; - histogram.map(r => { - obj[r.key] = parseFloat(r.value.toFixed(4)); - }); - return JSON.stringify(obj); -'''; +RETURNS STRING AS ( + IF( + ARRAY_LENGTH(histogram) = 0, + "{}", + TO_JSON_STRING( + JSON_OBJECT( + ARRAY(SELECT key FROM UNNEST(histogram)), + ARRAY(SELECT ROUND(value, 4) FROM UNNEST(histogram)) + ) + ) + ) +); SELECT assert.equals( - '{"0":0.1111,"1":0.6667,"2":0}', + '{"0":0.1111,"1":0.6667,"2":0.0}', glam.histogram_cast_json( ARRAY>[("0", 0.111111), ("1", 
2.0 / 3), ("2", 0)] ) - ) + ), + assert.equals('{}', glam.histogram_cast_json(ARRAY>[])), diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql --- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql 2024-06-06 15:49:20.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql 2024-06-06 15:50:10.000000000 +0000 @@ -4,34 +4,29 @@ buckets_per_magnitude INT64, range_max INT64 ) -RETURNS ARRAY DETERMINISTIC -LANGUAGE js -AS - ''' - function sample_to_bucket_index(sample) { - // Get the index of the sample - // https://github.com/mozilla/glean/blob/main/glean-core/src/histogram/functional.rs - let exponent = Math.pow(log_base, 1.0/buckets_per_magnitude); - return Math.ceil(Math.log(sample + 1) / Math.log(exponent)); - } - - let buckets = new Set([0]); - for (let index = 0; index < sample_to_bucket_index(range_max); index++) { - - // Avoid re-using the exponent due to floating point issues when carrying - // the `pow` operation e.g. `let exponent = ...; Math.pow(exponent, index)`. - let bucket = Math.floor(Math.pow(log_base, index/buckets_per_magnitude)); - - // NOTE: the sample_to_bucket_index implementation overshoots the true index, - // so we break out early if we hit the max bucket range. 
- if (bucket > range_max) { - break; - } - buckets.add(bucket); - } - - return [...buckets] -'''; +RETURNS ARRAY AS ( + ( + WITH bucket_indexes AS ( + -- Generate all bucket indexes + -- https://github.com/mozilla/glean/blob/main/glean-core/src/histogram/functional.rs + SELECT + GENERATE_ARRAY(0, CEIL(LOG(range_max + 1, log_base) * buckets_per_magnitude)) AS indexes + ), + buckets AS ( + SELECT + FLOOR(POW(log_base, (idx) / buckets_per_magnitude)) AS bucket + FROM + bucket_indexes, + UNNEST(indexes) AS idx + WHERE + FLOOR(POW(log_base, (idx) / buckets_per_magnitude)) <= range_max + ) + SELECT + ARRAY_CONCAT([0.0], ARRAY_AGG(DISTINCT(bucket) ORDER BY bucket)) + FROM + buckets + ) +); SELECT -- First 50 keys of a timing distribution diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql --- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql 2024-06-06 15:49:20.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql 2024-06-06 15:50:10.000000000 +0000 @@ -4,17 +4,17 @@ max FLOAT64, nBuckets FLOAT64 ) -RETURNS ARRAY DETERMINISTIC -LANGUAGE js -AS - ''' - let result = [0]; - for (let i = 1; i < Math.min(nBuckets, max, 10000); i++) { - let linearRange = (min * (nBuckets - 1 - i) + max * (i - 1)) / (nBuckets - 2); - result.push(Math.round(linearRange)); - } - return result; -'''; +RETURNS ARRAY AS ( + ARRAY_CONCAT( + [0.0], + ARRAY( + SELECT + ROUND((min * (nBuckets - 1 - i) + max * (i - 1)) / (nBuckets - 2)) + FROM + UNNEST(GENERATE_ARRAY(1, LEAST(nBuckets - 1, max, 10000))) AS i + ) + ) +); SELECT -- Buckets of CONTENT_FRAME_TIME_VSYNC diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql 
/tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql --- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql 2024-06-06 15:49:20.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql 2024-06-06 15:50:10.000000000 +0000 @@ -4,17 +4,14 @@ max_bucket FLOAT64, num_buckets INT64 ) -RETURNS ARRAY DETERMINISTIC -LANGUAGE js -AS - ''' - let bucket_size = (max_bucket - min_bucket) / num_buckets; - let buckets = new Set(); - for (let bucket = min_bucket; bucket < max_bucket; bucket += bucket_size) { - buckets.add(Math.pow(2, bucket).toFixed(2)); - } - return Array.from(buckets); -'''; +RETURNS ARRAY AS ( + ARRAY( + SELECT + ROUND(POW(2, (max_bucket - min_bucket) / num_buckets * val), 2) + FROM + UNNEST(GENERATE_ARRAY(0, num_buckets - 1)) AS val + ) +); SELECT assert.array_equals([1, 2, 4, 8], glam.histogram_generate_scalar_buckets(0, LOG(16, 2), 4)), diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/percentile/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/percentile/udf.sql --- /tmp/workspace/main-generated-sql/sql/mozfun/glam/percentile/udf.sql 2024-06-06 15:49:20.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/mozfun/glam/percentile/udf.sql 2024-06-06 15:50:10.000000000 +0000 @@ -1,37 +1,41 @@ -- udf_js.glean_percentile CREATE OR REPLACE FUNCTION glam.percentile( - percentile FLOAT64, + pct FLOAT64, histogram ARRAY>, type STRING ) -RETURNS FLOAT64 DETERMINISTIC -LANGUAGE js -AS - ''' - if (percentile < 0 || percentile > 100) { - throw "percentile must be a value between 0 and 100"; - } - - let values = histogram.map(bucket => bucket.value); - let total = values.reduce((a, b) => a + b); - let normalized = values.map(value => value / total); - - // Find the index into the cumulative distribution function that corresponds - // to the percentile. This undershoots the true value of the percentile. 
- let acc = 0; - let index = null; - for (let i = 0; i < normalized.length; i++) { - acc += normalized[i]; - index = i; - if (acc >= percentile / 100) { - break; - } - } - - // NOTE: we do not perform geometric or linear interpolation, but this would - // be the place to implement it. - return histogram[index].key; -'''; +RETURNS FLOAT64 AS ( + ( + WITH check AS ( + SELECT + IF( + pct >= 0 + AND pct <= 100, + TRUE, + ERROR('percentile must be a value between 0 and 100') + ) pct_ok + ), + keyed_cum_sum AS ( + SELECT + key, + SUM(value) OVER (ORDER BY CAST(key AS FLOAT64)) / SUM(value) OVER () AS cum_sum + FROM + UNNEST(histogram) + ) + SELECT + CAST(key AS FLOAT64) + FROM + keyed_cum_sum, + check + WHERE + check.pct_ok + AND cum_sum >= pct / 100 + ORDER BY + cum_sum + LIMIT + 1 + ) +); SELECT assert.equals( @@ -41,6 +45,30 @@ ARRAY>[("0", 1), ("2", 2), ("3", 1)], "timing_distribution" ) + ), + assert.equals( + 3, + glam.percentile( + 100.0, + ARRAY>[("0", 1), ("2", 2), ("3", 1)], + "timing_distribution" + ) + ), + assert.equals( + 0, + glam.percentile( + 0.0, + ARRAY>[("0", 1), ("2", 2), ("3", 1)], + "timing_distribution" + ) + ), + assert.equals( + 2, + glam.percentile( + 2.0, + ARRAY>[("0", 1), ("2", 2), ("10", 10), ("11", 100)], + "timing_distribution" + ) ); #xfail diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/event_monitoring_live_v1/materialized_view.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 15:50:01.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 15:52:08.000000000 +0000 @@ -50,7 +50,7 @@ client_info.app_display_version AS version, ping_info FROM - 
`moz-fx-data-shared-prod.bedrock_live.interaction_v1` + `moz-fx-data-shared-prod.bedrock_live.events_v1` UNION ALL SELECT submission_timestamp, @@ -60,7 +60,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.bedrock_live.non_interaction_v1` + `moz-fx-data-shared-prod.bedrock_live.interaction_v1` UNION ALL SELECT submission_timestamp, @@ -70,7 +70,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.bedrock_live.events_v1` + `moz-fx-data-shared-prod.bedrock_live.non_interaction_v1` ) CROSS JOIN UNNEST(events) AS event, diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates/schema.yaml 2024-06-06 15:49:19.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates/schema.yaml 2024-06-06 15:58:07.000000000 +0000 @@ -1,49 +1,49 @@ fields: -- mode: NULLABLE - name: submission_date +- name: submission_date type: DATE -- mode: NULLABLE - name: source + mode: NULLABLE +- name: source type: STRING -- mode: NULLABLE - name: event_type + mode: NULLABLE +- name: event_type type: STRING -- mode: NULLABLE - name: form_factor + mode: NULLABLE +- name: form_factor type: STRING -- mode: NULLABLE - name: country + mode: NULLABLE +- name: country type: STRING -- mode: NULLABLE - name: subdivision1 + mode: NULLABLE +- name: subdivision1 type: STRING -- mode: NULLABLE - name: advertiser + mode: NULLABLE +- name: advertiser type: STRING -- mode: NULLABLE - name: release_channel + mode: NULLABLE +- name: release_channel type: STRING -- mode: NULLABLE - name: position + mode: NULLABLE +- name: position type: INTEGER -- mode: NULLABLE - name: provider + mode: 
NULLABLE +- name: provider type: STRING -- mode: NULLABLE - name: match_type + mode: NULLABLE +- name: match_type type: STRING -- mode: NULLABLE - name: normalized_os + mode: NULLABLE +- name: normalized_os type: STRING -- mode: NULLABLE - name: suggest_data_sharing_enabled + mode: NULLABLE +- name: suggest_data_sharing_enabled type: BOOLEAN -- mode: NULLABLE - name: event_count + mode: NULLABLE +- name: event_count type: INTEGER -- mode: NULLABLE - name: user_count + mode: NULLABLE +- name: user_count type: INTEGER -- mode: NULLABLE - name: query_type + mode: NULLABLE +- name: query_type type: STRING + mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates_suggest/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates_suggest/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates_suggest/schema.yaml 2024-06-06 15:49:19.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates_suggest/schema.yaml 2024-06-06 15:58:07.000000000 +0000 @@ -1,40 +1,40 @@ fields: -- mode: NULLABLE - name: submission_date +- name: submission_date type: DATE -- mode: NULLABLE - name: form_factor + mode: NULLABLE +- name: form_factor type: STRING -- mode: NULLABLE - name: country + mode: NULLABLE +- name: country type: STRING -- mode: NULLABLE - name: advertiser + mode: NULLABLE +- name: advertiser type: STRING -- mode: NULLABLE - name: normalized_os + mode: NULLABLE +- name: normalized_os type: STRING -- mode: NULLABLE - name: release_channel + mode: NULLABLE +- name: release_channel type: STRING -- mode: NULLABLE - name: position + mode: NULLABLE +- name: position type: INTEGER -- mode: NULLABLE - name: provider + mode: NULLABLE +- name: provider type: STRING -- mode: NULLABLE - name: match_type + mode: NULLABLE +- name: 
match_type type: STRING -- mode: NULLABLE - name: suggest_data_sharing_enabled + mode: NULLABLE +- name: suggest_data_sharing_enabled type: BOOLEAN -- mode: NULLABLE - name: impression_count + mode: NULLABLE +- name: impression_count type: INTEGER -- mode: NULLABLE - name: click_count + mode: NULLABLE +- name: click_count type: INTEGER -- mode: NULLABLE - name: query_type + mode: NULLABLE +- name: query_type type: STRING + mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_clients/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_clients/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_clients/schema.yaml 2024-06-06 15:49:19.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_clients/schema.yaml 2024-06-06 15:58:11.000000000 +0000 @@ -26,6 +26,9 @@ - name: adjust_network type: STRING mode: NULLABLE +- name: install_source + type: STRING + mode: NULLABLE - name: retained_week_2 type: BOOLEAN mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_week_4/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_week_4/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_week_4/schema.yaml 2024-06-06 15:49:19.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_week_4/schema.yaml 2024-06-06 15:58:11.000000000 +0000 @@ -48,6 +48,10 @@ description: 'The type of source of a client installation. 
' +- name: install_source + type: STRING + mode: NULLABLE + description: null - name: new_profiles type: INTEGER mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_tasks_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_tasks_derived/event_monitoring_live_v1/materialized_view.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_tasks_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 15:50:02.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_tasks_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 15:52:09.000000000 +0000 @@ -50,7 +50,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_background_tasks_live.background_tasks_v1` + `moz-fx-data-shared-prod.firefox_desktop_background_tasks_live.events_v1` UNION ALL SELECT submission_timestamp, @@ -60,7 +60,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_background_tasks_live.events_v1` + `moz-fx-data-shared-prod.firefox_desktop_background_tasks_live.background_tasks_v1` ) CROSS JOIN UNNEST(events) AS event, diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/event_monitoring_live_v1/materialized_view.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 15:50:01.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/event_monitoring_live_v1/materialized_view.sql 
2024-06-06 15:52:09.000000000 +0000 @@ -50,7 +50,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_live.newtab_v1` + `moz-fx-data-shared-prod.firefox_desktop_live.urlbar_potential_exposure_v1` UNION ALL SELECT submission_timestamp, @@ -70,7 +70,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_live.urlbar_potential_exposure_v1` + `moz-fx-data-shared-prod.firefox_desktop_live.events_v1` UNION ALL SELECT submission_timestamp, @@ -80,7 +80,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_live.events_v1` + `moz-fx-data-shared-prod.firefox_desktop_live.newtab_v1` ) CROSS JOIN UNNEST(events) AS event, diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/google_ads/conversion_event_categorization/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/google_ads/conversion_event_categorization/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/google_ads/conversion_event_categorization/schema.yaml 2024-06-06 15:50:01.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/google_ads/conversion_event_categorization/schema.yaml 2024-06-06 15:58:36.000000000 +0000 @@ -31,38 +31,27 @@ type: DATE mode: NULLABLE description: Report Date -- name: sent_main_ping_in_first_7_days - type: BOOLEAN +- name: first_main_ping_date + type: DATE mode: NULLABLE - description: Sent Main Ping In First 7 Days After First Seen Date Indicator - name: country type: STRING mode: NULLABLE - description: Country - name: dou type: INTEGER mode: NULLABLE - description: DOU - name: active_hours_sum type: FLOAT mode: NULLABLE - description: Active Hours Sum - name: search_with_ads_count_all type: INTEGER mode: NULLABLE - description: Search With Ads Count All - name: event_1 type: BOOLEAN mode: NULLABLE - description: Event 1 
Indicator - 5 or more days of use and 1 or more search with - ads (strictest event) - name: event_2 type: BOOLEAN mode: NULLABLE - description: Event 2 Indicator - 3 or more days of use and 1 or more search with - ads (medium event) - name: event_3 type: BOOLEAN mode: NULLABLE - description: Event 3 Indicator - 3 or more days of use and 0.4 or more active hours - (most lenient event) diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/conversion_event_categorization_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/conversion_event_categorization_v1/query.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/conversion_event_categorization_v1/query.sql 2024-06-06 15:49:20.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/conversion_event_categorization_v1/query.sql 2024-06-06 15:50:10.000000000 +0000 @@ -3,35 +3,27 @@ --Note: Max cohort date cannot be more than 7 days ago (to ensure we always have at least 7 days of data) WITH clients_first_seen_14_days_ago AS ( SELECT - cfs.client_id, - cfs.first_seen_date, - m.first_seen_date AS first_main_ping_date, - cfs.country, - cfs.attribution_campaign, - cfs.attribution_content, - cfs.attribution_dltoken, - cfs.attribution_medium, - cfs.attribution_source + client_id, + first_seen_date, + country, + attribution_campaign, + attribution_content, + attribution_dltoken, + attribution_medium, + attribution_source FROM - `moz-fx-data-shared-prod.telemetry.clients_first_seen` cfs --contains all new clients, including those that never sent a main ping - LEFT JOIN - `moz-fx-data-shared-prod.telemetry_derived.clients_first_seen_v1` m -- the "old" CFS table, contains the date of the client's *first main ping* - ON cfs.client_id = m.client_id - AND m.first_seen_date - -- join so that we only get "first main ping" dates from clients that sent their 
first main ping within -1 and +6 days from their first_seen_date. - -- we will miss ~5% of clients that send their first main ping later, this is a trade-off we make to have a two-week reporting cadence (one week to send their first main ping, then we report on the outcomes *one week after that* - BETWEEN DATE_SUB(cfs.first_seen_date, INTERVAL 1 DAY) - AND DATE_ADD(cfs.first_seen_date, INTERVAL 6 DAY) + `moz-fx-data-shared-prod.telemetry.clients_first_seen` --contains all new clients, including those that never sent a main ping WHERE - cfs.first_seen_date = @report_date --this is 14 days before {{ds}} - AND cfs.first_seen_date >= '2023-11-01' + first_seen_date = @report_date --this is 14 days before {{ds}} + AND first_seen_date + BETWEEN '2023-11-01' + AND DATE_SUB(CURRENT_DATE, INTERVAL 8 DAY) ), --Step 2: Get only the columns we need from clients last seen, for only the small window of time we need clients_last_seen_raw AS ( SELECT cls.client_id, cls.first_seen_date, - clients.first_main_ping_date, cls.country, cls.submission_date, cls.days_since_seen, @@ -44,10 +36,15 @@ JOIN clients_first_seen_14_days_ago clients ON cls.client_id = clients.client_id + WHERE + cls.submission_date >= '2023-11-01' --first cohort date + AND cls.submission_date + BETWEEN cls.first_seen_date + AND DATE_ADD(cls.first_seen_date, INTERVAL 6 DAY) --get first 7 days from their first main ping + --to process less data, we only check for pings between @submission date - 15 days and submission date + 15 days for each date this runs AND cls.submission_date - -- join the clients_last_seen so that we get the first 7 days of each client's main ping records (for the clients that sent > 0 main pings in their first week) - BETWEEN clients.first_main_ping - AND DATE_ADD(clients.first_main_ping, INTERVAL 6 DAY) + BETWEEN DATE_SUB(@report_date, INTERVAL 1 DAY) --15 days before DS + AND DATE_ADD(@report_date, INTERVAL 29 DAY) --15 days after DS ), --STEP 2: For every client, get the first 7 days worth 
of main pings sent after their first main ping client_activity_first_7_days AS ( @@ -58,13 +55,13 @@ ) AS first_seen_date, --date we got first main ping (potentially different than above first seen date) ANY_VALUE( CASE - WHEN first_main_ping_date = submission_date + WHEN first_seen_date = submission_date THEN country END ) AS country, --any country from their first day in clients_last_seen ANY_VALUE( CASE - WHEN first_main_ping_date = DATE_ADD(first_seen_date, INTERVAL 6 DAY) + WHEN submission_date = DATE_ADD(first_seen_date, INTERVAL 6 DAY) THEN BIT_COUNT(days_visited_1_uri_bits & days_interacted_bits) END ) AS dou, --total # of days of activity during their first 7 days of main pings @@ -98,7 +95,7 @@ cfs.attribution_dltoken, cfs.attribution_medium, cfs.attribution_source, - cfs.first_main_ping_date, + IF(cls.first_seen_date IS NOT NULL, TRUE, FALSE) AS sent_main_ping_in_first_7_days, COALESCE( cls.country, cfs.country @@ -121,7 +118,7 @@ attribution_medium, attribution_source, @submission_date AS report_date, - first_main_ping_date, + sent_main_ping_in_first_7_days, country, dou, active_hours_sum, diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/conversion_event_categorization_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/conversion_event_categorization_v1/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/conversion_event_categorization_v1/schema.yaml 2024-06-06 15:49:20.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/conversion_event_categorization_v1/schema.yaml 2024-06-06 15:50:10.000000000 +0000 @@ -32,9 +32,9 @@ type: DATE description: Report Date - mode: NULLABLE - name: first_main_ping_date - type: DATE - description: First Main Ping Date + name: sent_main_ping_in_first_7_days + type: BOOLEAN + description: Sent Main Ping In First 7 Days After 
First Seen Date Indicator - mode: NULLABLE name: country type: STRING diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/internet_outages/global_outages_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/internet_outages/global_outages_v1/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/internet_outages/global_outages_v1/schema.yaml 2024-06-06 15:49:19.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/internet_outages/global_outages_v1/schema.yaml 2024-06-06 15:58:38.000000000 +0000 @@ -1,49 +1,49 @@ fields: -- mode: NULLABLE - name: country +- name: country type: STRING -- mode: NULLABLE - name: city + mode: NULLABLE +- name: city type: STRING -- mode: NULLABLE - name: datetime + mode: NULLABLE +- name: datetime type: TIMESTAMP -- mode: NULLABLE - name: proportion_undefined + mode: NULLABLE +- name: proportion_undefined type: FLOAT -- mode: NULLABLE - name: proportion_timeout + mode: NULLABLE +- name: proportion_timeout type: FLOAT -- mode: NULLABLE - name: proportion_abort + mode: NULLABLE +- name: proportion_abort type: FLOAT -- mode: NULLABLE - name: proportion_unreachable + mode: NULLABLE +- name: proportion_unreachable type: FLOAT -- mode: NULLABLE - name: proportion_terminated + mode: NULLABLE +- name: proportion_terminated type: FLOAT -- mode: NULLABLE - name: proportion_channel_open + mode: NULLABLE +- name: proportion_channel_open type: FLOAT -- mode: NULLABLE - name: avg_dns_success_time + mode: NULLABLE +- name: avg_dns_success_time type: FLOAT -- mode: NULLABLE - name: missing_dns_success + mode: NULLABLE +- name: missing_dns_success type: FLOAT -- mode: NULLABLE - name: avg_dns_failure_time + mode: NULLABLE +- name: avg_dns_failure_time type: FLOAT -- mode: NULLABLE - name: missing_dns_failure + mode: NULLABLE +- name: missing_dns_failure type: FLOAT -- mode: NULLABLE - name: count_dns_failure + mode: NULLABLE +- name: 
count_dns_failure type: FLOAT -- mode: NULLABLE - name: ssl_error_prop + mode: NULLABLE +- name: ssl_error_prop type: FLOAT -- mode: NULLABLE - name: avg_tls_handshake_time + mode: NULLABLE +- name: avg_tls_handshake_time type: FLOAT + mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android/active_users_aggregates/metadata.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android/active_users_aggregates/metadata.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android/active_users_aggregates/metadata.yaml 2024-06-06 15:50:01.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android/active_users_aggregates/metadata.yaml 1970-01-01 00:00:00.000000000 +0000 @@ -1,13 +0,0 @@ -friendly_name: Active Users Aggregates -description: |- - Please provide a description for the query -owners: [] -labels: {} -bigquery: null -workgroup_access: -- role: roles/bigquery.dataViewer - members: - - workgroup:mozilla-confidential -references: - view.sql: - - moz-fx-data-shared-prod.klar_android_derived.active_users_aggregates_v3 diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android/active_users_aggregates/view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android/active_users_aggregates/view.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android/active_users_aggregates/view.sql 2024-06-06 15:50:01.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android/active_users_aggregates/view.sql 1970-01-01 00:00:00.000000000 +0000 @@ -1,15 +0,0 @@ ---- User-facing view. Generated via sql_generators.active_users. 
-CREATE OR REPLACE VIEW - `moz-fx-data-shared-prod.klar_android.active_users_aggregates` -AS -SELECT - * EXCEPT (app_version, app_name), - app_name, - app_version, - `mozfun.norm.browser_version_info`(app_version).major_version AS app_version_major, - `mozfun.norm.browser_version_info`(app_version).minor_version AS app_version_minor, - `mozfun.norm.browser_version_info`(app_version).patch_revision AS app_version_patch_revision, - `mozfun.norm.browser_version_info`(app_version).is_major_release AS app_version_is_major_release, - `mozfun.norm.os`(os) AS os_grouped -FROM - `moz-fx-data-shared-prod.klar_android_derived.active_users_aggregates_v3` diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/checks.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/checks.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/checks.sql 2024-06-06 15:50:02.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/checks.sql 1970-01-01 00:00:00.000000000 +0000 @@ -1,69 +0,0 @@ - - -#warn -WITH daily_users_sum AS ( - SELECT - SUM(daily_users), - FROM - `{{ project_id }}.{{ dataset_id }}.{{ table_name }}` - WHERE - submission_date = @submission_date - ), -distinct_client_count_base AS ( - SELECT - COUNT(DISTINCT client_info.client_id) AS distinct_client_count, - FROM - `moz-fx-data-shared-prod.org_mozilla_klar_live.baseline_v1` - WHERE - DATE(submission_timestamp) = @submission_date - ), -distinct_client_count AS ( - SELECT - SUM(distinct_client_count) - FROM - distinct_client_count_base -) -SELECT - IF( - ABS((SELECT * FROM daily_users_sum) - (SELECT * FROM distinct_client_count)) > 10, - ERROR( - CONCAT( - "Daily users mismatch between the klar_android live across all channels 
(`moz-fx-data-shared-prod.org_mozilla_klar_live.baseline_v1`,) and active_users_aggregates (`{{ dataset_id }}.{{ table_name }}`) tables is greater than 10.", - " Live table count: ", - (SELECT * FROM distinct_client_count), - " | active_users_aggregates (daily_users): ", - (SELECT * FROM daily_users_sum), - " | Delta detected: ", - ABS((SELECT * FROM daily_users_sum) - (SELECT * FROM distinct_client_count)) - ) - ), - NULL - ); - -#fail -WITH dau_current AS ( - SELECT - SUM(dau) AS dau - FROM - `{{ project_id }}.{{ dataset_id }}.{{ table_name }}` - WHERE - submission_date = @submission_date -), -dau_previous AS ( - SELECT - SUM(dau) AS dau - FROM - `{{ project_id }}.{{ dataset_id }}.{{ table_name }}` - WHERE - submission_date = DATE_SUB(@submission_date, INTERVAL 1 DAY) -) -SELECT - IF( - ABS((SELECT SUM(dau) FROM dau_current) / (SELECT SUM(dau) FROM dau_previous)) > 1.5, - ERROR( - "Current date's DAU is 50% higher than in previous date. See source table (`{{ project_id }}.{{ dataset_id }}.{{ table_name }}`)!" - ), - NULL - ); - - diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/metadata.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/metadata.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/metadata.yaml 2024-06-06 15:50:02.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/metadata.yaml 1970-01-01 00:00:00.000000000 +0000 @@ -1,54 +0,0 @@ -friendly_name: Klar Android Active Users Aggregates -description: |- - This table contains daily/weekly/monthly active users, - new profiles, searches and ad_clicks for Klar Android, - aggregated by submission_date, attribution, channel, - country, city, device model, distribution_id, os details - and activity segment. 
- - - dau is counting the users who reported a ping on the date and - are qualified as active users. - - daily_users counts all the users who reported a ping on the date. - Only dau is exposed in the view telemetry.active_users_aggregates. - - The table is labeled as "change_controlled", which implies - that changes require the approval of at least one owner. - - Proposal: - https://docs.google.com/document/d/1qvWO49Lr_Z_WErh3I3058A3B1YuiuURx19K3aTdmejM/edit?usp=sharing -owners: -- lvargas@mozilla.com -- mozilla/kpi_table_reviewers -labels: - incremental: true - change_controlled: true - dag: bqetl_analytics_aggregations - owner1: lvargas -scheduling: - dag_name: bqetl_analytics_aggregations - task_name: klar_android_active_users_aggregates - date_partition_offset: -1 -bigquery: - time_partitioning: - type: day - field: submission_date - require_partition_filter: true - expiration_days: null - range_partitioning: null - clustering: - fields: - - country - - app_name - - attribution_medium - - channel -workgroup_access: -- role: roles/bigquery.dataViewer - members: - - workgroup:mozilla-confidential -references: - checks.sql: - - .. 
- - moz-fx-data-shared-prod.org_mozilla_klar_live.baseline_v1 - query.sql: - - moz-fx-data-shared-prod.klar_android.active_users - - moz-fx-data-shared-prod.klar_android.metrics_clients_last_seen diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/query.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/query.sql 2024-06-06 15:50:02.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/query.sql 1970-01-01 00:00:00.000000000 +0000 @@ -1,170 +0,0 @@ ---- Query generated via sql_generators.active_users. -WITH baseline AS ( - SELECT - submission_date, - normalized_channel, - client_id, - days_active_bits, - days_created_profile_bits, - normalized_os, - normalized_os_version, - locale, - city, - country, - app_display_version, - device_model, - first_seen_date, - submission_date = first_seen_date AS is_new_profile, - CAST(NULL AS string) AS distribution_id, - isp, - app_name, - activity_segment AS segment, - is_daily_user, - is_weekly_user, - is_monthly_user, - is_dau, - is_wau, - is_mau - FROM - `moz-fx-data-shared-prod.klar_android.active_users` - WHERE - submission_date = @submission_date -), -metrics AS ( - -- Metrics ping may arrive in the same or next day as the baseline ping. 
- SELECT - client_id, - ARRAY_AGG(normalized_channel IGNORE NULLS ORDER BY submission_date ASC)[ - SAFE_OFFSET(0) - ] AS normalized_channel, - CAST(NULL AS INTEGER) AS uri_count, - CAST(NULL AS INTEGER) AS is_default_browser, - FROM - `moz-fx-data-shared-prod.klar_android.metrics_clients_last_seen` - WHERE - DATE(submission_date) - BETWEEN @submission_date - AND DATE_ADD(@submission_date, INTERVAL 1 DAY) - GROUP BY - client_id -), -unioned AS ( - SELECT - baseline.client_id, - baseline.segment, - baseline.app_name, - baseline.app_display_version AS app_version, - baseline.normalized_channel, - IFNULL(baseline.country, '??') country, - baseline.city, - baseline.days_created_profile_bits, - baseline.device_model, - baseline.isp, - baseline.is_new_profile, - baseline.locale, - baseline.first_seen_date, - baseline.normalized_os, - baseline.normalized_os_version, - COALESCE( - SAFE_CAST(NULLIF(SPLIT(baseline.normalized_os_version, ".")[SAFE_OFFSET(0)], "") AS INTEGER), - 0 - ) AS os_version_major, - COALESCE( - SAFE_CAST(NULLIF(SPLIT(baseline.normalized_os_version, ".")[SAFE_OFFSET(1)], "") AS INTEGER), - 0 - ) AS os_version_minor, - COALESCE( - SAFE_CAST(NULLIF(SPLIT(baseline.normalized_os_version, ".")[SAFE_OFFSET(2)], "") AS INTEGER), - 0 - ) AS os_version_patch, - baseline.submission_date, - metrics.uri_count, - metrics.is_default_browser, - baseline.distribution_id, - CAST(NULL AS string) AS attribution_content, - CAST(NULL AS string) AS attribution_source, - CAST(NULL AS string) AS attribution_medium, - CAST(NULL AS string) AS attribution_campaign, - CAST(NULL AS string) AS attribution_experiment, - CAST(NULL AS string) AS attribution_variation, - CAST(NULL AS FLOAT64) AS active_hours_sum, - is_daily_user, - is_weekly_user, - is_monthly_user, - is_dau, - is_wau, - is_mau - FROM - baseline - LEFT JOIN - metrics - ON baseline.client_id = metrics.client_id - AND baseline.normalized_channel IS NOT DISTINCT FROM metrics.normalized_channel -), -unioned_with_attribution 
AS ( - SELECT - unioned.*, - CAST(NULL AS STRING) AS install_source, - CAST(NULL AS STRING) AS adjust_network - FROM - unioned -), -todays_metrics AS ( - SELECT - segment, - app_version, - attribution_medium, - attribution_source, - attribution_medium IS NOT NULL - OR attribution_source IS NOT NULL AS attributed, - city, - country, - distribution_id, - EXTRACT(YEAR FROM first_seen_date) AS first_seen_year, - is_default_browser, - COALESCE(REGEXP_EXTRACT(locale, r'^(.+?)-'), locale, NULL) AS locale, - app_name AS app_name, - normalized_channel AS channel, - normalized_os AS os, - normalized_os_version AS os_version, - os_version_major, - os_version_minor, - submission_date, - client_id, - uri_count, - active_hours_sum, - adjust_network, - install_source, - is_daily_user, - is_weekly_user, - is_monthly_user, - is_dau, - is_wau, - is_mau - FROM - unioned_with_attribution -) -SELECT - todays_metrics.* EXCEPT ( - client_id, - is_daily_user, - is_weekly_user, - is_monthly_user, - is_dau, - is_wau, - is_mau, - uri_count, - active_hours_sum - ), - COUNTIF(is_daily_user) AS daily_users, - COUNTIF(is_weekly_user) AS weekly_users, - COUNTIF(is_monthly_user) AS monthly_users, - COUNTIF(is_dau) AS dau, - COUNTIF(is_wau) AS wau, - COUNTIF(is_mau) AS mau, - SUM(uri_count) AS uri_count, - SUM(active_hours_sum) AS active_hours, -FROM - todays_metrics -GROUP BY - ALL diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/schema.yaml 2024-06-06 15:50:02.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/schema.yaml 1970-01-01 00:00:00.000000000 +0000 @@ -1,85 +0,0 @@ -fields: -- 
name: segment - type: STRING - mode: NULLABLE -- name: app_version - type: STRING - mode: NULLABLE -- name: attribution_medium - type: STRING - mode: NULLABLE -- name: attribution_source - type: STRING - mode: NULLABLE -- name: attributed - type: BOOLEAN - mode: NULLABLE -- name: city - type: STRING - mode: NULLABLE -- name: country - type: STRING - mode: NULLABLE -- name: distribution_id - type: STRING - mode: NULLABLE -- name: first_seen_year - type: INTEGER - mode: NULLABLE -- name: is_default_browser - type: BOOLEAN - mode: NULLABLE -- name: locale - type: STRING - mode: NULLABLE -- name: app_name - type: STRING - mode: NULLABLE -- name: channel - type: STRING - mode: NULLABLE -- name: os - type: STRING - mode: NULLABLE -- name: os_version - type: STRING - mode: NULLABLE -- name: os_version_major - type: INTEGER - mode: NULLABLE -- name: os_version_minor - type: INTEGER - mode: NULLABLE -- name: submission_date - type: DATE - mode: NULLABLE -- name: adjust_network - type: STRING - mode: NULLABLE -- name: install_source - type: STRING - mode: NULLABLE -- name: daily_users - type: INTEGER - mode: NULLABLE -- name: weekly_users - type: INTEGER - mode: NULLABLE -- name: monthly_users - type: INTEGER - mode: NULLABLE -- name: dau - type: INTEGER - mode: NULLABLE -- name: wau - type: INTEGER - mode: NULLABLE -- name: mau - type: INTEGER - mode: NULLABLE -- name: uri_count - type: INTEGER - mode: NULLABLE -- name: active_hours - type: FLOAT64 - mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/monitoring_derived/event_monitoring_aggregates_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/monitoring_derived/event_monitoring_aggregates_v1/query.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/monitoring_derived/event_monitoring_aggregates_v1/query.sql 2024-06-06 15:50:01.000000000 +0000 +++ 
/tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/monitoring_derived/event_monitoring_aggregates_v1/query.sql 2024-06-06 15:53:44.000000000 +0000 @@ -45,7 +45,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_stable.newtab_v1` + `moz-fx-data-shared-prod.firefox_desktop_stable.urlbar_potential_exposure_v1` UNION ALL SELECT submission_timestamp, @@ -65,7 +65,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_stable.urlbar_potential_exposure_v1` + `moz-fx-data-shared-prod.firefox_desktop_stable.events_v1` UNION ALL SELECT submission_timestamp, @@ -75,7 +75,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_stable.events_v1` + `moz-fx-data-shared-prod.firefox_desktop_stable.newtab_v1` ) CROSS JOIN UNNEST(events) AS event, @@ -572,7 +572,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefox_stable.first_session_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefox_stable.events_v1` UNION ALL SELECT submission_timestamp, @@ -582,7 +582,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefox_stable.metrics_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefox_stable.first_session_v1` UNION ALL SELECT submission_timestamp, @@ -592,7 +592,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefox_stable.events_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefox_stable.metrics_v1` ) CROSS JOIN UNNEST(events) AS event, @@ -663,7 +663,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_stable.first_session_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_stable.events_v1` UNION ALL SELECT submission_timestamp, @@ -673,7 +673,7 @@ 
client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_stable.metrics_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_stable.first_session_v1` UNION ALL SELECT submission_timestamp, @@ -683,7 +683,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_stable.events_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_stable.metrics_v1` ) CROSS JOIN UNNEST(events) AS event, @@ -754,7 +754,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_fennec_stable.first_session_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_fennec_stable.events_v1` UNION ALL SELECT submission_timestamp, @@ -764,7 +764,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_fennec_stable.metrics_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_fennec_stable.first_session_v1` UNION ALL SELECT submission_timestamp, @@ -774,7 +774,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_fennec_stable.events_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_fennec_stable.metrics_v1` ) CROSS JOIN UNNEST(events) AS event, @@ -1555,7 +1555,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.mozillavpn_stable.main_v1` + `moz-fx-data-shared-prod.mozillavpn_stable.daemonsession_v1` UNION ALL SELECT submission_timestamp, @@ -1575,7 +1575,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.mozillavpn_stable.daemonsession_v1` + `moz-fx-data-shared-prod.mozillavpn_stable.main_v1` ) CROSS JOIN UNNEST(events) AS event, @@ -1646,7 +1646,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_firefox_vpn_stable.main_v1` + `moz-fx-data-shared-prod.org_mozilla_firefox_vpn_stable.daemonsession_v1` UNION ALL 
SELECT submission_timestamp, @@ -1666,7 +1666,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_firefox_vpn_stable.daemonsession_v1` + `moz-fx-data-shared-prod.org_mozilla_firefox_vpn_stable.main_v1` ) CROSS JOIN UNNEST(events) AS event, @@ -1737,7 +1737,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_stable.main_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_stable.daemonsession_v1` UNION ALL SELECT submission_timestamp, @@ -1757,7 +1757,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_stable.daemonsession_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_stable.main_v1` ) CROSS JOIN UNNEST(events) AS event, @@ -1828,7 +1828,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_network_extension_stable.main_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_network_extension_stable.daemonsession_v1` UNION ALL SELECT submission_timestamp, @@ -1848,7 +1848,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_network_extension_stable.daemonsession_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_network_extension_stable.main_v1` ) CROSS JOIN UNNEST(events) AS event, @@ -1990,7 +1990,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.bedrock_stable.interaction_v1` + `moz-fx-data-shared-prod.bedrock_stable.events_v1` UNION ALL SELECT submission_timestamp, @@ -2000,7 +2000,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.bedrock_stable.non_interaction_v1` + `moz-fx-data-shared-prod.bedrock_stable.interaction_v1` UNION ALL SELECT submission_timestamp, @@ -2010,7 +2010,7 @@ client_info.app_display_version AS version, ping_info FROM - 
`moz-fx-data-shared-prod.bedrock_stable.events_v1` + `moz-fx-data-shared-prod.bedrock_stable.non_interaction_v1` ) CROSS JOIN UNNEST(events) AS event, @@ -2081,7 +2081,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.viu_politica_stable.video_index_v1` + `moz-fx-data-shared-prod.v ```

⚠️ Only part of the diff is displayed.

Link to full diff
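
Most of the `event_monitoring_aggregates_v1` hunks in the truncated report above only permute the tables inside a `UNION ALL` block. A minimal sketch for checking that a hunk removes and adds the same set of lines; the table names are copied from the `firefox_desktop_stable` hunks, while the helper name `is_reorder_only` is hypothetical and not part of bigquery-etl:

```python
from collections import Counter


def is_reorder_only(removed, added):
    """Return True when removed and added lines form the same multiset."""
    return Counter(removed) == Counter(added)


# "-" lines of the three firefox_desktop_stable hunks, in diff order.
removed = [
    "moz-fx-data-shared-prod.firefox_desktop_stable.newtab_v1",
    "moz-fx-data-shared-prod.firefox_desktop_stable.urlbar_potential_exposure_v1",
    "moz-fx-data-shared-prod.firefox_desktop_stable.events_v1",
]
# "+" lines of the same hunks, in diff order.
added = [
    "moz-fx-data-shared-prod.firefox_desktop_stable.urlbar_potential_exposure_v1",
    "moz-fx-data-shared-prod.firefox_desktop_stable.events_v1",
    "moz-fx-data-shared-prod.firefox_desktop_stable.newtab_v1",
]

print(is_reorder_only(removed, added))
```

If the check holds for every hunk of a file, the change is only a different generation order and not a change in which tables are read.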

dataops-ci-bot commented 1 month ago

Integration report for "Fix histogram_cast_json"
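
The `histogram_cast_json` change in this report builds the JSON string by concatenation, ordered by the numeric value of each key. A rough Python sketch of that behaviour, for illustration only: it is not the UDF, the function names are hypothetical stand-ins, and the number formatting is an assumption chosen to match the expected strings in the UDF's tests (for example `0` and not `0.0`).

```python
def _format_number(value):
    """Round to 4 decimals and drop a trailing ".0" (assumed formatting)."""
    rounded = round(float(value), 4)
    if rounded == int(rounded):
        return str(int(rounded))
    return repr(rounded)


def histogram_cast_json(histogram):
    """histogram: list of (key, value) pairs where key is a numeric string."""
    # Sort by the numeric value of the key, not lexicographically:
    # sorted(["0", "1", "10", "2"]) would leave "10" before "2".
    ordered = sorted(histogram, key=lambda pair: float(pair[0]))
    body = ",".join(f'"{key}":{_format_number(value)}' for key, value in ordered)
    return "{" + body + "}"


sample = [("0", 0.111111), ("1", 2.0 / 3), ("10", 100), ("2", 0)]
print(histogram_cast_json(sample))
```

The sample input is the one used in the new assertion of the diff; sorting by `float(key)` is what keeps bucket `"10"` after bucket `"2"`.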

sql.diff

Click to expand! ```diff diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/dags/bqetl_analytics_aggregations.py /tmp/workspace/generated-sql/dags/bqetl_analytics_aggregations.py --- /tmp/workspace/main-generated-sql/dags/bqetl_analytics_aggregations.py 2024-06-06 17:54:19.000000000 +0000 +++ /tmp/workspace/generated-sql/dags/bqetl_analytics_aggregations.py 2024-06-06 18:09:29.000000000 +0000 @@ -322,30 +322,6 @@ pool="DATA_ENG_EXTERNALTASKSENSOR", ) - wait_for_checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1 = ExternalTaskSensor( - task_id="wait_for_checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1", - external_dag_id="bqetl_glean_usage", - external_task_id="klar_android.checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1", - execution_delta=datetime.timedelta(seconds=8100), - check_existence=True, - mode="reschedule", - allowed_states=ALLOWED_STATES, - failed_states=FAILED_STATES, - pool="DATA_ENG_EXTERNALTASKSENSOR", - ) - - wait_for_klar_android_derived__metrics_clients_last_seen__v1 = ExternalTaskSensor( - task_id="wait_for_klar_android_derived__metrics_clients_last_seen__v1", - external_dag_id="bqetl_glean_usage", - external_task_id="klar_android.klar_android_derived__metrics_clients_last_seen__v1", - execution_delta=datetime.timedelta(seconds=8100), - check_existence=True, - mode="reschedule", - allowed_states=ALLOWED_STATES, - failed_states=FAILED_STATES, - pool="DATA_ENG_EXTERNALTASKSENSOR", - ) - wait_for_checks__fail_org_mozilla_ios_klar_derived__baseline_clients_last_seen__v1 = ExternalTaskSensor( task_id="wait_for_checks__fail_org_mozilla_ios_klar_derived__baseline_clients_last_seen__v1", external_dag_id="bqetl_glean_usage", @@ -562,37 +538,6 @@ checks__fail_focus_ios_derived__active_users_aggregates__v3 ) - checks__fail_klar_android_derived__active_users_aggregates__v3 = bigquery_dq_check( - task_id="checks__fail_klar_android_derived__active_users_aggregates__v3", - 
source_table='active_users_aggregates_v3${{ macros.ds_format(macros.ds_add(ds, -1), "%Y-%m-%d", "%Y%m%d") }}', - dataset_id="klar_android_derived", - project_id="moz-fx-data-shared-prod", - is_dq_check_fail=True, - owner="lvargas@mozilla.com", - email=[ - "gkaberere@mozilla.com", - "lvargas@mozilla.com", - "telemetry-alerts@mozilla.com", - ], - depends_on_past=False, - parameters=["submission_date:DATE:{{macros.ds_add(ds, -1)}}"], - retries=0, - ) - - with TaskGroup( - "checks__fail_klar_android_derived__active_users_aggregates__v3_external", - ) as checks__fail_klar_android_derived__active_users_aggregates__v3_external: - ExternalTaskMarker( - task_id="bqetl_search_dashboard__wait_for_checks__fail_klar_android_derived__active_users_aggregates__v3", - external_dag_id="bqetl_search_dashboard", - external_task_id="wait_for_checks__fail_klar_android_derived__active_users_aggregates__v3", - execution_date="{{ (execution_date - macros.timedelta(days=-1, seconds=85500)).isoformat() }}", - ) - - checks__fail_klar_android_derived__active_users_aggregates__v3_external.set_upstream( - checks__fail_klar_android_derived__active_users_aggregates__v3 - ) - checks__fail_klar_ios_derived__active_users_aggregates__v3 = bigquery_dq_check( task_id="checks__fail_klar_ios_derived__active_users_aggregates__v3", source_table='active_users_aggregates_v3${{ macros.ds_format(macros.ds_add(ds, -1), "%Y-%m-%d", "%Y%m%d") }}', @@ -709,23 +654,6 @@ retries=0, ) - checks__warn_klar_android_derived__active_users_aggregates__v3 = bigquery_dq_check( - task_id="checks__warn_klar_android_derived__active_users_aggregates__v3", - source_table='active_users_aggregates_v3${{ macros.ds_format(macros.ds_add(ds, -1), "%Y-%m-%d", "%Y%m%d") }}', - dataset_id="klar_android_derived", - project_id="moz-fx-data-shared-prod", - is_dq_check_fail=False, - owner="lvargas@mozilla.com", - email=[ - "gkaberere@mozilla.com", - "lvargas@mozilla.com", - "telemetry-alerts@mozilla.com", - ], - depends_on_past=False, - 
parameters=["submission_date:DATE:{{macros.ds_add(ds, -1)}}"], - retries=0, - ) - checks__warn_klar_ios_derived__active_users_aggregates__v3 = bigquery_dq_check( task_id="checks__warn_klar_ios_derived__active_users_aggregates__v3", source_table='active_users_aggregates_v3${{ macros.ds_format(macros.ds_add(ds, -1), "%Y-%m-%d", "%Y%m%d") }}', @@ -837,22 +765,6 @@ parameters=["submission_date:DATE:{{macros.ds_add(ds, -1)}}"], ) - klar_android_active_users_aggregates = bigquery_etl_query( - task_id="klar_android_active_users_aggregates", - destination_table='active_users_aggregates_v3${{ macros.ds_format(macros.ds_add(ds, -1), "%Y-%m-%d", "%Y%m%d") }}', - dataset_id="klar_android_derived", - project_id="moz-fx-data-shared-prod", - owner="lvargas@mozilla.com", - email=[ - "gkaberere@mozilla.com", - "lvargas@mozilla.com", - "telemetry-alerts@mozilla.com", - ], - date_partition_parameter=None, - depends_on_past=False, - parameters=["submission_date:DATE:{{macros.ds_add(ds, -1)}}"], - ) - klar_ios_active_users_aggregates = bigquery_etl_query( task_id="klar_ios_active_users_aggregates", destination_table='active_users_aggregates_v3${{ macros.ds_format(macros.ds_add(ds, -1), "%Y-%m-%d", "%Y%m%d") }}', @@ -945,10 +857,6 @@ focus_ios_active_users_aggregates ) - checks__fail_klar_android_derived__active_users_aggregates__v3.set_upstream( - klar_android_active_users_aggregates - ) - checks__fail_klar_ios_derived__active_users_aggregates__v3.set_upstream( klar_ios_active_users_aggregates ) @@ -997,10 +905,6 @@ focus_ios_active_users_aggregates ) - checks__warn_klar_android_derived__active_users_aggregates__v3.set_upstream( - klar_android_active_users_aggregates - ) - checks__warn_klar_ios_derived__active_users_aggregates__v3.set_upstream( klar_ios_active_users_aggregates ) @@ -1089,14 +993,6 @@ wait_for_focus_ios_derived__metrics_clients_last_seen__v1 ) - klar_android_active_users_aggregates.set_upstream( - 
wait_for_checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1 - ) - - klar_android_active_users_aggregates.set_upstream( - wait_for_klar_android_derived__metrics_clients_last_seen__v1 - ) - klar_ios_active_users_aggregates.set_upstream( wait_for_checks__fail_org_mozilla_ios_klar_derived__baseline_clients_last_seen__v1 ) diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/dags/bqetl_desktop_conv_evnt_categorization.py /tmp/workspace/generated-sql/dags/bqetl_desktop_conv_evnt_categorization.py --- /tmp/workspace/main-generated-sql/dags/bqetl_desktop_conv_evnt_categorization.py 2024-06-06 17:54:19.000000000 +0000 +++ /tmp/workspace/generated-sql/dags/bqetl_desktop_conv_evnt_categorization.py 2024-06-06 18:09:30.000000000 +0000 @@ -76,18 +76,6 @@ pool="DATA_ENG_EXTERNALTASKSENSOR", ) - wait_for_telemetry_derived__clients_first_seen__v1 = ExternalTaskSensor( - task_id="wait_for_telemetry_derived__clients_first_seen__v1", - external_dag_id="bqetl_main_summary", - external_task_id="telemetry_derived__clients_first_seen__v1", - execution_delta=datetime.timedelta(seconds=36000), - check_existence=True, - mode="reschedule", - allowed_states=ALLOWED_STATES, - failed_states=FAILED_STATES, - pool="DATA_ENG_EXTERNALTASKSENSOR", - ) - checks__warn_google_ads_derived__conversion_event_categorization__v1 = bigquery_dq_check( task_id="checks__warn_google_ads_derived__conversion_event_categorization__v1", source_table='conversion_event_categorization_v1${{ macros.ds_format(macros.ds_add(ds, -14), "%Y-%m-%d", "%Y%m%d") }}', @@ -126,7 +114,3 @@ google_ads_derived__conversion_event_categorization__v1.set_upstream( wait_for_checks__fail_telemetry_derived__clients_last_seen__v2 ) - - google_ads_derived__conversion_event_categorization__v1.set_upstream( - wait_for_telemetry_derived__clients_first_seen__v1 - ) diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/dags/bqetl_glean_usage.py 
/tmp/workspace/generated-sql/dags/bqetl_glean_usage.py --- /tmp/workspace/main-generated-sql/dags/bqetl_glean_usage.py 2024-06-06 17:54:19.000000000 +0000 +++ /tmp/workspace/generated-sql/dags/bqetl_glean_usage.py 2024-06-06 18:09:32.000000000 +0000 @@ -1191,13 +1191,6 @@ parent_group=task_group_klar_android, ) as checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1_external: ExternalTaskMarker( - task_id="bqetl_analytics_aggregations__wait_for_checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1", - external_dag_id="bqetl_analytics_aggregations", - external_task_id="wait_for_checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1", - execution_date="{{ (execution_date - macros.timedelta(days=-1, seconds=78300)).isoformat() }}", - ) - - ExternalTaskMarker( task_id="bqetl_mobile_kpi_metrics__wait_for_checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1", external_dag_id="bqetl_mobile_kpi_metrics", external_task_id="wait_for_checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1", @@ -2504,21 +2497,6 @@ task_group=task_group_klar_android, ) - with TaskGroup( - "klar_android_derived__metrics_clients_last_seen__v1_external", - parent_group=task_group_klar_android, - ) as klar_android_derived__metrics_clients_last_seen__v1_external: - ExternalTaskMarker( - task_id="bqetl_analytics_aggregations__wait_for_klar_android_derived__metrics_clients_last_seen__v1", - external_dag_id="bqetl_analytics_aggregations", - external_task_id="wait_for_klar_android_derived__metrics_clients_last_seen__v1", - execution_date="{{ (execution_date - macros.timedelta(days=-1, seconds=78300)).isoformat() }}", - ) - - klar_android_derived__metrics_clients_last_seen__v1_external.set_upstream( - klar_android_derived__metrics_clients_last_seen__v1 - ) - klar_ios_derived__clients_last_seen_joined__v1 = bigquery_etl_query( task_id="klar_ios_derived__clients_last_seen_joined__v1", 
destination_table="clients_last_seen_joined_v1", diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/dags/bqetl_main_summary.py /tmp/workspace/generated-sql/dags/bqetl_main_summary.py --- /tmp/workspace/main-generated-sql/dags/bqetl_main_summary.py 2024-06-06 17:54:19.000000000 +0000 +++ /tmp/workspace/generated-sql/dags/bqetl_main_summary.py 2024-06-06 18:09:27.000000000 +0000 @@ -498,20 +498,6 @@ priority_weight=80, ) - with TaskGroup( - "telemetry_derived__clients_first_seen__v1_external", - ) as telemetry_derived__clients_first_seen__v1_external: - ExternalTaskMarker( - task_id="bqetl_desktop_conv_evnt_categorization__wait_for_telemetry_derived__clients_first_seen__v1", - external_dag_id="bqetl_desktop_conv_evnt_categorization", - external_task_id="wait_for_telemetry_derived__clients_first_seen__v1", - execution_date="{{ (execution_date - macros.timedelta(days=-1, seconds=50400)).isoformat() }}", - ) - - telemetry_derived__clients_first_seen__v1_external.set_upstream( - telemetry_derived__clients_first_seen__v1 - ) - telemetry_derived__clients_last_seen__v1 = bigquery_etl_query( task_id="telemetry_derived__clients_last_seen__v1", destination_table="clients_last_seen_v1", diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/dags/bqetl_search_dashboard.py /tmp/workspace/generated-sql/dags/bqetl_search_dashboard.py --- /tmp/workspace/main-generated-sql/dags/bqetl_search_dashboard.py 2024-06-06 17:54:19.000000000 +0000 +++ /tmp/workspace/generated-sql/dags/bqetl_search_dashboard.py 2024-06-06 18:09:27.000000000 +0000 @@ -121,18 +121,6 @@ pool="DATA_ENG_EXTERNALTASKSENSOR", ) - wait_for_checks__fail_klar_android_derived__active_users_aggregates__v3 = ExternalTaskSensor( - task_id="wait_for_checks__fail_klar_android_derived__active_users_aggregates__v3", - external_dag_id="bqetl_analytics_aggregations", - external_task_id="checks__fail_klar_android_derived__active_users_aggregates__v3", - 
execution_delta=datetime.timedelta(seconds=900), - check_existence=True, - mode="reschedule", - allowed_states=ALLOWED_STATES, - failed_states=FAILED_STATES, - pool="DATA_ENG_EXTERNALTASKSENSOR", - ) - wait_for_checks__fail_klar_ios_derived__active_users_aggregates__v3 = ExternalTaskSensor( task_id="wait_for_checks__fail_klar_ios_derived__active_users_aggregates__v3", external_dag_id="bqetl_analytics_aggregations", @@ -259,10 +247,6 @@ ) search_derived__search_revenue_levers_daily__v1.set_upstream( - wait_for_checks__fail_klar_android_derived__active_users_aggregates__v3 - ) - - search_derived__search_revenue_levers_daily__v1.set_upstream( wait_for_checks__fail_klar_ios_derived__active_users_aggregates__v3 ) Only in /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android: active_users_aggregates Only in /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived: active_users_aggregates_v3 diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql --- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql 2024-06-06 17:50:41.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql 2024-06-06 17:51:30.000000000 +0000 @@ -1,17 +1,25 @@ --- udf_js_flatten -CREATE OR REPLACE FUNCTION glam.histogram_cast_json( - histogram ARRAY> -) -RETURNS STRING DETERMINISTIC -LANGUAGE js -AS - ''' - let obj = {}; - histogram.map(r => { - obj[r.key] = parseFloat(r.value.toFixed(4)); - }); - return JSON.stringify(obj); -'''; +/* +Casts an ARRAY> histogram to a JSON string. +This implementation uses String concatenation instead of +BigQuery native JSON (TO_JSON / JSON_OBJECT) functions to +preserve order. 
+https://cloud.google.com/bigquery/docs/reference/standard-sql/json_functions#json_object
+Order is important for GLAM histograms so other UDFs that
+operate on them, such as glam.percentile, can work correctly.
+*/
+CREATE OR REPLACE FUNCTION glam.histogram_cast_json(histogram ARRAY>)
+RETURNS STRING AS (
+  (
+    SELECT
+      CONCAT(
+        '{',
+        STRING_AGG(CONCAT('"', key, '":', ROUND(value, 4)) ORDER BY CAST(key AS FLOAT64)),
+        '}'
+      )
+    FROM
+      UNNEST(histogram)
+  )
+);
 
 SELECT
   assert.equals(
@@ -19,4 +27,11 @@
     glam.histogram_cast_json(
       ARRAY>[("0", 0.111111), ("1", 2.0 / 3), ("2", 0)]
     )
+  ),
+  assert.equals(
+    '{"0":0.1111,"1":0.6667,"2":0,"10":100}',
+    glam.histogram_cast_json(
+      ARRAY>[("0", 0.111111), ("1", 2.0 / 3), ("10", 100), ("2", 0)]
   )
+  ),
+  assert.equals('{}', glam.histogram_cast_json(ARRAY>[])),
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql
--- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql 2024-06-06 17:50:41.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql 2024-06-06 17:51:30.000000000 +0000
@@ -4,34 +4,29 @@
   buckets_per_magnitude INT64,
   range_max INT64
 )
-RETURNS ARRAY DETERMINISTIC
-LANGUAGE js
-AS
-  '''
-  function sample_to_bucket_index(sample) {
-    // Get the index of the sample
-    // https://github.com/mozilla/glean/blob/main/glean-core/src/histogram/functional.rs
-    let exponent = Math.pow(log_base, 1.0/buckets_per_magnitude);
-    return Math.ceil(Math.log(sample + 1) / Math.log(exponent));
-  }
-
-  let buckets = new Set([0]);
-  for (let index = 0; index < sample_to_bucket_index(range_max); index++) {
-
-    // Avoid re-using the exponent due to floating point issues when carrying
-    // the `pow` operation e.g. `let exponent = ...; Math.pow(exponent, index)`.
-    let bucket = Math.floor(Math.pow(log_base, index/buckets_per_magnitude));
-
-    // NOTE: the sample_to_bucket_index implementation overshoots the true index,
-    // so we break out early if we hit the max bucket range.
-    if (bucket > range_max) {
-      break;
-    }
-    buckets.add(bucket);
-  }
-
-  return [...buckets]
-''';
+RETURNS ARRAY AS (
+  (
+    WITH bucket_indexes AS (
+      -- Generate all bucket indexes
+      -- https://github.com/mozilla/glean/blob/main/glean-core/src/histogram/functional.rs
+      SELECT
+        GENERATE_ARRAY(0, CEIL(LOG(range_max + 1, log_base) * buckets_per_magnitude)) AS indexes
+    ),
+    buckets AS (
+      SELECT
+        FLOOR(POW(log_base, (idx) / buckets_per_magnitude)) AS bucket
+      FROM
+        bucket_indexes,
+        UNNEST(indexes) AS idx
+      WHERE
+        FLOOR(POW(log_base, (idx) / buckets_per_magnitude)) <= range_max
+    )
+    SELECT
+      ARRAY_CONCAT([0.0], ARRAY_AGG(DISTINCT(bucket) ORDER BY bucket))
+    FROM
+      buckets
+  )
+);
 
 SELECT
   -- First 50 keys of a timing distribution
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql
--- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql 2024-06-06 17:50:41.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql 2024-06-06 17:51:30.000000000 +0000
@@ -4,17 +4,17 @@
   max FLOAT64,
   nBuckets FLOAT64
 )
-RETURNS ARRAY DETERMINISTIC
-LANGUAGE js
-AS
-  '''
-  let result = [0];
-  for (let i = 1; i < Math.min(nBuckets, max, 10000); i++) {
-    let linearRange = (min * (nBuckets - 1 - i) + max * (i - 1)) / (nBuckets - 2);
-    result.push(Math.round(linearRange));
-  }
-  return result;
-''';
+RETURNS ARRAY AS (
+  ARRAY_CONCAT(
+    [0.0],
+    ARRAY(
+      SELECT
+        ROUND((min * (nBuckets - 1 - i) + max * (i - 1)) / (nBuckets - 2))
+      FROM
+        UNNEST(GENERATE_ARRAY(1, LEAST(nBuckets - 1, max, 10000))) AS i
+    )
+  )
+);
 
 SELECT
   -- Buckets of CONTENT_FRAME_TIME_VSYNC
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql
--- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql 2024-06-06 17:50:41.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql 2024-06-06 17:51:30.000000000 +0000
@@ -4,17 +4,14 @@
   max_bucket FLOAT64,
   num_buckets INT64
 )
-RETURNS ARRAY DETERMINISTIC
-LANGUAGE js
-AS
-  '''
-  let bucket_size = (max_bucket - min_bucket) / num_buckets;
-  let buckets = new Set();
-  for (let bucket = min_bucket; bucket < max_bucket; bucket += bucket_size) {
-    buckets.add(Math.pow(2, bucket).toFixed(2));
-  }
-  return Array.from(buckets);
-''';
+RETURNS ARRAY AS (
+  ARRAY(
+    SELECT
+      ROUND(POW(2, (max_bucket - min_bucket) / num_buckets * val), 2)
+    FROM
+      UNNEST(GENERATE_ARRAY(0, num_buckets - 1)) AS val
+  )
+);
 
 SELECT
   assert.array_equals([1, 2, 4, 8], glam.histogram_generate_scalar_buckets(0, LOG(16, 2), 4)),
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/percentile/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/percentile/udf.sql
--- /tmp/workspace/main-generated-sql/sql/mozfun/glam/percentile/udf.sql 2024-06-06 17:50:41.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/mozfun/glam/percentile/udf.sql 2024-06-06 17:51:30.000000000 +0000
@@ -1,37 +1,41 @@
 -- udf_js.glean_percentile
 CREATE OR REPLACE FUNCTION glam.percentile(
-  percentile FLOAT64,
+  pct FLOAT64,
   histogram ARRAY>,
   type STRING
 )
-RETURNS FLOAT64 DETERMINISTIC
-LANGUAGE js
-AS
-  '''
-  if (percentile < 0 || percentile > 100) {
-    throw "percentile must be a value between 0 and 100";
-  }
-
-  let values = histogram.map(bucket => bucket.value);
-  let total = values.reduce((a, b) => a + b);
-  let normalized = values.map(value => value / total);
-
-  // Find the index into the cumulative distribution function that corresponds
-  // to the percentile. This undershoots the true value of the percentile.
-  let acc = 0;
-  let index = null;
-  for (let i = 0; i < normalized.length; i++) {
-    acc += normalized[i];
-    index = i;
-    if (acc >= percentile / 100) {
-      break;
-    }
-  }
-
-  // NOTE: we do not perform geometric or linear interpolation, but this would
-  // be the place to implement it.
-  return histogram[index].key;
-''';
+RETURNS FLOAT64 AS (
+  (
+    WITH check AS (
+      SELECT
+        IF(
+          pct >= 0
+          AND pct <= 100,
+          TRUE,
+          ERROR('percentile must be a value between 0 and 100')
+        ) pct_ok
+    ),
+    keyed_cum_sum AS (
+      SELECT
+        key,
+        SUM(value) OVER (ORDER BY CAST(key AS FLOAT64)) / SUM(value) OVER () AS cum_sum
+      FROM
+        UNNEST(histogram)
+    )
+    SELECT
+      CAST(key AS FLOAT64)
+    FROM
+      keyed_cum_sum,
+      check
+    WHERE
+      check.pct_ok
+      AND cum_sum >= pct / 100
+    ORDER BY
+      cum_sum
+    LIMIT
+      1
+  )
+);
 
 SELECT
   assert.equals(
@@ -41,6 +45,30 @@
       ARRAY>[("0", 1), ("2", 2), ("3", 1)],
       "timing_distribution"
     )
+  ),
+  assert.equals(
+    3,
+    glam.percentile(
+      100.0,
+      ARRAY>[("0", 1), ("2", 2), ("3", 1)],
+      "timing_distribution"
+    )
+  ),
+  assert.equals(
+    0,
+    glam.percentile(
+      0.0,
+      ARRAY>[("0", 1), ("2", 2), ("3", 1)],
+      "timing_distribution"
+    )
+  ),
+  assert.equals(
+    2,
+    glam.percentile(
+      2.0,
+      ARRAY>[("0", 1), ("2", 2), ("10", 10), ("11", 100)],
+      "timing_distribution"
+    )
   );
 
 #xfail
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/event_monitoring_live_v1/materialized_view.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 17:51:20.000000000 +0000
+++
/tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 17:53:38.000000000 +0000 @@ -50,7 +50,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.bedrock_live.interaction_v1` + `moz-fx-data-shared-prod.bedrock_live.events_v1` UNION ALL SELECT submission_timestamp, @@ -60,7 +60,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.bedrock_live.non_interaction_v1` + `moz-fx-data-shared-prod.bedrock_live.interaction_v1` UNION ALL SELECT submission_timestamp, @@ -70,7 +70,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.bedrock_live.events_v1` + `moz-fx-data-shared-prod.bedrock_live.non_interaction_v1` ) CROSS JOIN UNNEST(events) AS event, diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates/schema.yaml 2024-06-06 17:50:40.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates/schema.yaml 2024-06-06 18:00:00.000000000 +0000 @@ -1,49 +1,49 @@ fields: -- mode: NULLABLE - name: submission_date +- name: submission_date type: DATE -- mode: NULLABLE - name: source + mode: NULLABLE +- name: source type: STRING -- mode: NULLABLE - name: event_type + mode: NULLABLE +- name: event_type type: STRING -- mode: NULLABLE - name: form_factor + mode: NULLABLE +- name: form_factor type: STRING -- mode: NULLABLE - name: country + mode: NULLABLE +- name: country type: STRING -- mode: NULLABLE - name: subdivision1 + mode: NULLABLE +- name: subdivision1 type: STRING -- mode: NULLABLE - name: advertiser + mode: NULLABLE +- name: advertiser type: 
STRING -- mode: NULLABLE - name: release_channel + mode: NULLABLE +- name: release_channel type: STRING -- mode: NULLABLE - name: position + mode: NULLABLE +- name: position type: INTEGER -- mode: NULLABLE - name: provider + mode: NULLABLE +- name: provider type: STRING -- mode: NULLABLE - name: match_type + mode: NULLABLE +- name: match_type type: STRING -- mode: NULLABLE - name: normalized_os + mode: NULLABLE +- name: normalized_os type: STRING -- mode: NULLABLE - name: suggest_data_sharing_enabled + mode: NULLABLE +- name: suggest_data_sharing_enabled type: BOOLEAN -- mode: NULLABLE - name: event_count + mode: NULLABLE +- name: event_count type: INTEGER -- mode: NULLABLE - name: user_count + mode: NULLABLE +- name: user_count type: INTEGER -- mode: NULLABLE - name: query_type + mode: NULLABLE +- name: query_type type: STRING + mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates_suggest/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates_suggest/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates_suggest/schema.yaml 2024-06-06 17:50:40.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates_suggest/schema.yaml 2024-06-06 17:59:59.000000000 +0000 @@ -1,40 +1,40 @@ fields: -- mode: NULLABLE - name: submission_date +- name: submission_date type: DATE -- mode: NULLABLE - name: form_factor + mode: NULLABLE +- name: form_factor type: STRING -- mode: NULLABLE - name: country + mode: NULLABLE +- name: country type: STRING -- mode: NULLABLE - name: advertiser + mode: NULLABLE +- name: advertiser type: STRING -- mode: NULLABLE - name: normalized_os + mode: NULLABLE +- name: normalized_os type: STRING -- mode: NULLABLE - name: release_channel + mode: NULLABLE +- name: release_channel type: STRING -- 
mode: NULLABLE - name: position + mode: NULLABLE +- name: position type: INTEGER -- mode: NULLABLE - name: provider + mode: NULLABLE +- name: provider type: STRING -- mode: NULLABLE - name: match_type + mode: NULLABLE +- name: match_type type: STRING -- mode: NULLABLE - name: suggest_data_sharing_enabled + mode: NULLABLE +- name: suggest_data_sharing_enabled type: BOOLEAN -- mode: NULLABLE - name: impression_count + mode: NULLABLE +- name: impression_count type: INTEGER -- mode: NULLABLE - name: click_count + mode: NULLABLE +- name: click_count type: INTEGER -- mode: NULLABLE - name: query_type + mode: NULLABLE +- name: query_type type: STRING + mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_clients/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_clients/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_clients/schema.yaml 2024-06-06 17:50:40.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_clients/schema.yaml 2024-06-06 18:00:26.000000000 +0000 @@ -26,6 +26,9 @@ - name: adjust_network type: STRING mode: NULLABLE +- name: install_source + type: STRING + mode: NULLABLE - name: retained_week_2 type: BOOLEAN mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_week_4/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_week_4/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_week_4/schema.yaml 2024-06-06 17:50:40.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_week_4/schema.yaml 2024-06-06 18:00:27.000000000 +0000 @@ -48,6 +48,10 @@ description: 'The type of source of a client installation. 
' +- name: install_source + type: STRING + mode: NULLABLE + description: null - name: new_profiles type: INTEGER mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/event_monitoring_live_v1/materialized_view.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 17:51:19.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 17:53:39.000000000 +0000 @@ -50,7 +50,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_live.events_v1` + `moz-fx-data-shared-prod.firefox_desktop_live.urlbar_potential_exposure_v1` UNION ALL SELECT submission_timestamp, @@ -60,7 +60,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_live.urlbar_potential_exposure_v1` + `moz-fx-data-shared-prod.firefox_desktop_live.newtab_v1` UNION ALL SELECT submission_timestamp, @@ -70,7 +70,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_live.newtab_v1` + `moz-fx-data-shared-prod.firefox_desktop_live.prototype_no_code_events_v1` UNION ALL SELECT submission_timestamp, @@ -80,7 +80,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_live.prototype_no_code_events_v1` + `moz-fx-data-shared-prod.firefox_desktop_live.events_v1` ) CROSS JOIN UNNEST(events) AS event, diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/google_ads/conversion_event_categorization/schema.yaml 
/tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/google_ads/conversion_event_categorization/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/google_ads/conversion_event_categorization/schema.yaml 2024-06-06 17:51:20.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/google_ads/conversion_event_categorization/schema.yaml 2024-06-06 18:00:09.000000000 +0000 @@ -34,35 +34,24 @@ - name: first_main_ping_date type: DATE mode: NULLABLE - description: First Main Ping Date - name: country type: STRING mode: NULLABLE - description: Country - name: dou type: INTEGER mode: NULLABLE - description: DOU - name: active_hours_sum type: FLOAT mode: NULLABLE - description: Active Hours Sum - name: search_with_ads_count_all type: INTEGER mode: NULLABLE - description: Search With Ads Count All - name: event_1 type: BOOLEAN mode: NULLABLE - description: Event 1 Indicator - 5 or more days of use and 1 or more search with - ads (strictest event) - name: event_2 type: BOOLEAN mode: NULLABLE - description: Event 2 Indicator - 3 or more days of use and 1 or more search with - ads (medium event) - name: event_3 type: BOOLEAN mode: NULLABLE - description: Event 3 Indicator - 3 or more days of use and 0.4 or more active hours - (most lenient event) diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/conversion_event_categorization_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/conversion_event_categorization_v1/query.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/conversion_event_categorization_v1/query.sql 2024-06-06 17:50:41.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/conversion_event_categorization_v1/query.sql 2024-06-06 17:51:30.000000000 +0000 @@ -3,35 +3,27 @@ --Note: Max cohort date cannot be more than 7 days ago (to ensure 
we always have at least 7 days of data) WITH clients_first_seen_14_days_ago AS ( SELECT - cfs.client_id, - cfs.first_seen_date, - m.first_seen_date AS first_main_ping_date, - cfs.country, - cfs.attribution_campaign, - cfs.attribution_content, - cfs.attribution_dltoken, - cfs.attribution_medium, - cfs.attribution_source + client_id, + first_seen_date, + country, + attribution_campaign, + attribution_content, + attribution_dltoken, + attribution_medium, + attribution_source FROM - `moz-fx-data-shared-prod.telemetry.clients_first_seen` cfs --contains all new clients, including those that never sent a main ping - LEFT JOIN - `moz-fx-data-shared-prod.telemetry_derived.clients_first_seen_v1` m -- the "old" CFS table, contains the date of the client's *first main ping* - ON cfs.client_id = m.client_id - AND m.first_seen_date - -- join so that we only get "first main ping" dates from clients that sent their first main ping within -1 and +6 days from their first_seen_date. - -- we will miss ~5% of clients that send their first main ping later, this is a trade-off we make to have a two-week reporting cadence (one week to send their first main ping, then we report on the outcomes *one week after that* - BETWEEN DATE_SUB(cfs.first_seen_date, INTERVAL 1 DAY) - AND DATE_ADD(cfs.first_seen_date, INTERVAL 6 DAY) + `moz-fx-data-shared-prod.telemetry.clients_first_seen` --contains all new clients, including those that never sent a main ping WHERE - cfs.first_seen_date = @report_date --this is 14 days before {{ds}} - AND cfs.first_seen_date >= '2023-11-01' + first_seen_date = @report_date --this is 14 days before {{ds}} + AND first_seen_date + BETWEEN '2023-11-01' + AND DATE_SUB(CURRENT_DATE, INTERVAL 8 DAY) ), --Step 2: Get only the columns we need from clients last seen, for only the small window of time we need clients_last_seen_raw AS ( SELECT cls.client_id, cls.first_seen_date, - clients.first_main_ping_date, cls.country, cls.submission_date, cls.days_since_seen, @@ -45,11 
+37,14 @@ clients_first_seen_14_days_ago clients ON cls.client_id = clients.client_id WHERE - cls.submission_date - -- join the clients_last_seen so that we get the first 7 days of each client's main ping records (for the clients that sent > 0 main pings in their first week) - BETWEEN clients.first_main_ping_date - AND DATE_ADD(clients.first_main_ping_date, INTERVAL 6 DAY) - AND cls.submission_date >= DATE_SUB(@report_date, INTERVAL 1 DAY) + cls.submission_date >= '2023-11-01' --first cohort date + AND cls.submission_date + BETWEEN cls.first_seen_date + AND DATE_ADD(cls.first_seen_date, INTERVAL 6 DAY) --get first 7 days from their first main ping + --to process less data, we only check for pings between @submission date - 15 days and submission date + 15 days for each date this runs + AND cls.submission_date + BETWEEN DATE_SUB(@report_date, INTERVAL 1 DAY) --15 days before DS + AND DATE_ADD(@report_date, INTERVAL 29 DAY) --15 days after DS ), --STEP 2: For every client, get the first 7 days worth of main pings sent after their first main ping client_activity_first_7_days AS ( @@ -60,13 +55,13 @@ ) AS first_seen_date, --date we got first main ping (potentially different than above first seen date) ANY_VALUE( CASE - WHEN first_main_ping_date = submission_date + WHEN first_seen_date = submission_date THEN country END ) AS country, --any country from their first day in clients_last_seen ANY_VALUE( CASE - WHEN first_main_ping_date = DATE_ADD(first_seen_date, INTERVAL 6 DAY) + WHEN submission_date = DATE_ADD(first_seen_date, INTERVAL 6 DAY) THEN BIT_COUNT(days_visited_1_uri_bits & days_interacted_bits) END ) AS dou, --total # of days of activity during their first 7 days of main pings @@ -100,7 +95,7 @@ cfs.attribution_dltoken, cfs.attribution_medium, cfs.attribution_source, - cfs.first_main_ping_date, + IF(cls.first_seen_date IS NOT NULL, TRUE, FALSE) AS sent_main_ping_in_first_7_days, COALESCE( cls.country, cfs.country @@ -123,7 +118,7 @@ attribution_medium, 
attribution_source, @submission_date AS report_date, - first_main_ping_date, + sent_main_ping_in_first_7_days, country, dou, active_hours_sum, diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/conversion_event_categorization_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/conversion_event_categorization_v1/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/conversion_event_categorization_v1/schema.yaml 2024-06-06 17:50:41.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/conversion_event_categorization_v1/schema.yaml 2024-06-06 17:51:30.000000000 +0000 @@ -32,9 +32,9 @@ type: DATE description: Report Date - mode: NULLABLE - name: first_main_ping_date - type: DATE - description: First Main Ping Date + name: sent_main_ping_in_first_7_days + type: BOOLEAN + description: Sent Main Ping In First 7 Days After First Seen Date Indicator - mode: NULLABLE name: country type: STRING diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/internet_outages/global_outages_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/internet_outages/global_outages_v1/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/internet_outages/global_outages_v1/schema.yaml 2024-06-06 17:50:40.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/internet_outages/global_outages_v1/schema.yaml 2024-06-06 18:00:58.000000000 +0000 @@ -1,49 +1,49 @@ fields: -- mode: NULLABLE - name: country +- name: country type: STRING -- mode: NULLABLE - name: city + mode: NULLABLE +- name: city type: STRING -- mode: NULLABLE - name: datetime + mode: NULLABLE +- name: datetime type: TIMESTAMP -- mode: NULLABLE - name: proportion_undefined + mode: NULLABLE +- name: proportion_undefined type: FLOAT -- mode: 
NULLABLE - name: proportion_timeout + mode: NULLABLE +- name: proportion_timeout type: FLOAT -- mode: NULLABLE - name: proportion_abort + mode: NULLABLE +- name: proportion_abort type: FLOAT -- mode: NULLABLE - name: proportion_unreachable + mode: NULLABLE +- name: proportion_unreachable type: FLOAT -- mode: NULLABLE - name: proportion_terminated + mode: NULLABLE +- name: proportion_terminated type: FLOAT -- mode: NULLABLE - name: proportion_channel_open + mode: NULLABLE +- name: proportion_channel_open type: FLOAT -- mode: NULLABLE - name: avg_dns_success_time + mode: NULLABLE +- name: avg_dns_success_time type: FLOAT -- mode: NULLABLE - name: missing_dns_success + mode: NULLABLE +- name: missing_dns_success type: FLOAT -- mode: NULLABLE - name: avg_dns_failure_time + mode: NULLABLE +- name: avg_dns_failure_time type: FLOAT -- mode: NULLABLE - name: missing_dns_failure + mode: NULLABLE +- name: missing_dns_failure type: FLOAT -- mode: NULLABLE - name: count_dns_failure + mode: NULLABLE +- name: count_dns_failure type: FLOAT -- mode: NULLABLE - name: ssl_error_prop + mode: NULLABLE +- name: ssl_error_prop type: FLOAT -- mode: NULLABLE - name: avg_tls_handshake_time + mode: NULLABLE +- name: avg_tls_handshake_time type: FLOAT + mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android/active_users_aggregates/metadata.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android/active_users_aggregates/metadata.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android/active_users_aggregates/metadata.yaml 2024-06-06 17:51:20.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android/active_users_aggregates/metadata.yaml 1970-01-01 00:00:00.000000000 +0000 @@ -1,13 +0,0 @@ -friendly_name: Active Users Aggregates -description: |- - Please provide a description for the query -owners: [] -labels: {} -bigquery: null -workgroup_access: 
-- role: roles/bigquery.dataViewer - members: - - workgroup:mozilla-confidential -references: - view.sql: - - moz-fx-data-shared-prod.klar_android_derived.active_users_aggregates_v3 diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android/active_users_aggregates/view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android/active_users_aggregates/view.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android/active_users_aggregates/view.sql 2024-06-06 17:51:20.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android/active_users_aggregates/view.sql 1970-01-01 00:00:00.000000000 +0000 @@ -1,15 +0,0 @@ ---- User-facing view. Generated via sql_generators.active_users. -CREATE OR REPLACE VIEW - `moz-fx-data-shared-prod.klar_android.active_users_aggregates` -AS -SELECT - * EXCEPT (app_version, app_name), - app_name, - app_version, - `mozfun.norm.browser_version_info`(app_version).major_version AS app_version_major, - `mozfun.norm.browser_version_info`(app_version).minor_version AS app_version_minor, - `mozfun.norm.browser_version_info`(app_version).patch_revision AS app_version_patch_revision, - `mozfun.norm.browser_version_info`(app_version).is_major_release AS app_version_is_major_release, - `mozfun.norm.os`(os) AS os_grouped -FROM - `moz-fx-data-shared-prod.klar_android_derived.active_users_aggregates_v3` diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/checks.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/checks.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/checks.sql 2024-06-06 17:51:20.000000000 +0000 +++ 
/tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/checks.sql 1970-01-01 00:00:00.000000000 +0000 @@ -1,69 +0,0 @@ - - -#warn -WITH daily_users_sum AS ( - SELECT - SUM(daily_users), - FROM - `{{ project_id }}.{{ dataset_id }}.{{ table_name }}` - WHERE - submission_date = @submission_date - ), -distinct_client_count_base AS ( - SELECT - COUNT(DISTINCT client_info.client_id) AS distinct_client_count, - FROM - `moz-fx-data-shared-prod.org_mozilla_klar_live.baseline_v1` - WHERE - DATE(submission_timestamp) = @submission_date - ), -distinct_client_count AS ( - SELECT - SUM(distinct_client_count) - FROM - distinct_client_count_base -) -SELECT - IF( - ABS((SELECT * FROM daily_users_sum) - (SELECT * FROM distinct_client_count)) > 10, - ERROR( - CONCAT( - "Daily users mismatch between the klar_android live across all channels (`moz-fx-data-shared-prod.org_mozilla_klar_live.baseline_v1`,) and active_users_aggregates (`{{ dataset_id }}.{{ table_name }}`) tables is greater than 10.", - " Live table count: ", - (SELECT * FROM distinct_client_count), - " | active_users_aggregates (daily_users): ", - (SELECT * FROM daily_users_sum), - " | Delta detected: ", - ABS((SELECT * FROM daily_users_sum) - (SELECT * FROM distinct_client_count)) - ) - ), - NULL - ); - -#fail -WITH dau_current AS ( - SELECT - SUM(dau) AS dau - FROM - `{{ project_id }}.{{ dataset_id }}.{{ table_name }}` - WHERE - submission_date = @submission_date -), -dau_previous AS ( - SELECT - SUM(dau) AS dau - FROM - `{{ project_id }}.{{ dataset_id }}.{{ table_name }}` - WHERE - submission_date = DATE_SUB(@submission_date, INTERVAL 1 DAY) -) -SELECT - IF( - ABS((SELECT SUM(dau) FROM dau_current) / (SELECT SUM(dau) FROM dau_previous)) > 1.5, - ERROR( - "Current date's DAU is 50% higher than in previous date. See source table (`{{ project_id }}.{{ dataset_id }}.{{ table_name }}`)!" 
-  ),
-  NULL
-);
-
-
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/metadata.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/metadata.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/metadata.yaml 2024-06-06 17:51:20.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/metadata.yaml 1970-01-01 00:00:00.000000000 +0000
@@ -1,54 +0,0 @@
-friendly_name: Klar Android Active Users Aggregates
-description: |-
-  This table contains daily/weekly/monthly active users,
-  new profiles, searches and ad_clicks for Klar Android,
-  aggregated by submission_date, attribution, channel,
-  country, city, device model, distribution_id, os details
-  and activity segment.
-
-  - dau is counting the users who reported a ping on the date and
-  are qualified as active users.
-  - daily_users counts all the users who reported a ping on the date.
-  Only dau is exposed in the view telemetry.active_users_aggregates.
-
-  The table is labeled as "change_controlled", which implies
-  that changes require the approval of at least one owner.
-
-  Proposal:
-  https://docs.google.com/document/d/1qvWO49Lr_Z_WErh3I3058A3B1YuiuURx19K3aTdmejM/edit?usp=sharing
-owners:
-- lvargas@mozilla.com
-- mozilla/kpi_table_reviewers
-labels:
-  incremental: true
-  change_controlled: true
-  dag: bqetl_analytics_aggregations
-  owner1: lvargas
-scheduling:
-  dag_name: bqetl_analytics_aggregations
-  task_name: klar_android_active_users_aggregates
-  date_partition_offset: -1
-bigquery:
-  time_partitioning:
-    type: day
-    field: submission_date
-    require_partition_filter: true
-    expiration_days: null
-  range_partitioning: null
-  clustering:
-    fields:
-    - country
-    - app_name
-    - attribution_medium
-    - channel
-workgroup_access:
-- role: roles/bigquery.dataViewer
-  members:
-  - workgroup:mozilla-confidential
-references:
-  checks.sql:
-  - ..
-  - moz-fx-data-shared-prod.org_mozilla_klar_live.baseline_v1
-  query.sql:
-  - moz-fx-data-shared-prod.klar_android.active_users
-  - moz-fx-data-shared-prod.klar_android.metrics_clients_last_seen
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/query.sql 2024-06-06 17:51:20.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/query.sql 1970-01-01 00:00:00.000000000 +0000
@@ -1,170 +0,0 @@
---- Query generated via sql_generators.active_users.
-WITH baseline AS (
-  SELECT
-    submission_date,
-    normalized_channel,
-    client_id,
-    days_active_bits,
-    days_created_profile_bits,
-    normalized_os,
-    normalized_os_version,
-    locale,
-    city,
-    country,
-    app_display_version,
-    device_model,
-    first_seen_date,
-    submission_date = first_seen_date AS is_new_profile,
-    CAST(NULL AS string) AS distribution_id,
-    isp,
-    app_name,
-    activity_segment AS segment,
-    is_daily_user,
-    is_weekly_user,
-    is_monthly_user,
-    is_dau,
-    is_wau,
-    is_mau
-  FROM
-    `moz-fx-data-shared-prod.klar_android.active_users`
-  WHERE
-    submission_date = @submission_date
-),
-metrics AS (
-  -- Metrics ping may arrive in the same or next day as the baseline ping.
-  SELECT
-    client_id,
-    ARRAY_AGG(normalized_channel IGNORE NULLS ORDER BY submission_date ASC)[
-      SAFE_OFFSET(0)
-    ] AS normalized_channel,
-    CAST(NULL AS INTEGER) AS uri_count,
-    CAST(NULL AS INTEGER) AS is_default_browser,
-  FROM
-    `moz-fx-data-shared-prod.klar_android.metrics_clients_last_seen`
-  WHERE
-    DATE(submission_date)
-    BETWEEN @submission_date
-    AND DATE_ADD(@submission_date, INTERVAL 1 DAY)
-  GROUP BY
-    client_id
-),
-unioned AS (
-  SELECT
-    baseline.client_id,
-    baseline.segment,
-    baseline.app_name,
-    baseline.app_display_version AS app_version,
-    baseline.normalized_channel,
-    IFNULL(baseline.country, '??') country,
-    baseline.city,
-    baseline.days_created_profile_bits,
-    baseline.device_model,
-    baseline.isp,
-    baseline.is_new_profile,
-    baseline.locale,
-    baseline.first_seen_date,
-    baseline.normalized_os,
-    baseline.normalized_os_version,
-    COALESCE(
-      SAFE_CAST(NULLIF(SPLIT(baseline.normalized_os_version, ".")[SAFE_OFFSET(0)], "") AS INTEGER),
-      0
-    ) AS os_version_major,
-    COALESCE(
-      SAFE_CAST(NULLIF(SPLIT(baseline.normalized_os_version, ".")[SAFE_OFFSET(1)], "") AS INTEGER),
-      0
-    ) AS os_version_minor,
-    COALESCE(
-      SAFE_CAST(NULLIF(SPLIT(baseline.normalized_os_version, ".")[SAFE_OFFSET(2)], "") AS INTEGER),
-      0
-    ) AS os_version_patch,
-    baseline.submission_date,
-    metrics.uri_count,
-    metrics.is_default_browser,
-    baseline.distribution_id,
-    CAST(NULL AS string) AS attribution_content,
-    CAST(NULL AS string) AS attribution_source,
-    CAST(NULL AS string) AS attribution_medium,
-    CAST(NULL AS string) AS attribution_campaign,
-    CAST(NULL AS string) AS attribution_experiment,
-    CAST(NULL AS string) AS attribution_variation,
-    CAST(NULL AS FLOAT64) AS active_hours_sum,
-    is_daily_user,
-    is_weekly_user,
-    is_monthly_user,
-    is_dau,
-    is_wau,
-    is_mau
-  FROM
-    baseline
-  LEFT JOIN
-    metrics
-    ON baseline.client_id = metrics.client_id
-    AND baseline.normalized_channel IS NOT DISTINCT FROM metrics.normalized_channel
-),
-unioned_with_attribution AS (
-  SELECT
-    unioned.*,
-    CAST(NULL AS STRING) AS install_source,
-    CAST(NULL AS STRING) AS adjust_network
-  FROM
-    unioned
-),
-todays_metrics AS (
-  SELECT
-    segment,
-    app_version,
-    attribution_medium,
-    attribution_source,
-    attribution_medium IS NOT NULL
-    OR attribution_source IS NOT NULL AS attributed,
-    city,
-    country,
-    distribution_id,
-    EXTRACT(YEAR FROM first_seen_date) AS first_seen_year,
-    is_default_browser,
-    COALESCE(REGEXP_EXTRACT(locale, r'^(.+?)-'), locale, NULL) AS locale,
-    app_name AS app_name,
-    normalized_channel AS channel,
-    normalized_os AS os,
-    normalized_os_version AS os_version,
-    os_version_major,
-    os_version_minor,
-    submission_date,
-    client_id,
-    uri_count,
-    active_hours_sum,
-    adjust_network,
-    install_source,
-    is_daily_user,
-    is_weekly_user,
-    is_monthly_user,
-    is_dau,
-    is_wau,
-    is_mau
-  FROM
-    unioned_with_attribution
-)
-SELECT
-  todays_metrics.* EXCEPT (
-    client_id,
-    is_daily_user,
-    is_weekly_user,
-    is_monthly_user,
-    is_dau,
-    is_wau,
-    is_mau,
-    uri_count,
-    active_hours_sum
-  ),
-  COUNTIF(is_daily_user) AS daily_users,
-  COUNTIF(is_weekly_user) AS weekly_users,
-  COUNTIF(is_monthly_user) AS monthly_users,
-  COUNTIF(is_dau) AS dau,
-  COUNTIF(is_wau) AS wau,
-  COUNTIF(is_mau) AS mau,
-  SUM(uri_count) AS uri_count,
-  SUM(active_hours_sum) AS active_hours,
-FROM
-  todays_metrics
-GROUP BY
-  ALL
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/schema.yaml 2024-06-06 17:51:20.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/schema.yaml 1970-01-01 00:00:00.000000000 +0000
@@ -1,85 +0,0 @@
-fields:
-- name: segment
-  type: STRING
-  mode: NULLABLE
-- name: app_version
-  type: STRING
-  mode: NULLABLE
-- name: attribution_medium
-  type: STRING
-  mode: NULLABLE
-- name: attribution_source
-  type: STRING
-  mode: NULLABLE
-- name: attributed
-  type: BOOLEAN
-  mode: NULLABLE
-- name: city
-  type: STRING
-  mode: NULLABLE
-- name: country
-  type: STRING
-  mode: NULLABLE
-- name: distribution_id
-  type: STRING
-  mode: NULLABLE
-- name: first_seen_year
-  type: INTEGER
-  mode: NULLABLE
-- name: is_default_browser
-  type: BOOLEAN
-  mode: NULLABLE
-- name: locale
-  type: STRING
-  mode: NULLABLE
-- name: app_name
-  type: STRING
-  mode: NULLABLE
-- name: channel
-  type: STRING
-  mode: NULLABLE
-- name: os
-  type: STRING
-  mode: NULLABLE
-- name: os_version
-  type: STRING
-  mode: NULLABLE
-- name: os_version_major
-  type: INTEGER
-  mode: NULLABLE
-- name: os_version_minor
-  type: INTEGER
-  mode: NULLABLE
-- name: submission_date
-  type: DATE
-  mode: NULLABLE
-- name: adjust_network
-  type: STRING
-  mode: NULLABLE
-- name: install_source
-  type: STRING
-  mode: NULLABLE
-- name: daily_users
-  type: INTEGER
-  mode: NULLABLE
-- name: weekly_users
-  type: INTEGER
-  mode: NULLABLE
-- name: monthly_users
-  type: INTEGER
-  mode: NULLABLE
-- name: dau
-  type: INTEGER
-  mode: NULLABLE
-- name: wau
-  type: INTEGER
-  mode: NULLABLE
-- name: mau
-  type: INTEGER
-  mode: NULLABLE
-- name: uri_count
-  type: INTEGER
-  mode: NULLABLE
-- name: active_hours
-  type: FLOAT64
-  mode: NULLABLE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/monitoring_derived/event_monitoring_aggregates_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/monitoring_derived/event_monitoring_aggregates_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/monitoring_derived/event_monitoring_aggregates_v1/query.sql 2024-06-06 17:51:20.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/monitoring_derived/event_monitoring_aggregates_v1/query.sql 2024-06-06 17:55:32.000000000 +0000
@@ -45,7 +45,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.firefox_desktop_stable.events_v1`
+        `moz-fx-data-shared-prod.firefox_desktop_stable.urlbar_potential_exposure_v1`
       UNION ALL
       SELECT
         submission_timestamp,
@@ -55,7 +55,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.firefox_desktop_stable.urlbar_potential_exposure_v1`
+        `moz-fx-data-shared-prod.firefox_desktop_stable.newtab_v1`
       UNION ALL
       SELECT
         submission_timestamp,
@@ -65,7 +65,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.firefox_desktop_stable.newtab_v1`
+        `moz-fx-data-shared-prod.firefox_desktop_stable.prototype_no_code_events_v1`
       UNION ALL
       SELECT
         submission_timestamp,
@@ -75,7 +75,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.firefox_desktop_stable.prototype_no_code_events_v1`
+        `moz-fx-data-shared-prod.firefox_desktop_stable.events_v1`
     )
   CROSS JOIN
     UNNEST(events) AS event,
@@ -572,7 +572,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.org_mozilla_ios_firefox_stable.metrics_v1`
+        `moz-fx-data-shared-prod.org_mozilla_ios_firefox_stable.first_session_v1`
       UNION ALL
       SELECT
         submission_timestamp,
@@ -582,7 +582,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.org_mozilla_ios_firefox_stable.first_session_v1`
+        `moz-fx-data-shared-prod.org_mozilla_ios_firefox_stable.events_v1`
       UNION ALL
       SELECT
         submission_timestamp,
@@ -592,7 +592,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.org_mozilla_ios_firefox_stable.events_v1`
+        `moz-fx-data-shared-prod.org_mozilla_ios_firefox_stable.metrics_v1`
     )
   CROSS JOIN
     UNNEST(events) AS event,
@@ -663,7 +663,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_stable.metrics_v1`
+        `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_stable.first_session_v1`
       UNION ALL
       SELECT
         submission_timestamp,
@@ -673,7 +673,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_stable.first_session_v1`
+        `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_stable.events_v1`
       UNION ALL
       SELECT
         submission_timestamp,
@@ -683,7 +683,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_stable.events_v1`
+        `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_stable.metrics_v1`
     )
   CROSS JOIN
     UNNEST(events) AS event,
@@ -754,7 +754,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.org_mozilla_ios_fennec_stable.metrics_v1`
+        `moz-fx-data-shared-prod.org_mozilla_ios_fennec_stable.first_session_v1`
       UNION ALL
       SELECT
         submission_timestamp,
@@ -764,7 +764,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.org_mozilla_ios_fennec_stable.first_session_v1`
+        `moz-fx-data-shared-prod.org_mozilla_ios_fennec_stable.events_v1`
       UNION ALL
       SELECT
         submission_timestamp,
@@ -774,7 +774,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.org_mozilla_ios_fennec_stable.events_v1`
+        `moz-fx-data-shared-prod.org_mozilla_ios_fennec_stable.metrics_v1`
     )
   CROSS JOIN
     UNNEST(events) AS event,
@@ -1555,7 +1555,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.mozillavpn_stable.daemonsession_v1`
+        `moz-fx-data-shared-prod.mozillavpn_stable.main_v1`
       UNION ALL
       SELECT
         submission_timestamp,
@@ -1565,7 +1565,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.mozillavpn_stable.main_v1`
+        `moz-fx-data-shared-prod.mozillavpn_stable.daemonsession_v1`
       UNION ALL
       SELECT
         submission_timestamp,
@@ -1646,7 +1646,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.org_mozilla_firefox_vpn_stable.daemonsession_v1`
+        `moz-fx-data-shared-prod.org_mozilla_firefox_vpn_stable.main_v1`
       UNION ALL
       SELECT
         submission_timestamp,
@@ -1656,7 +1656,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.org_mozilla_firefox_vpn_stable.main_v1`
+        `moz-fx-data-shared-prod.org_mozilla_firefox_vpn_stable.daemonsession_v1`
       UNION ALL
       SELECT
         submission_timestamp,
@@ -1737,7 +1737,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_stable.daemonsession_v1`
+        `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_stable.main_v1`
       UNION ALL
       SELECT
         submission_timestamp,
@@ -1747,7 +1747,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_stable.main_v1`
+        `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_stable.daemonsession_v1`
       UNION ALL
       SELECT
         submission_timestamp,
@@ -1828,7 +1828,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_network_extension_stable.daemonsession_v1`
+        `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_network_extension_stable.main_v1`
       UNION ALL
       SELECT
         submission_timestamp,
@@ -1838,7 +1838,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_network_extension_stable.main_v1`
+        `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_network_extension_stable.daemonsession_v1`
       UNION ALL
       SELECT
         submission_timestamp,
@@ -1990,7 +1990,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.bedrock_stable.interaction_v1`
+        `moz-fx-data-shared-prod.bedrock_stable.events_v1`
       UNION ALL
       SELECT
         submission_timestamp,
@@ -2000,7 +2000,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.bedrock_stable.non_interaction_v1`
+        `moz-fx-data-shared-prod.bedrock_stable.interaction_v1`
       UNION ALL
       SELECT
         submission_timestamp,
@@ -2010,7 +2010,7 @@
         client_info.app_display_version AS version,
         ping_info
       FROM
-        `moz-fx-data-shared-prod.bedrock_stable.events_v1`
+        `moz-fx-data-shared-prod.bedrock_stable.non_interaction_v1`
     )
   CROSS JOIN
     UNNEST(events) AS event,
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/mozillavpn_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/mozillavpn_derived/event_monitoring_live_v1/materialized_
```

⚠️ Only part of the diff is displayed.

Link to full diff
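
For anyone cross-checking the SQL rewrites of `glam.histogram_cast_json` and `glam.percentile` in this PR, here is a minimal Python sketch of the same logic: order buckets by numeric key, then either serialise them or walk the cumulative sum. It is only an approximation under stated assumptions. The function names and the `(key, value)` tuple representation are hypothetical and not part of bigquery-etl, and the number formatting is chosen to match the expected strings in the UDF tests rather than BigQuery's general float-to-string rules.

```python
from itertools import accumulate
from typing import List, Tuple

# Hypothetical stand-in for the histogram argument: (key, value) pairs.
Histogram = List[Tuple[str, float]]


def sort_by_numeric_key(histogram: Histogram) -> Histogram:
    """Order buckets numerically, like ORDER BY CAST(key AS FLOAT64)."""
    return sorted(histogram, key=lambda kv: float(kv[0]))


def _format_number(value: float) -> str:
    # Assumption: whole numbers print without a fractional part ("0", "100"),
    # as in the expected strings of the UDF tests.
    rounded = round(value, 4)
    return str(int(rounded)) if rounded == int(rounded) else repr(rounded)


def histogram_cast_json(histogram: Histogram) -> str:
    """Build the JSON string by concatenation so key order is preserved."""
    parts = [
        f'"{key}":{_format_number(value)}'
        for key, value in sort_by_numeric_key(histogram)
    ]
    return "{" + ",".join(parts) + "}"


def percentile(pct: float, histogram: Histogram) -> float:
    """Return the first bucket key whose normalised cumulative sum reaches pct."""
    if not 0 <= pct <= 100:
        raise ValueError("percentile must be a value between 0 and 100")
    ordered = sort_by_numeric_key(histogram)
    if not ordered:
        raise ValueError("histogram is empty")
    cumulative = list(accumulate(value for _, value in ordered))
    total = cumulative[-1]
    if total == 0:
        raise ValueError("histogram has no counts")
    for (key, _), running in zip(ordered, cumulative):
        if running / total >= pct / 100:
            return float(key)
    raise ValueError("no bucket reached the requested percentile")
```

The point of the numeric sort is that string keys sort lexicographically otherwise ("10" before "2"), which would put the cumulative sum in the wrong order for the percentile lookup.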

dataops-ci-bot commented 1 month ago

Integration report for "Reformat"

sql.diff

Click to expand! ```diff diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/dags/bqetl_analytics_aggregations.py /tmp/workspace/generated-sql/dags/bqetl_analytics_aggregations.py --- /tmp/workspace/main-generated-sql/dags/bqetl_analytics_aggregations.py 2024-06-06 18:08:31.000000000 +0000 +++ /tmp/workspace/generated-sql/dags/bqetl_analytics_aggregations.py 2024-06-06 18:23:28.000000000 +0000 @@ -322,30 +322,6 @@ pool="DATA_ENG_EXTERNALTASKSENSOR", ) - wait_for_checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1 = ExternalTaskSensor( - task_id="wait_for_checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1", - external_dag_id="bqetl_glean_usage", - external_task_id="klar_android.checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1", - execution_delta=datetime.timedelta(seconds=8100), - check_existence=True, - mode="reschedule", - allowed_states=ALLOWED_STATES, - failed_states=FAILED_STATES, - pool="DATA_ENG_EXTERNALTASKSENSOR", - ) - - wait_for_klar_android_derived__metrics_clients_last_seen__v1 = ExternalTaskSensor( - task_id="wait_for_klar_android_derived__metrics_clients_last_seen__v1", - external_dag_id="bqetl_glean_usage", - external_task_id="klar_android.klar_android_derived__metrics_clients_last_seen__v1", - execution_delta=datetime.timedelta(seconds=8100), - check_existence=True, - mode="reschedule", - allowed_states=ALLOWED_STATES, - failed_states=FAILED_STATES, - pool="DATA_ENG_EXTERNALTASKSENSOR", - ) - wait_for_checks__fail_org_mozilla_ios_klar_derived__baseline_clients_last_seen__v1 = ExternalTaskSensor( task_id="wait_for_checks__fail_org_mozilla_ios_klar_derived__baseline_clients_last_seen__v1", external_dag_id="bqetl_glean_usage", @@ -562,37 +538,6 @@ checks__fail_focus_ios_derived__active_users_aggregates__v3 ) - checks__fail_klar_android_derived__active_users_aggregates__v3 = bigquery_dq_check( - task_id="checks__fail_klar_android_derived__active_users_aggregates__v3", - 
source_table='active_users_aggregates_v3${{ macros.ds_format(macros.ds_add(ds, -1), "%Y-%m-%d", "%Y%m%d") }}', - dataset_id="klar_android_derived", - project_id="moz-fx-data-shared-prod", - is_dq_check_fail=True, - owner="lvargas@mozilla.com", - email=[ - "gkaberere@mozilla.com", - "lvargas@mozilla.com", - "telemetry-alerts@mozilla.com", - ], - depends_on_past=False, - parameters=["submission_date:DATE:{{macros.ds_add(ds, -1)}}"], - retries=0, - ) - - with TaskGroup( - "checks__fail_klar_android_derived__active_users_aggregates__v3_external", - ) as checks__fail_klar_android_derived__active_users_aggregates__v3_external: - ExternalTaskMarker( - task_id="bqetl_search_dashboard__wait_for_checks__fail_klar_android_derived__active_users_aggregates__v3", - external_dag_id="bqetl_search_dashboard", - external_task_id="wait_for_checks__fail_klar_android_derived__active_users_aggregates__v3", - execution_date="{{ (execution_date - macros.timedelta(days=-1, seconds=85500)).isoformat() }}", - ) - - checks__fail_klar_android_derived__active_users_aggregates__v3_external.set_upstream( - checks__fail_klar_android_derived__active_users_aggregates__v3 - ) - checks__fail_klar_ios_derived__active_users_aggregates__v3 = bigquery_dq_check( task_id="checks__fail_klar_ios_derived__active_users_aggregates__v3", source_table='active_users_aggregates_v3${{ macros.ds_format(macros.ds_add(ds, -1), "%Y-%m-%d", "%Y%m%d") }}', @@ -709,23 +654,6 @@ retries=0, ) - checks__warn_klar_android_derived__active_users_aggregates__v3 = bigquery_dq_check( - task_id="checks__warn_klar_android_derived__active_users_aggregates__v3", - source_table='active_users_aggregates_v3${{ macros.ds_format(macros.ds_add(ds, -1), "%Y-%m-%d", "%Y%m%d") }}', - dataset_id="klar_android_derived", - project_id="moz-fx-data-shared-prod", - is_dq_check_fail=False, - owner="lvargas@mozilla.com", - email=[ - "gkaberere@mozilla.com", - "lvargas@mozilla.com", - "telemetry-alerts@mozilla.com", - ], - depends_on_past=False, - 
parameters=["submission_date:DATE:{{macros.ds_add(ds, -1)}}"], - retries=0, - ) - checks__warn_klar_ios_derived__active_users_aggregates__v3 = bigquery_dq_check( task_id="checks__warn_klar_ios_derived__active_users_aggregates__v3", source_table='active_users_aggregates_v3${{ macros.ds_format(macros.ds_add(ds, -1), "%Y-%m-%d", "%Y%m%d") }}', @@ -837,22 +765,6 @@ parameters=["submission_date:DATE:{{macros.ds_add(ds, -1)}}"], ) - klar_android_active_users_aggregates = bigquery_etl_query( - task_id="klar_android_active_users_aggregates", - destination_table='active_users_aggregates_v3${{ macros.ds_format(macros.ds_add(ds, -1), "%Y-%m-%d", "%Y%m%d") }}', - dataset_id="klar_android_derived", - project_id="moz-fx-data-shared-prod", - owner="lvargas@mozilla.com", - email=[ - "gkaberere@mozilla.com", - "lvargas@mozilla.com", - "telemetry-alerts@mozilla.com", - ], - date_partition_parameter=None, - depends_on_past=False, - parameters=["submission_date:DATE:{{macros.ds_add(ds, -1)}}"], - ) - klar_ios_active_users_aggregates = bigquery_etl_query( task_id="klar_ios_active_users_aggregates", destination_table='active_users_aggregates_v3${{ macros.ds_format(macros.ds_add(ds, -1), "%Y-%m-%d", "%Y%m%d") }}', @@ -945,10 +857,6 @@ focus_ios_active_users_aggregates ) - checks__fail_klar_android_derived__active_users_aggregates__v3.set_upstream( - klar_android_active_users_aggregates - ) - checks__fail_klar_ios_derived__active_users_aggregates__v3.set_upstream( klar_ios_active_users_aggregates ) @@ -997,10 +905,6 @@ focus_ios_active_users_aggregates ) - checks__warn_klar_android_derived__active_users_aggregates__v3.set_upstream( - klar_android_active_users_aggregates - ) - checks__warn_klar_ios_derived__active_users_aggregates__v3.set_upstream( klar_ios_active_users_aggregates ) @@ -1089,14 +993,6 @@ wait_for_focus_ios_derived__metrics_clients_last_seen__v1 ) - klar_android_active_users_aggregates.set_upstream( - 
wait_for_checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1 - ) - - klar_android_active_users_aggregates.set_upstream( - wait_for_klar_android_derived__metrics_clients_last_seen__v1 - ) - klar_ios_active_users_aggregates.set_upstream( wait_for_checks__fail_org_mozilla_ios_klar_derived__baseline_clients_last_seen__v1 ) diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/dags/bqetl_desktop_conv_evnt_categorization.py /tmp/workspace/generated-sql/dags/bqetl_desktop_conv_evnt_categorization.py --- /tmp/workspace/main-generated-sql/dags/bqetl_desktop_conv_evnt_categorization.py 2024-06-06 18:08:31.000000000 +0000 +++ /tmp/workspace/generated-sql/dags/bqetl_desktop_conv_evnt_categorization.py 2024-06-06 18:23:30.000000000 +0000 @@ -76,18 +76,6 @@ pool="DATA_ENG_EXTERNALTASKSENSOR", ) - wait_for_telemetry_derived__clients_first_seen__v1 = ExternalTaskSensor( - task_id="wait_for_telemetry_derived__clients_first_seen__v1", - external_dag_id="bqetl_main_summary", - external_task_id="telemetry_derived__clients_first_seen__v1", - execution_delta=datetime.timedelta(seconds=36000), - check_existence=True, - mode="reschedule", - allowed_states=ALLOWED_STATES, - failed_states=FAILED_STATES, - pool="DATA_ENG_EXTERNALTASKSENSOR", - ) - checks__warn_google_ads_derived__conversion_event_categorization__v1 = bigquery_dq_check( task_id="checks__warn_google_ads_derived__conversion_event_categorization__v1", source_table='conversion_event_categorization_v1${{ macros.ds_format(macros.ds_add(ds, -14), "%Y-%m-%d", "%Y%m%d") }}', @@ -126,7 +114,3 @@ google_ads_derived__conversion_event_categorization__v1.set_upstream( wait_for_checks__fail_telemetry_derived__clients_last_seen__v2 ) - - google_ads_derived__conversion_event_categorization__v1.set_upstream( - wait_for_telemetry_derived__clients_first_seen__v1 - ) diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/dags/bqetl_glean_usage.py 
/tmp/workspace/generated-sql/dags/bqetl_glean_usage.py --- /tmp/workspace/main-generated-sql/dags/bqetl_glean_usage.py 2024-06-06 18:08:31.000000000 +0000 +++ /tmp/workspace/generated-sql/dags/bqetl_glean_usage.py 2024-06-06 18:23:32.000000000 +0000 @@ -1191,13 +1191,6 @@ parent_group=task_group_klar_android, ) as checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1_external: ExternalTaskMarker( - task_id="bqetl_analytics_aggregations__wait_for_checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1", - external_dag_id="bqetl_analytics_aggregations", - external_task_id="wait_for_checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1", - execution_date="{{ (execution_date - macros.timedelta(days=-1, seconds=78300)).isoformat() }}", - ) - - ExternalTaskMarker( task_id="bqetl_mobile_kpi_metrics__wait_for_checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1", external_dag_id="bqetl_mobile_kpi_metrics", external_task_id="wait_for_checks__fail_org_mozilla_klar_derived__baseline_clients_last_seen__v1", @@ -2504,21 +2497,6 @@ task_group=task_group_klar_android, ) - with TaskGroup( - "klar_android_derived__metrics_clients_last_seen__v1_external", - parent_group=task_group_klar_android, - ) as klar_android_derived__metrics_clients_last_seen__v1_external: - ExternalTaskMarker( - task_id="bqetl_analytics_aggregations__wait_for_klar_android_derived__metrics_clients_last_seen__v1", - external_dag_id="bqetl_analytics_aggregations", - external_task_id="wait_for_klar_android_derived__metrics_clients_last_seen__v1", - execution_date="{{ (execution_date - macros.timedelta(days=-1, seconds=78300)).isoformat() }}", - ) - - klar_android_derived__metrics_clients_last_seen__v1_external.set_upstream( - klar_android_derived__metrics_clients_last_seen__v1 - ) - klar_ios_derived__clients_last_seen_joined__v1 = bigquery_etl_query( task_id="klar_ios_derived__clients_last_seen_joined__v1", 
destination_table="clients_last_seen_joined_v1", diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/dags/bqetl_main_summary.py /tmp/workspace/generated-sql/dags/bqetl_main_summary.py --- /tmp/workspace/main-generated-sql/dags/bqetl_main_summary.py 2024-06-06 18:08:31.000000000 +0000 +++ /tmp/workspace/generated-sql/dags/bqetl_main_summary.py 2024-06-06 18:23:26.000000000 +0000 @@ -498,20 +498,6 @@ priority_weight=80, ) - with TaskGroup( - "telemetry_derived__clients_first_seen__v1_external", - ) as telemetry_derived__clients_first_seen__v1_external: - ExternalTaskMarker( - task_id="bqetl_desktop_conv_evnt_categorization__wait_for_telemetry_derived__clients_first_seen__v1", - external_dag_id="bqetl_desktop_conv_evnt_categorization", - external_task_id="wait_for_telemetry_derived__clients_first_seen__v1", - execution_date="{{ (execution_date - macros.timedelta(days=-1, seconds=50400)).isoformat() }}", - ) - - telemetry_derived__clients_first_seen__v1_external.set_upstream( - telemetry_derived__clients_first_seen__v1 - ) - telemetry_derived__clients_last_seen__v1 = bigquery_etl_query( task_id="telemetry_derived__clients_last_seen__v1", destination_table="clients_last_seen_v1", diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/dags/bqetl_search_dashboard.py /tmp/workspace/generated-sql/dags/bqetl_search_dashboard.py --- /tmp/workspace/main-generated-sql/dags/bqetl_search_dashboard.py 2024-06-06 18:08:31.000000000 +0000 +++ /tmp/workspace/generated-sql/dags/bqetl_search_dashboard.py 2024-06-06 18:23:27.000000000 +0000 @@ -121,18 +121,6 @@ pool="DATA_ENG_EXTERNALTASKSENSOR", ) - wait_for_checks__fail_klar_android_derived__active_users_aggregates__v3 = ExternalTaskSensor( - task_id="wait_for_checks__fail_klar_android_derived__active_users_aggregates__v3", - external_dag_id="bqetl_analytics_aggregations", - external_task_id="checks__fail_klar_android_derived__active_users_aggregates__v3", - 
execution_delta=datetime.timedelta(seconds=900), - check_existence=True, - mode="reschedule", - allowed_states=ALLOWED_STATES, - failed_states=FAILED_STATES, - pool="DATA_ENG_EXTERNALTASKSENSOR", - ) - wait_for_checks__fail_klar_ios_derived__active_users_aggregates__v3 = ExternalTaskSensor( task_id="wait_for_checks__fail_klar_ios_derived__active_users_aggregates__v3", external_dag_id="bqetl_analytics_aggregations", @@ -259,10 +247,6 @@ ) search_derived__search_revenue_levers_daily__v1.set_upstream( - wait_for_checks__fail_klar_android_derived__active_users_aggregates__v3 - ) - - search_derived__search_revenue_levers_daily__v1.set_upstream( wait_for_checks__fail_klar_ios_derived__active_users_aggregates__v3 ) Only in /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android: active_users_aggregates Only in /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived: active_users_aggregates_v3 diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql --- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql 2024-06-06 18:04:55.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql 2024-06-06 18:05:57.000000000 +0000 @@ -1,17 +1,27 @@ --- udf_js_flatten +/* +Casts an ARRAY> histogram to a JSON string. +This implementation uses String concatenation instead of +BigQuery native JSON (TO_JSON / JSON_OBJECT) functions to +preserve order. +https://cloud.google.com/bigquery/docs/reference/standard-sql/json_functions#json_object +Order is important for GLAM histograms so other UDFs that +operate on them, such as glam.percentile, can work correctly. 
+*/ CREATE OR REPLACE FUNCTION glam.histogram_cast_json( histogram ARRAY> ) -RETURNS STRING DETERMINISTIC -LANGUAGE js -AS - ''' - let obj = {}; - histogram.map(r => { - obj[r.key] = parseFloat(r.value.toFixed(4)); - }); - return JSON.stringify(obj); -'''; +RETURNS STRING AS ( + ( + SELECT + CONCAT( + '{', + STRING_AGG(CONCAT('"', key, '":', ROUND(value, 4)) ORDER BY CAST(key AS FLOAT64)), + '}' + ) + FROM + UNNEST(histogram) + ) +); SELECT assert.equals( @@ -19,4 +29,16 @@ glam.histogram_cast_json( ARRAY>[("0", 0.111111), ("1", 2.0 / 3), ("2", 0)] ) + ), + assert.equals( + '{"0":0.1111,"1":0.6667,"2":0,"10":100}', + glam.histogram_cast_json( + ARRAY>[ + ("0", 0.111111), + ("1", 2.0 / 3), + ("10", 100), + ("2", 0) + ] ) + ), + assert.equals('{}', glam.histogram_cast_json(ARRAY>[])), diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql --- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql 2024-06-06 18:04:55.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql 2024-06-06 18:05:57.000000000 +0000 @@ -4,34 +4,29 @@ buckets_per_magnitude INT64, range_max INT64 ) -RETURNS ARRAY DETERMINISTIC -LANGUAGE js -AS - ''' - function sample_to_bucket_index(sample) { - // Get the index of the sample - // https://github.com/mozilla/glean/blob/main/glean-core/src/histogram/functional.rs - let exponent = Math.pow(log_base, 1.0/buckets_per_magnitude); - return Math.ceil(Math.log(sample + 1) / Math.log(exponent)); - } - - let buckets = new Set([0]); - for (let index = 0; index < sample_to_bucket_index(range_max); index++) { - - // Avoid re-using the exponent due to floating point issues when carrying - // the `pow` operation e.g. `let exponent = ...; Math.pow(exponent, index)`. 
- let bucket = Math.floor(Math.pow(log_base, index/buckets_per_magnitude)); - - // NOTE: the sample_to_bucket_index implementation overshoots the true index, - // so we break out early if we hit the max bucket range. - if (bucket > range_max) { - break; - } - buckets.add(bucket); - } - - return [...buckets] -'''; +RETURNS ARRAY AS ( + ( + WITH bucket_indexes AS ( + -- Generate all bucket indexes + -- https://github.com/mozilla/glean/blob/main/glean-core/src/histogram/functional.rs + SELECT + GENERATE_ARRAY(0, CEIL(LOG(range_max + 1, log_base) * buckets_per_magnitude)) AS indexes + ), + buckets AS ( + SELECT + FLOOR(POW(log_base, (idx) / buckets_per_magnitude)) AS bucket + FROM + bucket_indexes, + UNNEST(indexes) AS idx + WHERE + FLOOR(POW(log_base, (idx) / buckets_per_magnitude)) <= range_max + ) + SELECT + ARRAY_CONCAT([0.0], ARRAY_AGG(DISTINCT(bucket) ORDER BY bucket)) + FROM + buckets + ) +); SELECT -- First 50 keys of a timing distribution diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql --- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql 2024-06-06 18:04:55.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql 2024-06-06 18:05:57.000000000 +0000 @@ -4,17 +4,17 @@ max FLOAT64, nBuckets FLOAT64 ) -RETURNS ARRAY DETERMINISTIC -LANGUAGE js -AS - ''' - let result = [0]; - for (let i = 1; i < Math.min(nBuckets, max, 10000); i++) { - let linearRange = (min * (nBuckets - 1 - i) + max * (i - 1)) / (nBuckets - 2); - result.push(Math.round(linearRange)); - } - return result; -'''; +RETURNS ARRAY AS ( + ARRAY_CONCAT( + [0.0], + ARRAY( + SELECT + ROUND((min * (nBuckets - 1 - i) + max * (i - 1)) / (nBuckets - 2)) + FROM + UNNEST(GENERATE_ARRAY(1, LEAST(nBuckets - 1, max, 10000))) AS i + ) + ) +); SELECT -- Buckets 
of CONTENT_FRAME_TIME_VSYNC diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql --- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql 2024-06-06 18:04:55.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql 2024-06-06 18:05:57.000000000 +0000 @@ -4,17 +4,14 @@ max_bucket FLOAT64, num_buckets INT64 ) -RETURNS ARRAY DETERMINISTIC -LANGUAGE js -AS - ''' - let bucket_size = (max_bucket - min_bucket) / num_buckets; - let buckets = new Set(); - for (let bucket = min_bucket; bucket < max_bucket; bucket += bucket_size) { - buckets.add(Math.pow(2, bucket).toFixed(2)); - } - return Array.from(buckets); -'''; +RETURNS ARRAY AS ( + ARRAY( + SELECT + ROUND(POW(2, (max_bucket - min_bucket) / num_buckets * val), 2) + FROM + UNNEST(GENERATE_ARRAY(0, num_buckets - 1)) AS val + ) +); SELECT assert.array_equals([1, 2, 4, 8], glam.histogram_generate_scalar_buckets(0, LOG(16, 2), 4)), diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/percentile/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/percentile/udf.sql --- /tmp/workspace/main-generated-sql/sql/mozfun/glam/percentile/udf.sql 2024-06-06 18:04:55.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/mozfun/glam/percentile/udf.sql 2024-06-06 18:05:57.000000000 +0000 @@ -1,37 +1,41 @@ -- udf_js.glean_percentile CREATE OR REPLACE FUNCTION glam.percentile( - percentile FLOAT64, + pct FLOAT64, histogram ARRAY>, type STRING ) -RETURNS FLOAT64 DETERMINISTIC -LANGUAGE js -AS - ''' - if (percentile < 0 || percentile > 100) { - throw "percentile must be a value between 0 and 100"; - } - - let values = histogram.map(bucket => bucket.value); - let total = values.reduce((a, b) => a + b); - let normalized = values.map(value => value / total); - - 
// Find the index into the cumulative distribution function that corresponds
-  // to the percentile. This undershoots the true value of the percentile.
-  let acc = 0;
-  let index = null;
-  for (let i = 0; i < normalized.length; i++) {
-    acc += normalized[i];
-    index = i;
-    if (acc >= percentile / 100) {
-      break;
-    }
-  }
-
-  // NOTE: we do not perform geometric or linear interpolation, but this would
-  // be the place to implement it.
-  return histogram[index].key;
-''';
+RETURNS FLOAT64 AS (
+  (
+    WITH check AS (
+      SELECT
+        IF(
+          pct >= 0
+          AND pct <= 100,
+          TRUE,
+          ERROR('percentile must be a value between 0 and 100')
+        ) pct_ok
+    ),
+    keyed_cum_sum AS (
+      SELECT
+        key,
+        SUM(value) OVER (ORDER BY CAST(key AS FLOAT64)) / SUM(value) OVER () AS cum_sum
+      FROM
+        UNNEST(histogram)
+    )
+    SELECT
+      CAST(key AS FLOAT64)
+    FROM
+      keyed_cum_sum,
+      check
+    WHERE
+      check.pct_ok
+      AND cum_sum >= pct / 100
+    ORDER BY
+      cum_sum
+    LIMIT
+      1
+  )
+);

 SELECT
   assert.equals(
@@ -41,6 +45,30 @@
       ARRAY>[("0", 1), ("2", 2), ("3", 1)],
       "timing_distribution"
     )
+  ),
+  assert.equals(
+    3,
+    glam.percentile(
+      100.0,
+      ARRAY>[("0", 1), ("2", 2), ("3", 1)],
+      "timing_distribution"
+    )
+  ),
+  assert.equals(
+    0,
+    glam.percentile(
+      0.0,
+      ARRAY>[("0", 1), ("2", 2), ("3", 1)],
+      "timing_distribution"
+    )
+  ),
+  assert.equals(
+    2,
+    glam.percentile(
+      2.0,
+      ARRAY>[("0", 1), ("2", 2), ("10", 10), ("11", 100)],
+      "timing_distribution"
+    )
   );

 #xfail
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/event_monitoring_live_v1/materialized_view.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 18:05:31.000000000 +0000
+++
/tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 18:07:53.000000000 +0000 @@ -60,7 +60,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.bedrock_live.non_interaction_v1` + `moz-fx-data-shared-prod.bedrock_live.events_v1` UNION ALL SELECT submission_timestamp, @@ -70,7 +70,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.bedrock_live.events_v1` + `moz-fx-data-shared-prod.bedrock_live.non_interaction_v1` ) CROSS JOIN UNNEST(events) AS event, diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates/schema.yaml 2024-06-06 18:04:54.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates/schema.yaml 2024-06-06 18:13:57.000000000 +0000 @@ -1,49 +1,49 @@ fields: -- mode: NULLABLE - name: submission_date +- name: submission_date type: DATE -- mode: NULLABLE - name: source + mode: NULLABLE +- name: source type: STRING -- mode: NULLABLE - name: event_type + mode: NULLABLE +- name: event_type type: STRING -- mode: NULLABLE - name: form_factor + mode: NULLABLE +- name: form_factor type: STRING -- mode: NULLABLE - name: country + mode: NULLABLE +- name: country type: STRING -- mode: NULLABLE - name: subdivision1 + mode: NULLABLE +- name: subdivision1 type: STRING -- mode: NULLABLE - name: advertiser + mode: NULLABLE +- name: advertiser type: STRING -- mode: NULLABLE - name: release_channel + mode: NULLABLE +- name: release_channel type: STRING -- mode: NULLABLE - name: position + mode: NULLABLE +- name: position type: INTEGER -- mode: NULLABLE - name: provider + 
mode: NULLABLE +- name: provider type: STRING -- mode: NULLABLE - name: match_type + mode: NULLABLE +- name: match_type type: STRING -- mode: NULLABLE - name: normalized_os + mode: NULLABLE +- name: normalized_os type: STRING -- mode: NULLABLE - name: suggest_data_sharing_enabled + mode: NULLABLE +- name: suggest_data_sharing_enabled type: BOOLEAN -- mode: NULLABLE - name: event_count + mode: NULLABLE +- name: event_count type: INTEGER -- mode: NULLABLE - name: user_count + mode: NULLABLE +- name: user_count type: INTEGER -- mode: NULLABLE - name: query_type + mode: NULLABLE +- name: query_type type: STRING + mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates_suggest/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates_suggest/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates_suggest/schema.yaml 2024-06-06 18:04:54.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates_suggest/schema.yaml 2024-06-06 18:13:57.000000000 +0000 @@ -1,40 +1,40 @@ fields: -- mode: NULLABLE - name: submission_date +- name: submission_date type: DATE -- mode: NULLABLE - name: form_factor + mode: NULLABLE +- name: form_factor type: STRING -- mode: NULLABLE - name: country + mode: NULLABLE +- name: country type: STRING -- mode: NULLABLE - name: advertiser + mode: NULLABLE +- name: advertiser type: STRING -- mode: NULLABLE - name: normalized_os + mode: NULLABLE +- name: normalized_os type: STRING -- mode: NULLABLE - name: release_channel + mode: NULLABLE +- name: release_channel type: STRING -- mode: NULLABLE - name: position + mode: NULLABLE +- name: position type: INTEGER -- mode: NULLABLE - name: provider + mode: NULLABLE +- name: provider type: STRING -- mode: NULLABLE - name: match_type + mode: NULLABLE +- name: 
match_type type: STRING -- mode: NULLABLE - name: suggest_data_sharing_enabled + mode: NULLABLE +- name: suggest_data_sharing_enabled type: BOOLEAN -- mode: NULLABLE - name: impression_count + mode: NULLABLE +- name: impression_count type: INTEGER -- mode: NULLABLE - name: click_count + mode: NULLABLE +- name: click_count type: INTEGER -- mode: NULLABLE - name: query_type + mode: NULLABLE +- name: query_type type: STRING + mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_clients/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_clients/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_clients/schema.yaml 2024-06-06 18:04:54.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_clients/schema.yaml 2024-06-06 18:17:12.000000000 +0000 @@ -26,6 +26,9 @@ - name: adjust_network type: STRING mode: NULLABLE +- name: install_source + type: STRING + mode: NULLABLE - name: retained_week_2 type: BOOLEAN mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_week_4/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_week_4/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_week_4/schema.yaml 2024-06-06 18:04:54.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_week_4/schema.yaml 2024-06-06 18:17:12.000000000 +0000 @@ -48,6 +48,10 @@ description: 'The type of source of a client installation. 
' +- name: install_source + type: STRING + mode: NULLABLE + description: null - name: new_profiles type: INTEGER mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_tasks_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_tasks_derived/event_monitoring_live_v1/materialized_view.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_tasks_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 18:05:31.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_background_tasks_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 18:07:54.000000000 +0000 @@ -50,7 +50,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_background_tasks_live.events_v1` + `moz-fx-data-shared-prod.firefox_desktop_background_tasks_live.background_tasks_v1` UNION ALL SELECT submission_timestamp, @@ -60,7 +60,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_background_tasks_live.background_tasks_v1` + `moz-fx-data-shared-prod.firefox_desktop_background_tasks_live.events_v1` ) CROSS JOIN UNNEST(events) AS event, diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/event_monitoring_live_v1/materialized_view.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 18:05:31.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/event_monitoring_live_v1/materialized_view.sql 
2024-06-06 18:07:54.000000000 +0000 @@ -50,7 +50,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_live.events_v1` + `moz-fx-data-shared-prod.firefox_desktop_live.newtab_v1` UNION ALL SELECT submission_timestamp, @@ -60,7 +60,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_live.urlbar_potential_exposure_v1` + `moz-fx-data-shared-prod.firefox_desktop_live.events_v1` UNION ALL SELECT submission_timestamp, @@ -70,7 +70,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_live.newtab_v1` + `moz-fx-data-shared-prod.firefox_desktop_live.prototype_no_code_events_v1` UNION ALL SELECT submission_timestamp, @@ -80,7 +80,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_live.prototype_no_code_events_v1` + `moz-fx-data-shared-prod.firefox_desktop_live.urlbar_potential_exposure_v1` ) CROSS JOIN UNNEST(events) AS event, diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/google_ads/conversion_event_categorization/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/google_ads/conversion_event_categorization/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/google_ads/conversion_event_categorization/schema.yaml 2024-06-06 18:05:31.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/google_ads/conversion_event_categorization/schema.yaml 2024-06-06 18:14:01.000000000 +0000 @@ -34,35 +34,24 @@ - name: first_main_ping_date type: DATE mode: NULLABLE - description: First Main Ping Date - name: country type: STRING mode: NULLABLE - description: Country - name: dou type: INTEGER mode: NULLABLE - description: DOU - name: active_hours_sum type: FLOAT mode: NULLABLE - description: Active Hours Sum - name: search_with_ads_count_all type: INTEGER mode: 
NULLABLE - description: Search With Ads Count All - name: event_1 type: BOOLEAN mode: NULLABLE - description: Event 1 Indicator - 5 or more days of use and 1 or more search with - ads (strictest event) - name: event_2 type: BOOLEAN mode: NULLABLE - description: Event 2 Indicator - 3 or more days of use and 1 or more search with - ads (medium event) - name: event_3 type: BOOLEAN mode: NULLABLE - description: Event 3 Indicator - 3 or more days of use and 0.4 or more active hours - (most lenient event) diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/conversion_event_categorization_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/conversion_event_categorization_v1/query.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/conversion_event_categorization_v1/query.sql 2024-06-06 18:04:55.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/conversion_event_categorization_v1/query.sql 2024-06-06 18:05:57.000000000 +0000 @@ -3,35 +3,27 @@ --Note: Max cohort date cannot be more than 7 days ago (to ensure we always have at least 7 days of data) WITH clients_first_seen_14_days_ago AS ( SELECT - cfs.client_id, - cfs.first_seen_date, - m.first_seen_date AS first_main_ping_date, - cfs.country, - cfs.attribution_campaign, - cfs.attribution_content, - cfs.attribution_dltoken, - cfs.attribution_medium, - cfs.attribution_source + client_id, + first_seen_date, + country, + attribution_campaign, + attribution_content, + attribution_dltoken, + attribution_medium, + attribution_source FROM - `moz-fx-data-shared-prod.telemetry.clients_first_seen` cfs --contains all new clients, including those that never sent a main ping - LEFT JOIN - `moz-fx-data-shared-prod.telemetry_derived.clients_first_seen_v1` m -- the "old" CFS table, contains the date of the client's *first main ping* - ON cfs.client_id = 
m.client_id - AND m.first_seen_date - -- join so that we only get "first main ping" dates from clients that sent their first main ping within -1 and +6 days from their first_seen_date. - -- we will miss ~5% of clients that send their first main ping later, this is a trade-off we make to have a two-week reporting cadence (one week to send their first main ping, then we report on the outcomes *one week after that* - BETWEEN DATE_SUB(cfs.first_seen_date, INTERVAL 1 DAY) - AND DATE_ADD(cfs.first_seen_date, INTERVAL 6 DAY) + `moz-fx-data-shared-prod.telemetry.clients_first_seen` --contains all new clients, including those that never sent a main ping WHERE - cfs.first_seen_date = @report_date --this is 14 days before {{ds}} - AND cfs.first_seen_date >= '2023-11-01' + first_seen_date = @report_date --this is 14 days before {{ds}} + AND first_seen_date + BETWEEN '2023-11-01' + AND DATE_SUB(CURRENT_DATE, INTERVAL 8 DAY) ), --Step 2: Get only the columns we need from clients last seen, for only the small window of time we need clients_last_seen_raw AS ( SELECT cls.client_id, cls.first_seen_date, - clients.first_main_ping_date, cls.country, cls.submission_date, cls.days_since_seen, @@ -45,11 +37,14 @@ clients_first_seen_14_days_ago clients ON cls.client_id = clients.client_id WHERE - cls.submission_date - -- join the clients_last_seen so that we get the first 7 days of each client's main ping records (for the clients that sent > 0 main pings in their first week) - BETWEEN clients.first_main_ping_date - AND DATE_ADD(clients.first_main_ping_date, INTERVAL 6 DAY) - AND cls.submission_date >= DATE_SUB(@report_date, INTERVAL 1 DAY) + cls.submission_date >= '2023-11-01' --first cohort date + AND cls.submission_date + BETWEEN cls.first_seen_date + AND DATE_ADD(cls.first_seen_date, INTERVAL 6 DAY) --get first 7 days from their first main ping + --to process less data, we only check for pings between @submission date - 15 days and submission date + 15 days for each date this runs + 
AND cls.submission_date + BETWEEN DATE_SUB(@report_date, INTERVAL 1 DAY) --15 days before DS + AND DATE_ADD(@report_date, INTERVAL 29 DAY) --15 days after DS ), --STEP 2: For every client, get the first 7 days worth of main pings sent after their first main ping client_activity_first_7_days AS ( @@ -60,13 +55,13 @@ ) AS first_seen_date, --date we got first main ping (potentially different than above first seen date) ANY_VALUE( CASE - WHEN first_main_ping_date = submission_date + WHEN first_seen_date = submission_date THEN country END ) AS country, --any country from their first day in clients_last_seen ANY_VALUE( CASE - WHEN first_main_ping_date = DATE_ADD(first_seen_date, INTERVAL 6 DAY) + WHEN submission_date = DATE_ADD(first_seen_date, INTERVAL 6 DAY) THEN BIT_COUNT(days_visited_1_uri_bits & days_interacted_bits) END ) AS dou, --total # of days of activity during their first 7 days of main pings @@ -100,7 +95,7 @@ cfs.attribution_dltoken, cfs.attribution_medium, cfs.attribution_source, - cfs.first_main_ping_date, + IF(cls.first_seen_date IS NOT NULL, TRUE, FALSE) AS sent_main_ping_in_first_7_days, COALESCE( cls.country, cfs.country @@ -123,7 +118,7 @@ attribution_medium, attribution_source, @submission_date AS report_date, - first_main_ping_date, + sent_main_ping_in_first_7_days, country, dou, active_hours_sum, diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/conversion_event_categorization_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/conversion_event_categorization_v1/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/conversion_event_categorization_v1/schema.yaml 2024-06-06 18:04:55.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/google_ads_derived/conversion_event_categorization_v1/schema.yaml 2024-06-06 18:05:57.000000000 +0000 @@ -32,9 +32,9 @@ type: DATE description: 
Report Date - mode: NULLABLE - name: first_main_ping_date - type: DATE - description: First Main Ping Date + name: sent_main_ping_in_first_7_days + type: BOOLEAN + description: Sent Main Ping In First 7 Days After First Seen Date Indicator - mode: NULLABLE name: country type: STRING diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/internet_outages/global_outages_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/internet_outages/global_outages_v1/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/internet_outages/global_outages_v1/schema.yaml 2024-06-06 18:04:54.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/internet_outages/global_outages_v1/schema.yaml 2024-06-06 18:14:19.000000000 +0000 @@ -1,49 +1,49 @@ fields: -- mode: NULLABLE - name: country +- name: country type: STRING -- mode: NULLABLE - name: city + mode: NULLABLE +- name: city type: STRING -- mode: NULLABLE - name: datetime + mode: NULLABLE +- name: datetime type: TIMESTAMP -- mode: NULLABLE - name: proportion_undefined + mode: NULLABLE +- name: proportion_undefined type: FLOAT -- mode: NULLABLE - name: proportion_timeout + mode: NULLABLE +- name: proportion_timeout type: FLOAT -- mode: NULLABLE - name: proportion_abort + mode: NULLABLE +- name: proportion_abort type: FLOAT -- mode: NULLABLE - name: proportion_unreachable + mode: NULLABLE +- name: proportion_unreachable type: FLOAT -- mode: NULLABLE - name: proportion_terminated + mode: NULLABLE +- name: proportion_terminated type: FLOAT -- mode: NULLABLE - name: proportion_channel_open + mode: NULLABLE +- name: proportion_channel_open type: FLOAT -- mode: NULLABLE - name: avg_dns_success_time + mode: NULLABLE +- name: avg_dns_success_time type: FLOAT -- mode: NULLABLE - name: missing_dns_success + mode: NULLABLE +- name: missing_dns_success type: FLOAT -- mode: NULLABLE - name: avg_dns_failure_time + mode: NULLABLE +- name: 
avg_dns_failure_time type: FLOAT -- mode: NULLABLE - name: missing_dns_failure + mode: NULLABLE +- name: missing_dns_failure type: FLOAT -- mode: NULLABLE - name: count_dns_failure + mode: NULLABLE +- name: count_dns_failure type: FLOAT -- mode: NULLABLE - name: ssl_error_prop + mode: NULLABLE +- name: ssl_error_prop type: FLOAT -- mode: NULLABLE - name: avg_tls_handshake_time + mode: NULLABLE +- name: avg_tls_handshake_time type: FLOAT + mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android/active_users_aggregates/metadata.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android/active_users_aggregates/metadata.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android/active_users_aggregates/metadata.yaml 2024-06-06 18:05:32.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android/active_users_aggregates/metadata.yaml 1970-01-01 00:00:00.000000000 +0000 @@ -1,13 +0,0 @@ -friendly_name: Active Users Aggregates -description: |- - Please provide a description for the query -owners: [] -labels: {} -bigquery: null -workgroup_access: -- role: roles/bigquery.dataViewer - members: - - workgroup:mozilla-confidential -references: - view.sql: - - moz-fx-data-shared-prod.klar_android_derived.active_users_aggregates_v3 diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android/active_users_aggregates/view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android/active_users_aggregates/view.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android/active_users_aggregates/view.sql 2024-06-06 18:05:32.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android/active_users_aggregates/view.sql 1970-01-01 00:00:00.000000000 +0000 @@ -1,15 +0,0 @@ ---- User-facing view. 
Generated via sql_generators.active_users. -CREATE OR REPLACE VIEW - `moz-fx-data-shared-prod.klar_android.active_users_aggregates` -AS -SELECT - * EXCEPT (app_version, app_name), - app_name, - app_version, - `mozfun.norm.browser_version_info`(app_version).major_version AS app_version_major, - `mozfun.norm.browser_version_info`(app_version).minor_version AS app_version_minor, - `mozfun.norm.browser_version_info`(app_version).patch_revision AS app_version_patch_revision, - `mozfun.norm.browser_version_info`(app_version).is_major_release AS app_version_is_major_release, - `mozfun.norm.os`(os) AS os_grouped -FROM - `moz-fx-data-shared-prod.klar_android_derived.active_users_aggregates_v3` diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/checks.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/checks.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/checks.sql 2024-06-06 18:05:32.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/checks.sql 1970-01-01 00:00:00.000000000 +0000 @@ -1,69 +0,0 @@ - - -#warn -WITH daily_users_sum AS ( - SELECT - SUM(daily_users), - FROM - `{{ project_id }}.{{ dataset_id }}.{{ table_name }}` - WHERE - submission_date = @submission_date - ), -distinct_client_count_base AS ( - SELECT - COUNT(DISTINCT client_info.client_id) AS distinct_client_count, - FROM - `moz-fx-data-shared-prod.org_mozilla_klar_live.baseline_v1` - WHERE - DATE(submission_timestamp) = @submission_date - ), -distinct_client_count AS ( - SELECT - SUM(distinct_client_count) - FROM - distinct_client_count_base -) -SELECT - IF( - ABS((SELECT * FROM daily_users_sum) - (SELECT * FROM distinct_client_count)) > 10, - ERROR( - CONCAT( - "Daily users mismatch between the klar_android live 
across all channels (`moz-fx-data-shared-prod.org_mozilla_klar_live.baseline_v1`,) and active_users_aggregates (`{{ dataset_id }}.{{ table_name }}`) tables is greater than 10.", - " Live table count: ", - (SELECT * FROM distinct_client_count), - " | active_users_aggregates (daily_users): ", - (SELECT * FROM daily_users_sum), - " | Delta detected: ", - ABS((SELECT * FROM daily_users_sum) - (SELECT * FROM distinct_client_count)) - ) - ), - NULL - ); - -#fail -WITH dau_current AS ( - SELECT - SUM(dau) AS dau - FROM - `{{ project_id }}.{{ dataset_id }}.{{ table_name }}` - WHERE - submission_date = @submission_date -), -dau_previous AS ( - SELECT - SUM(dau) AS dau - FROM - `{{ project_id }}.{{ dataset_id }}.{{ table_name }}` - WHERE - submission_date = DATE_SUB(@submission_date, INTERVAL 1 DAY) -) -SELECT - IF( - ABS((SELECT SUM(dau) FROM dau_current) / (SELECT SUM(dau) FROM dau_previous)) > 1.5, - ERROR( - "Current date's DAU is 50% higher than in previous date. See source table (`{{ project_id }}.{{ dataset_id }}.{{ table_name }}`)!" 
- ), - NULL - ); - - diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/metadata.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/metadata.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/metadata.yaml 2024-06-06 18:05:32.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/metadata.yaml 1970-01-01 00:00:00.000000000 +0000 @@ -1,54 +0,0 @@ -friendly_name: Klar Android Active Users Aggregates -description: |- - This table contains daily/weekly/monthly active users, - new profiles, searches and ad_clicks for Klar Android, - aggregated by submission_date, attribution, channel, - country, city, device model, distribution_id, os details - and activity segment. - - - dau is counting the users who reported a ping on the date and - are qualified as active users. - - daily_users counts all the users who reported a ping on the date. - Only dau is exposed in the view telemetry.active_users_aggregates. - - The table is labeled as "change_controlled", which implies - that changes require the approval of at least one owner. 
- - Proposal: - https://docs.google.com/document/d/1qvWO49Lr_Z_WErh3I3058A3B1YuiuURx19K3aTdmejM/edit?usp=sharing -owners: -- lvargas@mozilla.com -- mozilla/kpi_table_reviewers -labels: - incremental: true - change_controlled: true - dag: bqetl_analytics_aggregations - owner1: lvargas -scheduling: - dag_name: bqetl_analytics_aggregations - task_name: klar_android_active_users_aggregates - date_partition_offset: -1 -bigquery: - time_partitioning: - type: day - field: submission_date - require_partition_filter: true - expiration_days: null - range_partitioning: null - clustering: - fields: - - country - - app_name - - attribution_medium - - channel -workgroup_access: -- role: roles/bigquery.dataViewer - members: - - workgroup:mozilla-confidential -references: - checks.sql: - - .. - - moz-fx-data-shared-prod.org_mozilla_klar_live.baseline_v1 - query.sql: - - moz-fx-data-shared-prod.klar_android.active_users - - moz-fx-data-shared-prod.klar_android.metrics_clients_last_seen diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/query.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/query.sql 2024-06-06 18:05:32.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/query.sql 1970-01-01 00:00:00.000000000 +0000 @@ -1,170 +0,0 @@ ---- Query generated via sql_generators.active_users. 
-WITH baseline AS ( - SELECT - submission_date, - normalized_channel, - client_id, - days_active_bits, - days_created_profile_bits, - normalized_os, - normalized_os_version, - locale, - city, - country, - app_display_version, - device_model, - first_seen_date, - submission_date = first_seen_date AS is_new_profile, - CAST(NULL AS string) AS distribution_id, - isp, - app_name, - activity_segment AS segment, - is_daily_user, - is_weekly_user, - is_monthly_user, - is_dau, - is_wau, - is_mau - FROM - `moz-fx-data-shared-prod.klar_android.active_users` - WHERE - submission_date = @submission_date -), -metrics AS ( - -- Metrics ping may arrive in the same or next day as the baseline ping. - SELECT - client_id, - ARRAY_AGG(normalized_channel IGNORE NULLS ORDER BY submission_date ASC)[ - SAFE_OFFSET(0) - ] AS normalized_channel, - CAST(NULL AS INTEGER) AS uri_count, - CAST(NULL AS INTEGER) AS is_default_browser, - FROM - `moz-fx-data-shared-prod.klar_android.metrics_clients_last_seen` - WHERE - DATE(submission_date) - BETWEEN @submission_date - AND DATE_ADD(@submission_date, INTERVAL 1 DAY) - GROUP BY - client_id -), -unioned AS ( - SELECT - baseline.client_id, - baseline.segment, - baseline.app_name, - baseline.app_display_version AS app_version, - baseline.normalized_channel, - IFNULL(baseline.country, '??') country, - baseline.city, - baseline.days_created_profile_bits, - baseline.device_model, - baseline.isp, - baseline.is_new_profile, - baseline.locale, - baseline.first_seen_date, - baseline.normalized_os, - baseline.normalized_os_version, - COALESCE( - SAFE_CAST(NULLIF(SPLIT(baseline.normalized_os_version, ".")[SAFE_OFFSET(0)], "") AS INTEGER), - 0 - ) AS os_version_major, - COALESCE( - SAFE_CAST(NULLIF(SPLIT(baseline.normalized_os_version, ".")[SAFE_OFFSET(1)], "") AS INTEGER), - 0 - ) AS os_version_minor, - COALESCE( - SAFE_CAST(NULLIF(SPLIT(baseline.normalized_os_version, ".")[SAFE_OFFSET(2)], "") AS INTEGER), - 0 - ) AS os_version_patch, - 
baseline.submission_date, - metrics.uri_count, - metrics.is_default_browser, - baseline.distribution_id, - CAST(NULL AS string) AS attribution_content, - CAST(NULL AS string) AS attribution_source, - CAST(NULL AS string) AS attribution_medium, - CAST(NULL AS string) AS attribution_campaign, - CAST(NULL AS string) AS attribution_experiment, - CAST(NULL AS string) AS attribution_variation, - CAST(NULL AS FLOAT64) AS active_hours_sum, - is_daily_user, - is_weekly_user, - is_monthly_user, - is_dau, - is_wau, - is_mau - FROM - baseline - LEFT JOIN - metrics - ON baseline.client_id = metrics.client_id - AND baseline.normalized_channel IS NOT DISTINCT FROM metrics.normalized_channel -), -unioned_with_attribution AS ( - SELECT - unioned.*, - CAST(NULL AS STRING) AS install_source, - CAST(NULL AS STRING) AS adjust_network - FROM - unioned -), -todays_metrics AS ( - SELECT - segment, - app_version, - attribution_medium, - attribution_source, - attribution_medium IS NOT NULL - OR attribution_source IS NOT NULL AS attributed, - city, - country, - distribution_id, - EXTRACT(YEAR FROM first_seen_date) AS first_seen_year, - is_default_browser, - COALESCE(REGEXP_EXTRACT(locale, r'^(.+?)-'), locale, NULL) AS locale, - app_name AS app_name, - normalized_channel AS channel, - normalized_os AS os, - normalized_os_version AS os_version, - os_version_major, - os_version_minor, - submission_date, - client_id, - uri_count, - active_hours_sum, - adjust_network, - install_source, - is_daily_user, - is_weekly_user, - is_monthly_user, - is_dau, - is_wau, - is_mau - FROM - unioned_with_attribution -) -SELECT - todays_metrics.* EXCEPT ( - client_id, - is_daily_user, - is_weekly_user, - is_monthly_user, - is_dau, - is_wau, - is_mau, - uri_count, - active_hours_sum - ), - COUNTIF(is_daily_user) AS daily_users, - COUNTIF(is_weekly_user) AS weekly_users, - COUNTIF(is_monthly_user) AS monthly_users, - COUNTIF(is_dau) AS dau, - COUNTIF(is_wau) AS wau, - COUNTIF(is_mau) AS mau, - SUM(uri_count) AS 
uri_count, - SUM(active_hours_sum) AS active_hours, -FROM - todays_metrics -GROUP BY - ALL diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/schema.yaml 2024-06-06 18:05:32.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/klar_android_derived/active_users_aggregates_v3/schema.yaml 1970-01-01 00:00:00.000000000 +0000 @@ -1,85 +0,0 @@ -fields: -- name: segment - type: STRING - mode: NULLABLE -- name: app_version - type: STRING - mode: NULLABLE -- name: attribution_medium - type: STRING - mode: NULLABLE -- name: attribution_source - type: STRING - mode: NULLABLE -- name: attributed - type: BOOLEAN - mode: NULLABLE -- name: city - type: STRING - mode: NULLABLE -- name: country - type: STRING - mode: NULLABLE -- name: distribution_id - type: STRING - mode: NULLABLE -- name: first_seen_year - type: INTEGER - mode: NULLABLE -- name: is_default_browser - type: BOOLEAN - mode: NULLABLE -- name: locale - type: STRING - mode: NULLABLE -- name: app_name - type: STRING - mode: NULLABLE -- name: channel - type: STRING - mode: NULLABLE -- name: os - type: STRING - mode: NULLABLE -- name: os_version - type: STRING - mode: NULLABLE -- name: os_version_major - type: INTEGER - mode: NULLABLE -- name: os_version_minor - type: INTEGER - mode: NULLABLE -- name: submission_date - type: DATE - mode: NULLABLE -- name: adjust_network - type: STRING - mode: NULLABLE -- name: install_source - type: STRING - mode: NULLABLE -- name: daily_users - type: INTEGER - mode: NULLABLE -- name: weekly_users - type: INTEGER - mode: NULLABLE -- name: monthly_users - type: INTEGER - mode: NULLABLE -- name: dau - type: INTEGER - mode: NULLABLE 
-- name: wau - type: INTEGER - mode: NULLABLE -- name: mau - type: INTEGER - mode: NULLABLE -- name: uri_count - type: INTEGER - mode: NULLABLE -- name: active_hours - type: FLOAT64 - mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/monitoring_derived/event_monitoring_aggregates_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/monitoring_derived/event_monitoring_aggregates_v1/query.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/monitoring_derived/event_monitoring_aggregates_v1/query.sql 2024-06-06 18:05:32.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/monitoring_derived/event_monitoring_aggregates_v1/query.sql 2024-06-06 18:09:32.000000000 +0000 @@ -45,7 +45,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_stable.events_v1` + `moz-fx-data-shared-prod.firefox_desktop_stable.newtab_v1` UNION ALL SELECT submission_timestamp, @@ -55,7 +55,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_stable.urlbar_potential_exposure_v1` + `moz-fx-data-shared-prod.firefox_desktop_stable.events_v1` UNION ALL SELECT submission_timestamp, @@ -65,7 +65,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_stable.newtab_v1` + `moz-fx-data-shared-prod.firefox_desktop_stable.prototype_no_code_events_v1` UNION ALL SELECT submission_timestamp, @@ -75,7 +75,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_stable.prototype_no_code_events_v1` + `moz-fx-data-shared-prod.firefox_desktop_stable.urlbar_potential_exposure_v1` ) CROSS JOIN UNNEST(events) AS event, @@ -1555,7 +1555,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.mozillavpn_stable.daemonsession_v1` + 
`moz-fx-data-shared-prod.mozillavpn_stable.vpnsession_v1` UNION ALL SELECT submission_timestamp, @@ -1575,7 +1575,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.mozillavpn_stable.vpnsession_v1` + `moz-fx-data-shared-prod.mozillavpn_stable.daemonsession_v1` ) CROSS JOIN UNNEST(events) AS event, @@ -1646,7 +1646,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_firefox_vpn_stable.daemonsession_v1` + `moz-fx-data-shared-prod.org_mozilla_firefox_vpn_stable.vpnsession_v1` UNION ALL SELECT submission_timestamp, @@ -1666,7 +1666,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_firefox_vpn_stable.vpnsession_v1` + `moz-fx-data-shared-prod.org_mozilla_firefox_vpn_stable.daemonsession_v1` ) CROSS JOIN UNNEST(events) AS event, @@ -1737,7 +1737,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_stable.daemonsession_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_stable.vpnsession_v1` UNION ALL SELECT submission_timestamp, @@ -1757,7 +1757,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_stable.vpnsession_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_stable.daemonsession_v1` ) CROSS JOIN UNNEST(events) AS event, @@ -1828,7 +1828,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_network_extension_stable.daemonsession_v1` + `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_network_extension_stable.vpnsession_v1` UNION ALL SELECT submission_timestamp, @@ -1848,7 +1848,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_network_extension_stable.vpnsession_v1` + 
`moz-fx-data-shared-prod.org_mozilla_ios_firefoxvpn_network_extension_stable.daemonsession_v1` ) CROSS JOIN UNNEST(events) AS event, @@ -2000,7 +2000,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.bedrock_stable.non_interaction_v1` + `moz-fx-data-shared-prod.bedrock_stable.events_v1` UNION ALL SELECT submission_timestamp, @@ -2010,7 +2010,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.bedrock_stable.events_v1` + `moz-fx-data-shared-prod.bedrock_stable.non_interaction_v1` ) CROSS JOIN UNNEST(events) AS event, @@ -2081,7 +2081,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.viu_politica_stable.video_index_v1` + `moz-fx-data-shared-prod.viu_politica_stable.main_events_v1` UNION ALL SELECT submission_timestamp, @@ -2091,7 +2091,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.viu_politica_stable.main_events_v1` + `moz-fx-data-shared-prod.viu_politica_stable.video_index_v1` ) CROSS JOIN UNNEST(events) AS event, @@ -2162,7 +2162,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_background_tasks_stable.events_v1` + `moz-fx-data-shared-prod.firefox_desktop_background_tasks_stable.background_tasks_v1` UNION ALL SELECT submission_timestamp, @@ -2172,7 +2172,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.firefox_desktop_background_tasks_stable.background_tasks_v1` + `moz-fx-data-shared-prod.firefox_desktop_background_tasks_stable.events_v1` ) CROSS JOIN UNNEST(events) AS event, diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/mozillavpn_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/mozillavpn_derived/event_monitoring_live_v1/materialized_view.sql --- 
/tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/mozillavpn_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 18:05:31.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/mozillavpn_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 18:07:53.000000000 +0000 @@ -50,7 +50,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.mozillavpn_live.daemonsession_v1` + `moz-fx-data-shared-prod.mozillavpn_live.vpnsession_v1` UNION ALL SELECT submission_timestamp, @@ -70,7 +70,7 @@ client_info.app_display_version AS version, ping_info FROM - `moz-fx-data-shared-prod.moz ```

⚠️ Only part of the diff is displayed.

Link to full diff

dataops-ci-bot commented 1 month ago

Integration report for "Merge remote-tracking branch 'og/main' into glam-js-to-sql"

sql.diff

```diff
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql
--- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql 2024-06-06 19:04:02.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql 2024-06-06 19:04:00.000000000 +0000
@@ -1,17 +1,27 @@
--- udf_js_flatten
+/*
+Casts an ARRAY> histogram to a JSON string.
+This implementation uses String concatenation instead of
+BigQuery native JSON (TO_JSON / JSON_OBJECT) functions to
+preserve order.
+https://cloud.google.com/bigquery/docs/reference/standard-sql/json_functions#json_object
+Order is important for GLAM histograms so other UDFs that
+operate on them, such as glam.percentile, can work correctly.
+*/
 CREATE OR REPLACE FUNCTION glam.histogram_cast_json(
   histogram ARRAY>
 )
-RETURNS STRING DETERMINISTIC
-LANGUAGE js
-AS
-  '''
-  let obj = {};
-  histogram.map(r => {
-    obj[r.key] = parseFloat(r.value.toFixed(4));
-  });
-  return JSON.stringify(obj);
-''';
+RETURNS STRING AS (
+  (
+    SELECT
+      CONCAT(
+        '{',
+        STRING_AGG(CONCAT('"', key, '":', ROUND(value, 4)) ORDER BY CAST(key AS FLOAT64)),
+        '}'
+      )
+    FROM
+      UNNEST(histogram)
+  )
+);

 SELECT
   assert.equals(
@@ -19,4 +29,16 @@
     glam.histogram_cast_json(
       ARRAY>[("0", 0.111111), ("1", 2.0 / 3), ("2", 0)]
     )
+  ),
+  assert.equals(
+    '{"0":0.1111,"1":0.6667,"2":0,"10":100}',
+    glam.histogram_cast_json(
+      ARRAY>[
+        ("0", 0.111111),
+        ("1", 2.0 / 3),
+        ("10", 100),
+        ("2", 0)
+      ]
     )
+  ),
+  assert.equals('{}', glam.histogram_cast_json(ARRAY>[])),
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql
--- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql 2024-06-06 19:04:02.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql 2024-06-06 19:04:00.000000000 +0000
@@ -4,34 +4,29 @@
   buckets_per_magnitude INT64,
   range_max INT64
 )
-RETURNS ARRAY DETERMINISTIC
-LANGUAGE js
-AS
-  '''
-  function sample_to_bucket_index(sample) {
-    // Get the index of the sample
-    // https://github.com/mozilla/glean/blob/main/glean-core/src/histogram/functional.rs
-    let exponent = Math.pow(log_base, 1.0/buckets_per_magnitude);
-    return Math.ceil(Math.log(sample + 1) / Math.log(exponent));
-  }
-
-  let buckets = new Set([0]);
-  for (let index = 0; index < sample_to_bucket_index(range_max); index++) {
-
-    // Avoid re-using the exponent due to floating point issues when carrying
-    // the `pow` operation e.g. `let exponent = ...; Math.pow(exponent, index)`.
-    let bucket = Math.floor(Math.pow(log_base, index/buckets_per_magnitude));
-
-    // NOTE: the sample_to_bucket_index implementation overshoots the true index,
-    // so we break out early if we hit the max bucket range.
-    if (bucket > range_max) {
-      break;
-    }
-    buckets.add(bucket);
-  }
-
-  return [...buckets]
-''';
+RETURNS ARRAY AS (
+  (
+    WITH bucket_indexes AS (
+      -- Generate all bucket indexes
+      -- https://github.com/mozilla/glean/blob/main/glean-core/src/histogram/functional.rs
+      SELECT
+        GENERATE_ARRAY(0, CEIL(LOG(range_max + 1, log_base) * buckets_per_magnitude)) AS indexes
+    ),
+    buckets AS (
+      SELECT
+        FLOOR(POW(log_base, (idx) / buckets_per_magnitude)) AS bucket
+      FROM
+        bucket_indexes,
+        UNNEST(indexes) AS idx
+      WHERE
+        FLOOR(POW(log_base, (idx) / buckets_per_magnitude)) <= range_max
+    )
+    SELECT
+      ARRAY_CONCAT([0.0], ARRAY_AGG(DISTINCT(bucket) ORDER BY bucket))
+    FROM
+      buckets
+  )
+);

 SELECT
   -- First 50 keys of a timing distribution
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql
--- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql 2024-06-06 19:04:02.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql 2024-06-06 19:04:00.000000000 +0000
@@ -4,17 +4,17 @@
   max FLOAT64,
   nBuckets FLOAT64
 )
-RETURNS ARRAY DETERMINISTIC
-LANGUAGE js
-AS
-  '''
-  let result = [0];
-  for (let i = 1; i < Math.min(nBuckets, max, 10000); i++) {
-    let linearRange = (min * (nBuckets - 1 - i) + max * (i - 1)) / (nBuckets - 2);
-    result.push(Math.round(linearRange));
-  }
-  return result;
-''';
+RETURNS ARRAY AS (
+  ARRAY_CONCAT(
+    [0.0],
+    ARRAY(
+      SELECT
+        ROUND((min * (nBuckets - 1 - i) + max * (i - 1)) / (nBuckets - 2))
+      FROM
+        UNNEST(GENERATE_ARRAY(1, LEAST(nBuckets - 1, max, 10000))) AS i
+    )
+  )
+);

 SELECT
   -- Buckets of CONTENT_FRAME_TIME_VSYNC
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql
--- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql 2024-06-06 19:04:02.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql 2024-06-06 19:04:00.000000000 +0000
@@ -4,17 +4,14 @@
   max_bucket FLOAT64,
   num_buckets INT64
 )
-RETURNS ARRAY DETERMINISTIC
-LANGUAGE js
-AS
-  '''
-  let bucket_size = (max_bucket - min_bucket) / num_buckets;
-  let buckets = new Set();
-  for (let bucket = min_bucket; bucket < max_bucket; bucket += bucket_size) {
-    buckets.add(Math.pow(2, bucket).toFixed(2));
-  }
-  return Array.from(buckets);
-''';
+RETURNS ARRAY AS (
+  ARRAY(
+    SELECT
+      ROUND(POW(2, (max_bucket - min_bucket) / num_buckets * val), 2)
+    FROM
+      UNNEST(GENERATE_ARRAY(0, num_buckets - 1)) AS val
+  )
+);

 SELECT
   assert.array_equals([1, 2, 4, 8], glam.histogram_generate_scalar_buckets(0, LOG(16, 2), 4)),
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/percentile/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/percentile/udf.sql
--- /tmp/workspace/main-generated-sql/sql/mozfun/glam/percentile/udf.sql 2024-06-06 19:04:02.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/mozfun/glam/percentile/udf.sql 2024-06-06 19:04:00.000000000 +0000
@@ -1,37 +1,41 @@
 -- udf_js.glean_percentile
 CREATE OR REPLACE FUNCTION glam.percentile(
-  percentile FLOAT64,
+  pct FLOAT64,
   histogram ARRAY>,
   type STRING
 )
-RETURNS FLOAT64 DETERMINISTIC
-LANGUAGE js
-AS
-  '''
-  if (percentile < 0 || percentile > 100) {
-    throw "percentile must be a value between 0 and 100";
-  }
-
-  let values = histogram.map(bucket => bucket.value);
-  let total = values.reduce((a, b) => a + b);
-  let normalized = values.map(value => value / total);
-
-  // Find the index into the cumulative distribution function that corresponds
-  // to the percentile. This undershoots the true value of the percentile.
-  let acc = 0;
-  let index = null;
-  for (let i = 0; i < normalized.length; i++) {
-    acc += normalized[i];
-    index = i;
-    if (acc >= percentile / 100) {
-      break;
-    }
-  }
-
-  // NOTE: we do not perform geometric or linear interpolation, but this would
-  // be the place to implement it.
-  return histogram[index].key;
-''';
+RETURNS FLOAT64 AS (
+  (
+    WITH check AS (
+      SELECT
+        IF(
+          pct >= 0
+          AND pct <= 100,
+          TRUE,
+          ERROR('percentile must be a value between 0 and 100')
+        ) pct_ok
+    ),
+    keyed_cum_sum AS (
+      SELECT
+        key,
+        SUM(value) OVER (ORDER BY CAST(key AS FLOAT64)) / SUM(value) OVER () AS cum_sum
+      FROM
+        UNNEST(histogram)
+    )
+    SELECT
+      CAST(key AS FLOAT64)
+    FROM
+      keyed_cum_sum,
+      check
+    WHERE
+      check.pct_ok
+      AND cum_sum >= pct / 100
+    ORDER BY
+      cum_sum
+    LIMIT
+      1
+  )
+);

 SELECT
   assert.equals(
@@ -41,6 +45,30 @@
       ARRAY>[("0", 1), ("2", 2), ("3", 1)],
       "timing_distribution"
     )
+  ),
+  assert.equals(
+    3,
+    glam.percentile(
+      100.0,
+      ARRAY>[("0", 1), ("2", 2), ("3", 1)],
+      "timing_distribution"
+    )
+  ),
+  assert.equals(
+    0,
+    glam.percentile(
+      0.0,
+      ARRAY>[("0", 1), ("2", 2), ("3", 1)],
+      "timing_distribution"
+    )
+  ),
+  assert.equals(
+    2,
+    glam.percentile(
+      2.0,
+      ARRAY>[("0", 1), ("2", 2), ("10", 10), ("11", 100)],
+      "timing_distribution"
+    )
   );

 #xfail
```
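For reviewers comparing the two `glam.percentile` versions in the diff above, here is a minimal JavaScript sketch of what the SQL rewrite appears to compute: order the buckets by numeric key, accumulate the share of the total, and return the first key whose cumulative share reaches `pct / 100`. The function name `percentileSorted` is hypothetical and this is only a model of the logic shown in the diff, not the UDF itself. The notable difference from the removed JS body is that the removed code walked the array in input order, while the SQL version sorts by `CAST(key AS FLOAT64)` first.

```javascript
// Hypothetical model of the SQL version of glam.percentile (illustration only).
function percentileSorted(pct, histogram) {
  if (pct < 0 || pct > 100) {
    throw new Error("percentile must be a value between 0 and 100");
  }
  // Mirrors ORDER BY CAST(key AS FLOAT64): numeric order, not string order.
  const sorted = [...histogram].sort(
    (a, b) => parseFloat(a.key) - parseFloat(b.key)
  );
  const total = sorted.reduce((sum, bucket) => sum + bucket.value, 0);
  let running = 0;
  for (const bucket of sorted) {
    running += bucket.value;
    // First bucket whose cumulative share reaches the requested percentile.
    if (running / total >= pct / 100) {
      return parseFloat(bucket.key);
    }
  }
  return null; // no bucket reached the threshold (e.g. empty histogram)
}

// Same inputs as the last test case added in the diff, deliberately shuffled.
const shuffled = [
  { key: "10", value: 10 },
  { key: "0", value: 1 },
  { key: "11", value: 100 },
  { key: "2", value: 2 },
];
console.log(percentileSorted(2.0, shuffled)); // 2
```

With string ordering, `"10"` and `"11"` would sort before `"2"`, which is presumably why the new test case mixes one- and two-digit keys.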

Link to full diff
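The header comment added to `glam.histogram_cast_json` says string concatenation is used to preserve key order. A small JavaScript sketch may help show why the old and new outputs can still agree: in JavaScript, integer-like property keys of a plain object are enumerated in ascending numeric order, so the removed UDF emitted `"2"` before `"10"` regardless of input order. The SQL rewrite gets the same ordering explicitly through `ORDER BY CAST(key AS FLOAT64)`. Both function names below are hypothetical; the first is adapted from the removed UDF body, the second is only a model of the SQL concatenation.

```javascript
// Adapted from the removed JS UDF body, wrapped in a function for illustration.
function castJsonViaObject(histogram) {
  const obj = {};
  histogram.forEach(r => {
    obj[r.key] = parseFloat(r.value.toFixed(4));
  });
  // Integer-like keys are enumerated in ascending numeric order.
  return JSON.stringify(obj);
}

// Hypothetical model of the SQL rewrite: sort by numeric key, then concatenate.
function castJsonSorted(histogram) {
  const parts = [...histogram]
    .sort((a, b) => parseFloat(a.key) - parseFloat(b.key))
    .map(r => `"${r.key}":${parseFloat(r.value.toFixed(4))}`);
  return `{${parts.join(",")}}`;
}

// Same input as the out-of-order test case added in the diff.
const histogram = [
  { key: "0", value: 0.111111 },
  { key: "1", value: 2.0 / 3 },
  { key: "10", value: 100 },
  { key: "2", value: 0 },
];
console.log(castJsonViaObject(histogram)); // {"0":0.1111,"1":0.6667,"2":0,"10":100}
console.log(castJsonSorted(histogram));    // {"0":0.1111,"1":0.6667,"2":0,"10":100}
```

The numeric enumeration only applies to integer-like keys; a key such as `"1.5"` would keep insertion order in the object-based version, so the two functions are not guaranteed to agree for non-integer bucket keys.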

dataops-ci-bot commented 1 month ago

Integration report for "Fix generate_scalar_buckets"

sql.diff

Click to expand! ```diff Only in /tmp/workspace/main-generated-sql/sql/mozfun/glam: histogram_normalized_sum_with_original diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql --- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql 2024-06-06 19:47:17.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql 2024-06-06 19:47:49.000000000 +0000 @@ -1,17 +1,27 @@ --- udf_js_flatten +/* +Casts an ARRAY> histogram to a JSON string. +This implementation uses String concatenation instead of +BigQuery native JSON (TO_JSON / JSON_OBJECT) functions to +preserve order. +https://cloud.google.com/bigquery/docs/reference/standard-sql/json_functions#json_object +Order is important for GLAM histograms so other UDFs that +operate on them, such as glam.percentile, can work correctly. +*/ CREATE OR REPLACE FUNCTION glam.histogram_cast_json( histogram ARRAY> ) -RETURNS STRING DETERMINISTIC -LANGUAGE js -AS - ''' - let obj = {}; - histogram.map(r => { - obj[r.key] = parseFloat(r.value.toFixed(4)); - }); - return JSON.stringify(obj); -'''; +RETURNS STRING AS ( + ( + SELECT + CONCAT( + '{', + STRING_AGG(CONCAT('"', key, '":', ROUND(value, 4)) ORDER BY CAST(key AS FLOAT64)), + '}' + ) + FROM + UNNEST(histogram) + ) +); SELECT assert.equals( @@ -19,4 +29,16 @@ glam.histogram_cast_json( ARRAY>[("0", 0.111111), ("1", 2.0 / 3), ("2", 0)] ) + ), + assert.equals( + '{"0":0.1111,"1":0.6667,"2":0,"10":100}', + glam.histogram_cast_json( + ARRAY>[ + ("0", 0.111111), + ("1", 2.0 / 3), + ("10", 100), + ("2", 0) + ] ) + ), + assert.equals('{}', glam.histogram_cast_json(ARRAY>[])), diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql --- 
/tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql 2024-06-06 19:47:17.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql 2024-06-06 19:47:49.000000000 +0000 @@ -4,34 +4,29 @@ buckets_per_magnitude INT64, range_max INT64 ) -RETURNS ARRAY DETERMINISTIC -LANGUAGE js -AS - ''' - function sample_to_bucket_index(sample) { - // Get the index of the sample - // https://github.com/mozilla/glean/blob/main/glean-core/src/histogram/functional.rs - let exponent = Math.pow(log_base, 1.0/buckets_per_magnitude); - return Math.ceil(Math.log(sample + 1) / Math.log(exponent)); - } - - let buckets = new Set([0]); - for (let index = 0; index < sample_to_bucket_index(range_max); index++) { - - // Avoid re-using the exponent due to floating point issues when carrying - // the `pow` operation e.g. `let exponent = ...; Math.pow(exponent, index)`. - let bucket = Math.floor(Math.pow(log_base, index/buckets_per_magnitude)); - - // NOTE: the sample_to_bucket_index implementation overshoots the true index, - // so we break out early if we hit the max bucket range. 
- if (bucket > range_max) { - break; - } - buckets.add(bucket); - } - - return [...buckets] -'''; +RETURNS ARRAY AS ( + ( + WITH bucket_indexes AS ( + -- Generate all bucket indexes + -- https://github.com/mozilla/glean/blob/main/glean-core/src/histogram/functional.rs + SELECT + GENERATE_ARRAY(0, CEIL(LOG(range_max + 1, log_base) * buckets_per_magnitude)) AS indexes + ), + buckets AS ( + SELECT + FLOOR(POW(log_base, (idx) / buckets_per_magnitude)) AS bucket + FROM + bucket_indexes, + UNNEST(indexes) AS idx + WHERE + FLOOR(POW(log_base, (idx) / buckets_per_magnitude)) <= range_max + ) + SELECT + ARRAY_CONCAT([0.0], ARRAY_AGG(DISTINCT(bucket) ORDER BY bucket)) + FROM + buckets + ) +); SELECT -- First 50 keys of a timing distribution diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql --- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql 2024-06-06 19:47:17.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql 2024-06-06 19:47:49.000000000 +0000 @@ -4,17 +4,17 @@ max FLOAT64, nBuckets FLOAT64 ) -RETURNS ARRAY DETERMINISTIC -LANGUAGE js -AS - ''' - let result = [0]; - for (let i = 1; i < Math.min(nBuckets, max, 10000); i++) { - let linearRange = (min * (nBuckets - 1 - i) + max * (i - 1)) / (nBuckets - 2); - result.push(Math.round(linearRange)); - } - return result; -'''; +RETURNS ARRAY AS ( + ARRAY_CONCAT( + [0.0], + ARRAY( + SELECT + ROUND((min * (nBuckets - 1 - i) + max * (i - 1)) / (nBuckets - 2)) + FROM + UNNEST(GENERATE_ARRAY(1, LEAST(nBuckets - 1, max, 10000))) AS i + ) + ) +); SELECT -- Buckets of CONTENT_FRAME_TIME_VSYNC diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql 
/tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql --- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql 2024-06-06 19:47:17.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql 2024-06-06 19:47:49.000000000 +0000 @@ -4,21 +4,23 @@ max_bucket FLOAT64, num_buckets INT64 ) -RETURNS ARRAY DETERMINISTIC -LANGUAGE js -AS - ''' - let bucket_size = (max_bucket - min_bucket) / num_buckets; - let buckets = new Set(); - for (let bucket = min_bucket; bucket < max_bucket; bucket += bucket_size) { - buckets.add(Math.pow(2, bucket).toFixed(2)); - } - return Array.from(buckets); -'''; +RETURNS ARRAY AS ( + IF( + min_bucket >= max_bucket, + [], + ARRAY( + SELECT + ROUND(POW(2, (max_bucket - min_bucket) / num_buckets * val), 2) + FROM + UNNEST(GENERATE_ARRAY(0, num_buckets - 1)) AS val + ) + ) +); SELECT assert.array_equals([1, 2, 4, 8], glam.histogram_generate_scalar_buckets(0, LOG(16, 2), 4)), assert.array_equals( [1, 1.9, 3.62, 6.9, 13.13], glam.histogram_generate_scalar_buckets(0, LOG(25, 2), 5) - ) + ), + assert.array_equals([], glam.histogram_generate_scalar_buckets(10, 10, 100)) diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_normalized_sum_with_original/metadata.yaml /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_normalized_sum_with_original/metadata.yaml --- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_normalized_sum_with_original/metadata.yaml 2024-06-06 19:47:17.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_normalized_sum_with_original/metadata.yaml 1970-01-01 00:00:00.000000000 +0000 @@ -1,3 +0,0 @@ -description: | - Compute the normalized and the non-normalized sum of an array of histograms. 
-friendly_name: Histogram normalized sum with original diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_normalized_sum_with_original/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_normalized_sum_with_original/udf.sql --- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_normalized_sum_with_original/udf.sql 2024-06-06 19:47:17.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_normalized_sum_with_original/udf.sql 1970-01-01 00:00:00.000000000 +0000 @@ -1,89 +0,0 @@ --- udf_normalized_sum_with_original -CREATE OR REPLACE FUNCTION glam.histogram_normalized_sum_with_original( - arrs ARRAY>, - weight FLOAT64 -) -RETURNS ARRAY> AS ( - -- Input: one histogram for a single client. - -- Returns the normalized and the non-normalized sum of the input maps. - -- It returns the total_count[k] / SUM(total_count) and total_count[k] - -- for each key k. - ( - WITH total_counts AS ( - SELECT - SUM(a.value) AS total_count - FROM - UNNEST(arrs) AS a - ), - summed_counts AS ( - SELECT - a.key AS k, - SUM(a.value) AS v - FROM - UNNEST(arrs) AS a - GROUP BY - a.key - ) - SELECT - ARRAY_AGG( - STRUCT( - k, - COALESCE(SAFE_DIVIDE(1.0 * v, total_count), 0) * weight, - 1.0 * v - ) - ORDER BY - SAFE_CAST(k AS INT64) - ) - FROM - summed_counts - CROSS JOIN - total_counts - ) -); - -SELECT - assert.array_equals( - ARRAY>[ - ("0", 0.25, 1.0), - ("1", 0.25, 1.0), - ("2", 0.5, 2.0) - ], - glam.histogram_normalized_sum_with_original( - ARRAY>[("0", 1), ("1", 1), ("2", 2)], - 1.0 - ) - ), - assert.array_equals( - ARRAY>[ - ("0", 0.5, 1.0), - ("1", 0.5, 1.0), - ("2", 1.0, 2.0) - ], - glam.histogram_normalized_sum_with_original( - ARRAY>[("0", 1), ("1", 1), ("2", 2)], - 2.0 - ) - ), - -- out of order keys - assert.array_equals( - ARRAY>[ - ("2", 0.5, 1.0), - ("11", 0.5, 1.0) - ], - glam.histogram_normalized_sum_with_original( - ARRAY>[("11", 1), ("2", 1)], - 1 - ) - ), - -- different inputs for same 
bucket - assert.array_equals( - ARRAY>[ - ("0", 0.5, 2.0), - ("1", 0.25, 1.0), - ("2", 0.25, 1.0) - ], - glam.histogram_normalized_sum_with_original( - ARRAY>[("0", 1), ("0", 1), ("1", 1), ("2", 1)], - 1 - ) - ) diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/percentile/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/percentile/udf.sql --- /tmp/workspace/main-generated-sql/sql/mozfun/glam/percentile/udf.sql 2024-06-06 19:47:17.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/mozfun/glam/percentile/udf.sql 2024-06-06 19:47:49.000000000 +0000 @@ -1,37 +1,41 @@ -- udf_js.glean_percentile CREATE OR REPLACE FUNCTION glam.percentile( - percentile FLOAT64, + pct FLOAT64, histogram ARRAY>, type STRING ) -RETURNS FLOAT64 DETERMINISTIC -LANGUAGE js -AS - ''' - if (percentile < 0 || percentile > 100) { - throw "percentile must be a value between 0 and 100"; - } - - let values = histogram.map(bucket => bucket.value); - let total = values.reduce((a, b) => a + b); - let normalized = values.map(value => value / total); - - // Find the index into the cumulative distribution function that corresponds - // to the percentile. This undershoots the true value of the percentile. - let acc = 0; - let index = null; - for (let i = 0; i < normalized.length; i++) { - acc += normalized[i]; - index = i; - if (acc >= percentile / 100) { - break; - } - } - - // NOTE: we do not perform geometric or linear interpolation, but this would - // be the place to implement it. 
- return histogram[index].key; -'''; +RETURNS FLOAT64 AS ( + ( + WITH check AS ( + SELECT + IF( + pct >= 0 + AND pct <= 100, + TRUE, + ERROR('percentile must be a value between 0 and 100') + ) pct_ok + ), + keyed_cum_sum AS ( + SELECT + key, + SUM(value) OVER (ORDER BY CAST(key AS FLOAT64)) / SUM(value) OVER () AS cum_sum + FROM + UNNEST(histogram) + ) + SELECT + CAST(key AS FLOAT64) + FROM + keyed_cum_sum, + check + WHERE + check.pct_ok + AND cum_sum >= pct / 100 + ORDER BY + cum_sum + LIMIT + 1 + ) +); SELECT assert.equals( @@ -41,6 +45,30 @@ ARRAY>[("0", 1), ("2", 2), ("3", 1)], "timing_distribution" ) + ), + assert.equals( + 3, + glam.percentile( + 100.0, + ARRAY>[("0", 1), ("2", 2), ("3", 1)], + "timing_distribution" + ) + ), + assert.equals( + 0, + glam.percentile( + 0.0, + ARRAY>[("0", 1), ("2", 2), ("3", 1)], + "timing_distribution" + ) + ), + assert.equals( + 2, + glam.percentile( + 2.0, + ARRAY>[("0", 1), ("2", 2), ("10", 10), ("11", 100)], + "timing_distribution" + ) ); #xfail diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_beta_aggregates/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_beta_aggregates/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_beta_aggregates/schema.yaml 2024-06-06 19:47:17.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_beta_aggregates/schema.yaml 2024-06-06 19:47:49.000000000 +0000 @@ -38,12 +38,6 @@ - name: percentiles type: STRING mode: NULLABLE -- name: non_norm_histogram - type: STRING - mode: NULLABLE -- name: non_norm_percentiles - type: STRING - mode: NULLABLE - name: app_id type: STRING mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_beta_aggregates_v1/schema.yaml 
/tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_beta_aggregates_v1/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_beta_aggregates_v1/schema.yaml 2024-06-06 19:47:17.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_beta_aggregates_v1/schema.yaml 2024-06-06 19:47:49.000000000 +0000 @@ -38,12 +38,6 @@ - name: percentiles type: STRING mode: NULLABLE -- name: non_norm_histogram - type: STRING - mode: NULLABLE -- name: non_norm_percentiles - type: STRING - mode: NULLABLE - name: app_id type: STRING mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_beta_aggregates_v1/script.sql /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_beta_aggregates_v1/script.sql --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_beta_aggregates_v1/script.sql 2024-06-06 19:47:17.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_beta_aggregates_v1/script.sql 2024-06-06 19:47:49.000000000 +0000 @@ -33,8 +33,6 @@ total_users, histogram, percentiles, - non_norm_histogram, - non_norm_percentiles, total_sample ) VALUES @@ -51,8 +49,6 @@ S.total_users, S.histogram, S.percentiles, - S.non_norm_histogram, - S.non_norm_percentiles, S.total_sample ) WHEN MATCHED @@ -61,6 +57,4 @@ SET T.total_users = S.total_users, T.histogram = S.histogram, T.percentiles = S.percentiles, - T.non_norm_histogram = S.non_norm_histogram, - T.non_norm_percentiles = S.non_norm_percentiles, T.total_sample = S.total_sample diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_nightly_aggregates/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_nightly_aggregates/schema.yaml --- 
/tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_nightly_aggregates/schema.yaml 2024-06-06 19:47:17.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_nightly_aggregates/schema.yaml 2024-06-06 19:47:49.000000000 +0000
@@ -38,12 +38,6 @@
 - name: percentiles
   type: STRING
   mode: NULLABLE
-- name: non_norm_histogram
-  type: STRING
-  mode: NULLABLE
-- name: non_norm_percentiles
-  type: STRING
-  mode: NULLABLE
 - name: app_id
   type: STRING
   mode: NULLABLE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_nightly_aggregates_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_nightly_aggregates_v1/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_nightly_aggregates_v1/schema.yaml 2024-06-06 19:47:17.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_nightly_aggregates_v1/schema.yaml 2024-06-06 19:47:49.000000000 +0000
@@ -38,12 +38,6 @@
 - name: percentiles
   type: STRING
   mode: NULLABLE
-- name: non_norm_histogram
-  type: STRING
-  mode: NULLABLE
-- name: non_norm_percentiles
-  type: STRING
-  mode: NULLABLE
 - name: app_id
   type: STRING
   mode: NULLABLE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_nightly_aggregates_v1/script.sql /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_nightly_aggregates_v1/script.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_nightly_aggregates_v1/script.sql 2024-06-06 19:47:17.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_nightly_aggregates_v1/script.sql 2024-06-06 19:47:49.000000000 +0000
@@ -33,8 +33,6 @@
       total_users,
       histogram,
       percentiles,
-      non_norm_histogram,
-      non_norm_percentiles,
       total_sample
     )
   VALUES
@@ -51,8 +49,6 @@
       S.total_users,
       S.histogram,
       S.percentiles,
-      S.non_norm_histogram,
-      S.non_norm_percentiles,
       S.total_sample
     )
   WHEN MATCHED
@@ -61,6 +57,4 @@
   SET T.total_users = S.total_users,
     T.histogram = S.histogram,
     T.percentiles = S.percentiles,
-    T.non_norm_histogram = S.non_norm_histogram,
-    T.non_norm_percentiles = S.non_norm_percentiles,
     T.total_sample = S.total_sample
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_release_aggregates/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_release_aggregates/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_release_aggregates/schema.yaml 2024-06-06 19:47:17.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_release_aggregates/schema.yaml 2024-06-06 19:47:49.000000000 +0000
@@ -38,12 +38,6 @@
 - name: percentiles
   type: STRING
   mode: NULLABLE
-- name: non_norm_histogram
-  type: STRING
-  mode: NULLABLE
-- name: non_norm_percentiles
-  type: STRING
-  mode: NULLABLE
 - name: app_id
   type: STRING
   mode: NULLABLE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_release_aggregates_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_release_aggregates_v1/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_release_aggregates_v1/schema.yaml 2024-06-06 19:47:17.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_release_aggregates_v1/schema.yaml 2024-06-06 19:47:49.000000000 +0000
@@ -38,12 +38,6 @@
 - name: percentiles
   type: STRING
   mode: NULLABLE
-- name: non_norm_histogram
-  type: STRING
-  mode: NULLABLE
-- name: non_norm_percentiles
-  type: STRING
-  mode: NULLABLE
 - name: app_id
   type: STRING
   mode: NULLABLE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_release_aggregates_v1/script.sql /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_release_aggregates_v1/script.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_release_aggregates_v1/script.sql 2024-06-06 19:47:17.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fenix_release_aggregates_v1/script.sql 2024-06-06 19:47:49.000000000 +0000
@@ -33,8 +33,6 @@
       total_users,
       histogram,
       percentiles,
-      non_norm_histogram,
-      non_norm_percentiles,
       total_sample
     )
   VALUES
@@ -51,8 +49,6 @@
       S.total_users,
       S.histogram,
       S.percentiles,
-      S.non_norm_histogram,
-      S.non_norm_percentiles,
       S.total_sample
     )
   WHEN MATCHED
@@ -61,6 +57,4 @@
   SET T.total_users = S.total_users,
     T.histogram = S.histogram,
     T.percentiles = S.percentiles,
-    T.non_norm_histogram = S.non_norm_histogram,
-    T.non_norm_percentiles = S.non_norm_percentiles,
     T.total_sample = S.total_sample
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_beta_aggregates/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_beta_aggregates/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_beta_aggregates/schema.yaml 2024-06-06 19:47:17.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_beta_aggregates/schema.yaml 2024-06-06 19:47:49.000000000 +0000
@@ -41,12 +41,6 @@
 - name: percentiles
   type: STRING
   mode: NULLABLE
-- name: non_norm_histogram
-  type: STRING
-  mode: NULLABLE
-- name: non_norm_percentiles
-  type: STRING
-  mode: NULLABLE
 - name: total_sample
   type: BIGNUMERIC
   mode: NULLABLE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_beta_aggregates_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_beta_aggregates_v1/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_beta_aggregates_v1/schema.yaml 2024-06-06 19:47:17.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_beta_aggregates_v1/schema.yaml 2024-06-06 19:47:49.000000000 +0000
@@ -41,12 +41,6 @@
 - name: percentiles
   type: STRING
   mode: NULLABLE
-- name: non_norm_histogram
-  type: STRING
-  mode: NULLABLE
-- name: non_norm_percentiles
-  type: STRING
-  mode: NULLABLE
 - name: total_sample
   type: BIGNUMERIC
   mode: NULLABLE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_beta_aggregates_v1/script.sql /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_beta_aggregates_v1/script.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_beta_aggregates_v1/script.sql 2024-06-06 19:47:17.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_beta_aggregates_v1/script.sql 2024-06-06 19:47:49.000000000 +0000
@@ -33,8 +33,6 @@
       total_users,
       histogram,
       percentiles,
-      non_norm_histogram,
-      non_norm_percentiles,
       total_sample
     )
   VALUES
@@ -51,8 +49,6 @@
       S.total_users,
       S.histogram,
       S.percentiles,
-      S.non_norm_histogram,
-      S.non_norm_percentiles,
       S.total_sample
     )
   WHEN MATCHED
@@ -61,6 +57,4 @@
   SET T.total_users = S.total_users,
     T.histogram = S.histogram,
     T.percentiles = S.percentiles,
-    T.non_norm_histogram = S.non_norm_histogram,
-    T.non_norm_percentiles = S.non_norm_percentiles,
     T.total_sample = S.total_sample
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_nightly_aggregates/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_nightly_aggregates/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_nightly_aggregates/schema.yaml 2024-06-06 19:47:17.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_nightly_aggregates/schema.yaml 2024-06-06 19:47:49.000000000 +0000
@@ -41,12 +41,6 @@
 - name: percentiles
   type: STRING
   mode: NULLABLE
-- name: non_norm_histogram
-  type: STRING
-  mode: NULLABLE
-- name: non_norm_percentiles
-  type: STRING
-  mode: NULLABLE
 - name: total_sample
   type: BIGNUMERIC
   mode: NULLABLE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_nightly_aggregates_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_nightly_aggregates_v1/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_nightly_aggregates_v1/schema.yaml 2024-06-06 19:47:17.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_nightly_aggregates_v1/schema.yaml 2024-06-06 19:47:49.000000000 +0000
@@ -41,12 +41,6 @@
 - name: percentiles
   type: STRING
   mode: NULLABLE
-- name: non_norm_histogram
-  type: STRING
-  mode: NULLABLE
-- name: non_norm_percentiles
-  type: STRING
-  mode: NULLABLE
 - name: total_sample
   type: BIGNUMERIC
   mode: NULLABLE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_nightly_aggregates_v1/script.sql /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_nightly_aggregates_v1/script.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_nightly_aggregates_v1/script.sql 2024-06-06 19:47:17.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_nightly_aggregates_v1/script.sql 2024-06-06 19:47:49.000000000 +0000
@@ -33,8 +33,6 @@
       total_users,
       histogram,
       percentiles,
-      non_norm_histogram,
-      non_norm_percentiles,
       total_sample
     )
   VALUES
@@ -51,8 +49,6 @@
       S.total_users,
       S.histogram,
       S.percentiles,
-      S.non_norm_histogram,
-      S.non_norm_percentiles,
       S.total_sample
     )
   WHEN MATCHED
@@ -61,6 +57,4 @@
   SET T.total_users = S.total_users,
     T.histogram = S.histogram,
     T.percentiles = S.percentiles,
-    T.non_norm_histogram = S.non_norm_histogram,
-    T.non_norm_percentiles = S.non_norm_percentiles,
     T.total_sample = S.total_sample
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_release_aggregates/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_release_aggregates/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_release_aggregates/schema.yaml 2024-06-06 19:47:17.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_release_aggregates/schema.yaml 2024-06-06 19:47:49.000000000 +0000
@@ -41,12 +41,6 @@
 - name: percentiles
   type: STRING
   mode: NULLABLE
-- name: non_norm_histogram
-  type: STRING
-  mode: NULLABLE
-- name: non_norm_percentiles
-  type: STRING
-  mode: NULLABLE
 - name: total_sample
   type: BIGNUMERIC
   mode: NULLABLE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_release_aggregates_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_release_aggregates_v1/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_release_aggregates_v1/schema.yaml 2024-06-06 19:47:17.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_release_aggregates_v1/schema.yaml 2024-06-06 19:47:49.000000000 +0000
@@ -41,12 +41,6 @@
 - name: percentiles
   type: STRING
   mode: NULLABLE
-- name: non_norm_histogram
-  type: STRING
-  mode: NULLABLE
-- name: non_norm_percentiles
-  type: STRING
-  mode: NULLABLE
 - name: total_sample
   type: BIGNUMERIC
   mode: NULLABLE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_release_aggregates_v1/script.sql /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_release_aggregates_v1/script.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_release_aggregates_v1/script.sql 2024-06-06 19:47:17.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-glam-prod-fca7/glam_etl/glam_fog_release_aggregates_v1/script.sql 2024-06-06 19:47:49.000000000 +0000
@@ -33,8 +33,6 @@
       total_users,
       histogram,
       percentiles,
-      non_norm_histogram,
-      non_norm_percentiles,
       total_sample
     )
   VALUES
@@ -51,8 +49,6 @@
       S.total_users,
       S.histogram,
       S.percentiles,
-      S.non_norm_histogram,
-      S.non_norm_percentiles,
       S.total_sample
     )
   WHEN MATCHED
@@ -61,6 +57,4 @@
   SET T.total_users = S.total_users,
     T.histogram = S.histogram,
     T.percentiles = S.percentiles,
-    T.non_norm_histogram = S.non_norm_histogram,
-    T.non_norm_percentiles = S.non_norm_percentiles,
     T.total_sample = S.total_sample
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/event_monitoring_live_v1/materialized_view.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 19:47:53.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/bedrock_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 19:49:31.000000000 +0000
@@ -50,7 +50,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.bedrock_live.interaction_v1`
+      `moz-fx-data-shared-prod.bedrock_live.events_v1`
     UNION ALL
     SELECT
       submission_timestamp,
@@ -60,7 +60,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.bedrock_live.non_interaction_v1`
+      `moz-fx-data-shared-prod.bedrock_live.interaction_v1`
     UNION ALL
     SELECT
       submission_timestamp,
@@ -70,7 +70,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.bedrock_live.events_v1`
+      `moz-fx-data-shared-prod.bedrock_live.non_interaction_v1`
   )
 CROSS JOIN
   UNNEST(events) AS event,
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates/schema.yaml 2024-06-06 19:47:16.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates/schema.yaml 2024-06-06 19:55:30.000000000 +0000
@@ -1,49 +1,49 @@
 fields:
-- mode: NULLABLE
-  name: submission_date
+- name: submission_date
   type: DATE
-- mode: NULLABLE
-  name: source
+  mode: NULLABLE
+- name: source
   type: STRING
-- mode: NULLABLE
-  name: event_type
+  mode: NULLABLE
+- name: event_type
   type: STRING
-- mode: NULLABLE
-  name: form_factor
+  mode: NULLABLE
+- name: form_factor
   type: STRING
-- mode: NULLABLE
-  name: country
+  mode: NULLABLE
+- name: country
   type: STRING
-- mode: NULLABLE
-  name: subdivision1
+  mode: NULLABLE
+- name: subdivision1
   type: STRING
-- mode: NULLABLE
-  name: advertiser
+  mode: NULLABLE
+- name: advertiser
   type: STRING
-- mode: NULLABLE
-  name: release_channel
+  mode: NULLABLE
+- name: release_channel
   type: STRING
-- mode: NULLABLE
-  name: position
+  mode: NULLABLE
+- name: position
   type: INTEGER
-- mode: NULLABLE
-  name: provider
+  mode: NULLABLE
+- name: provider
   type: STRING
-- mode: NULLABLE
-  name: match_type
+  mode: NULLABLE
+- name: match_type
   type: STRING
-- mode: NULLABLE
-  name: normalized_os
+  mode: NULLABLE
+- name: normalized_os
   type: STRING
-- mode: NULLABLE
-  name: suggest_data_sharing_enabled
+  mode: NULLABLE
+- name: suggest_data_sharing_enabled
   type: BOOLEAN
-- mode: NULLABLE
-  name: event_count
+  mode: NULLABLE
+- name: event_count
   type: INTEGER
-- mode: NULLABLE
-  name: user_count
+  mode: NULLABLE
+- name: user_count
   type: INTEGER
-- mode: NULLABLE
-  name: query_type
+  mode: NULLABLE
+- name: query_type
   type: STRING
+  mode: NULLABLE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates_suggest/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates_suggest/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates_suggest/schema.yaml 2024-06-06 19:47:16.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/contextual_services/event_aggregates_suggest/schema.yaml 2024-06-06 19:55:29.000000000 +0000
@@ -1,40 +1,40 @@
 fields:
-- mode: NULLABLE
-  name: submission_date
+- name: submission_date
   type: DATE
-- mode: NULLABLE
-  name: form_factor
+  mode: NULLABLE
+- name: form_factor
   type: STRING
-- mode: NULLABLE
-  name: country
+  mode: NULLABLE
+- name: country
   type: STRING
-- mode: NULLABLE
-  name: advertiser
+  mode: NULLABLE
+- name: advertiser
   type: STRING
-- mode: NULLABLE
-  name: normalized_os
+  mode: NULLABLE
+- name: normalized_os
   type: STRING
-- mode: NULLABLE
-  name: release_channel
+  mode: NULLABLE
+- name: release_channel
   type: STRING
-- mode: NULLABLE
-  name: position
+  mode: NULLABLE
+- name: position
   type: INTEGER
-- mode: NULLABLE
-  name: provider
+  mode: NULLABLE
+- name: provider
   type: STRING
-- mode: NULLABLE
-  name: match_type
+  mode: NULLABLE
+- name: match_type
   type: STRING
-- mode: NULLABLE
-  name: suggest_data_sharing_enabled
+  mode: NULLABLE
+- name: suggest_data_sharing_enabled
   type: BOOLEAN
-- mode: NULLABLE
-  name: impression_count
+  mode: NULLABLE
+- name: impression_count
   type: INTEGER
-- mode: NULLABLE
-  name: click_count
+  mode: NULLABLE
+- name: click_count
   type: INTEGER
-- mode: NULLABLE
-  name: query_type
+  mode: NULLABLE
+- name: query_type
   type: STRING
+  mode: NULLABLE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_clients/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_clients/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_clients/schema.yaml 2024-06-06 19:47:16.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_clients/schema.yaml 2024-06-06 19:59:12.000000000 +0000
@@ -26,6 +26,9 @@
 - name: adjust_network
   type: STRING
   mode: NULLABLE
+- name: install_source
+  type: STRING
+  mode: NULLABLE
 - name: retained_week_2
   type: BOOLEAN
   mode: NULLABLE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_week_4/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_week_4/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_week_4/schema.yaml 2024-06-06 19:47:16.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/fenix/funnel_retention_week_4/schema.yaml 2024-06-06 19:59:11.000000000 +0000
@@ -48,6 +48,10 @@
   description: 'The type of source of a client installation.
 
     '
+- name: install_source
+  type: STRING
+  mode: NULLABLE
+  description: null
 - name: new_profiles
   type: INTEGER
   mode: NULLABLE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/event_monitoring_live_v1/materialized_view.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 19:47:53.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 19:49:31.000000000 +0000
@@ -50,7 +50,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.firefox_desktop_live.prototype_no_code_events_v1`
+      `moz-fx-data-shared-prod.firefox_desktop_live.newtab_v1`
     UNION ALL
     SELECT
       submission_timestamp,
@@ -60,7 +60,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.firefox_desktop_live.events_v1`
+      `moz-fx-data-shared-prod.firefox_desktop_live.urlbar_potential_exposure_v1`
     UNION ALL
     SELECT
       submission_timestamp,
@@ -70,7 +70,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.firefox_desktop_live.urlbar_potential_exposure_v1`
+      `moz-fx-data-shared-prod.firefox_desktop_live.prototype_no_code_events_v1`
     UNION ALL
     SELECT
       submission_timestamp,
@@ -80,7 +80,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.firefox_desktop_live.newtab_v1`
+      `moz-fx-data-shared-prod.firefox_desktop_live.events_v1`
   )
 CROSS JOIN
   UNNEST(events) AS event,
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/urlbar_events_v2/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/urlbar_events_v2/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/urlbar_events_v2/query.sql 2024-06-06 19:47:53.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/firefox_desktop_derived/urlbar_events_v2/query.sql 2024-06-06 19:47:55.000000000 +0000
@@ -69,13 +69,10 @@
     COALESCE(metrics.boolean.urlbar_pref_suggest_nonsponsored, FALSE) AS pref_fx_suggestions,
     mozfun.map.get_key(extra, "engagement_type") AS engagement_type,
     mozfun.map.get_key(extra, "interaction") AS interaction,
-    SAFE_CAST(mozfun.map.get_key(extra, "n_chars") AS int) AS num_chars_typed,
-    SAFE_CAST(mozfun.map.get_key(extra, "n_results") AS int) AS num_total_results,
+    CAST(mozfun.map.get_key(extra, "n_chars") AS int) AS num_chars_typed,
+    CAST(mozfun.map.get_key(extra, "n_results") AS int) AS num_total_results,
     --If 0, then no result was selected.
-    NULLIF(
-      SAFE_CAST(mozfun.map.get_key(extra, "selected_position") AS int),
-      0
-    ) AS selected_position,
+    NULLIF(CAST(mozfun.map.get_key(extra, "selected_position") AS int), 0) AS selected_position,
     mozfun.map.get_key(extra, "selected_result") AS selected_result,
     enumerated_array(
       SPLIT(mozfun.map.get_key(extra, "results"), ','),
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/internet_outages/global_outages_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/internet_outages/global_outages_v1/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/internet_outages/global_outages_v1/schema.yaml 2024-06-06 19:47:16.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/internet_outages/global_outages_v1/schema.yaml 2024-06-06 19:55:49.000000000 +0000
@@ -1,49 +1,49 @@
 fields:
-- mode: NULLABLE
-  name: country
+- name: country
   type: STRING
-- mode: NULLABLE
-  name: city
+  mode: NULLABLE
+- name: city
   type: STRING
-- mode: NULLABLE
-  name: datetime
+  mode: NULLABLE
+- name: datetime
   type: TIMESTAMP
-- mode: NULLABLE
-  name: proportion_undefined
+  mode: NULLABLE
+- name: proportion_undefined
   type: FLOAT
-- mode: NULLABLE
-  name: proportion_timeout
+  mode: NULLABLE
+- name: proportion_timeout
   type: FLOAT
-- mode: NULLABLE
-  name: proportion_abort
+  mode: NULLABLE
+- name: proportion_abort
   type: FLOAT
-- mode: NULLABLE
-  name: proportion_unreachable
+  mode: NULLABLE
+- name: proportion_unreachable
   type: FLOAT
-- mode: NULLABLE
-  name: proportion_terminated
+  mode: NULLABLE
+- name: proportion_terminated
   type: FLOAT
-- mode: NULLABLE
-  name: proportion_channel_open
+  mode: NULLABLE
+- name: proportion_channel_open
   type: FLOAT
-- mode: NULLABLE
-  name: avg_dns_success_time
+  mode: NULLABLE
+- name: avg_dns_success_time
   type: FLOAT
-- mode: NULLABLE
-  name: missing_dns_success
+  mode: NULLABLE
+- name: missing_dns_success
   type: FLOAT
-- mode: NULLABLE
-  name: avg_dns_failure_time
+  mode: NULLABLE
+- name: avg_dns_failure_time
   type: FLOAT
-- mode: NULLABLE
-  name: missing_dns_failure
+  mode: NULLABLE
+- name: missing_dns_failure
   type: FLOAT
-- mode: NULLABLE
-  name: count_dns_failure
+  mode: NULLABLE
+- name: count_dns_failure
   type: FLOAT
-- mode: NULLABLE
-  name: ssl_error_prop
+  mode: NULLABLE
+- name: ssl_error_prop
   type: FLOAT
-- mode: NULLABLE
-  name: avg_tls_handshake_time
+  mode: NULLABLE
+- name: avg_tls_handshake_time
   type: FLOAT
+  mode: NULLABLE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/monitoring_derived/event_monitoring_aggregates_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/monitoring_derived/event_monitoring_aggregates_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/monitoring_derived/event_monitoring_aggregates_v1/query.sql 2024-06-06 19:47:53.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/monitoring_derived/event_monitoring_aggregates_v1/query.sql 2024-06-06 19:51:05.000000000 +0000
@@ -45,7 +45,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.firefox_desktop_stable.prototype_no_code_events_v1`
+      `moz-fx-data-shared-prod.firefox_desktop_stable.newtab_v1`
     UNION ALL
     SELECT
       submission_timestamp,
@@ -55,7 +55,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.firefox_desktop_stable.events_v1`
+      `moz-fx-data-shared-prod.firefox_desktop_stable.urlbar_potential_exposure_v1`
     UNION ALL
     SELECT
       submission_timestamp,
@@ -65,7 +65,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.firefox_desktop_stable.urlbar_potential_exposure_v1`
+      `moz-fx-data-shared-prod.firefox_desktop_stable.prototype_no_code_events_v1`
     UNION ALL
     SELECT
       submission_timestamp,
@@ -75,7 +75,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.firefox_desktop_stable.newtab_v1`
+      `moz-fx-data-shared-prod.firefox_desktop_stable.events_v1`
   )
 CROSS JOIN
   UNNEST(events) AS event,
@@ -572,7 +572,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.org_mozilla_ios_firefox_stable.first_session_v1`
+      `moz-fx-data-shared-prod.org_mozilla_ios_firefox_stable.metrics_v1`
     UNION ALL
     SELECT
       submission_timestamp,
@@ -582,7 +582,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.org_mozilla_ios_firefox_stable.metrics_v1`
+      `moz-fx-data-shared-prod.org_mozilla_ios_firefox_stable.events_v1`
     UNION ALL
     SELECT
       submission_timestamp,
@@ -592,7 +592,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.org_mozilla_ios_firefox_stable.events_v1`
+      `moz-fx-data-shared-prod.org_mozilla_ios_firefox_stable.first_session_v1`
   )
 CROSS JOIN
   UNNEST(events) AS event,
@@ -663,7 +663,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_stable.first_session_v1`
+      `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_stable.metrics_v1`
     UNION ALL
     SELECT
       submission_timestamp,
@@ -673,7 +673,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_stable.metrics_v1`
+      `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_stable.events_v1`
     UNION ALL
     SELECT
       submission_timestamp,
@@ -683,7 +683,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_stable.events_v1`
+      `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_stable.first_session_v1`
   )
 CROSS JOIN
   UNNEST(events) AS event,
@@ -754,7 +754,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.org_mozilla_ios_fennec_stable.first_session_v1`
+      `moz-fx-data-shared-prod.org_mozilla_ios_fennec_stable.metrics_v1`
     UNION ALL
     SELECT
       submission_timestamp,
@@ -764,7 +764,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.org_mozilla_ios_fennec_stable.metrics_v1`
+      `moz-fx-data-shared-prod.org_mozilla_ios_fennec_stable.events_v1`
     UNION ALL
     SELECT
       submission_timestamp,
@@ -774,7 +774,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.org_mozilla_ios_fennec_stable.events_v1`
+      `moz-fx-data-shared-prod.org_mozilla_ios_fennec_stable.first_session_v1`
   )
 CROSS JOIN
   UNNEST(events) AS event,
@@ -1990,7 +1990,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.bedrock_stable.interaction_v1`
+      `moz-fx-data-shared-prod.bedrock_stable.events_v1`
     UNION ALL
     SELECT
       submission_timestamp,
@@ -2000,7 +2000,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.bedrock_stable.non_interaction_v1`
+      `moz-fx-data-shared-prod.bedrock_stable.interaction_v1`
     UNION ALL
     SELECT
       submission_timestamp,
@@ -2010,7 +2010,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.bedrock_stable.events_v1`
+      `moz-fx-data-shared-prod.bedrock_stable.non_interaction_v1`
   )
 CROSS JOIN
   UNNEST(events) AS event,
@@ -2081,7 +2081,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.viu_politica_stable.main_events_v1`
+      `moz-fx-data-shared-prod.viu_politica_stable.video_index_v1`
     UNION ALL
     SELECT
       submission_timestamp,
@@ -2091,7 +2091,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.viu_politica_stable.video_index_v1`
+      `moz-fx-data-shared-prod.viu_politica_stable.main_events_v1`
   )
 CROSS JOIN
   UNNEST(events) AS event,
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_fenix/geckoview_version/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_fenix/geckoview_version/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_fenix/geckoview_version/schema.yaml 2024-06-06 19:47:16.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_fenix/geckoview_version/schema.yaml 2024-06-06 19:56:04.000000000 +0000
@@ -1,7 +1,13 @@
 fields:
-- type: DATETIME
-  name: build_hour
-- type: INTEGER
-  name: geckoview_major_version
-- type: INTEGER
-  name: n_pings
+- name: build_hour
+  type: DATETIME
+  mode: NULLABLE
+  description: null
+- name: geckoview_major_version
+  type: INTEGER
+  mode: NULLABLE
+  description: null
+- name: n_pings
+  type: INTEGER
+  mode: NULLABLE
+  description: null
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_fennec_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_fennec_derived/event_monitoring_live_v1/materialized_view.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_fennec_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 19:47:53.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_fennec_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 19:49:33.000000000 +0000
@@ -50,7 +50,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.org_mozilla_ios_fennec_live.first_session_v1`
+      `moz-fx-data-shared-prod.org_mozilla_ios_fennec_live.metrics_v1`
     UNION ALL
     SELECT
       submission_timestamp,
@@ -60,7 +60,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.org_mozilla_ios_fennec_live.metrics_v1`
+      `moz-fx-data-shared-prod.org_mozilla_ios_fennec_live.events_v1`
     UNION ALL
     SELECT
       submission_timestamp,
@@ -70,7 +70,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.org_mozilla_ios_fennec_live.events_v1`
+      `moz-fx-data-shared-prod.org_mozilla_ios_fennec_live.first_session_v1`
   )
 CROSS JOIN
   UNNEST(events) AS event,
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_firefoxbeta_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_firefoxbeta_derived/event_monitoring_live_v1/materialized_view.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_firefoxbeta_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 19:47:53.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_firefoxbeta_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 19:49:33.000000000 +0000
@@ -50,7 +50,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_live.first_session_v1`
+      `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_live.metrics_v1`
     UNION ALL
     SELECT
       submission_timestamp,
@@ -60,7 +60,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_live.metrics_v1`
+      `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_live.events_v1`
     UNION ALL
     SELECT
       submission_timestamp,
@@ -70,7 +70,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_live.events_v1`
+      `moz-fx-data-shared-prod.org_mozilla_ios_firefoxbeta_live.first_session_v1`
   )
 CROSS JOIN
   UNNEST(events) AS event,
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_firefox_derived/event_monitoring_live_v1/materialized_view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_firefox_derived/event_monitoring_live_v1/materialized_view.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_firefox_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 19:47:53.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/org_mozilla_ios_firefox_derived/event_monitoring_live_v1/materialized_view.sql 2024-06-06 19:49:33.000000000 +0000
@@ -50,7 +50,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.org_mozilla_ios_firefox_live.first_session_v1`
+      `moz-fx-data-shared-prod.org_mozilla_ios_firefox_live.metrics_v1`
     UNION ALL
     SELECT
       submission_timestamp,
@@ -60,7 +60,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.org_mozilla_ios_firefox_live.metrics_v1`
+      `moz-fx-data-shared-prod.org_mozilla_ios_firefox_live.events_v1`
     UNION ALL
     SELECT
       submission_timestamp,
@@ -70,7 +70,7 @@
       client_info.app_display_version AS version,
       ping_info
     FROM
-      `moz-fx-data-shared-prod.org_mozilla_ios_firefox_live.events_v1`
+      `moz-fx-data-shared-prod.org_mozilla_ios_firefox_live.first_session_v1`
   )
 CROSS JOIN
   UNNEST(events) AS event,
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/ca_postal_districts_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/ca_postal_districts_v1/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/ca_postal_districts_v1/schema.yaml 2024-06-06 19:47:16.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/ca_postal_districts_v1/schema.yaml 2024-06-06 19:56:13.000000000 +0000
@@ -1,9 +1,7 @@
 fields:
 - name: postal_district_code
   type: STRING
-  mode: REQUIRED
-  description: One-character Canadian postal district code.
+  mode: NULLABLE
 - name: province_code
   type: STRING
   mode: NULLABLE
-  description: Two-character Canadian province/territory code (if any).
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/country_codes_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/country_codes_v1/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/country_codes_v1/schema.yaml 2024-06-06 19:47:16.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/country_codes_v1/schema.yaml 2024-06-06 19:56:13.000000000 +0000
@@ -1,47 +1,28 @@
 fields:
 - name: name
-  description: Official country name per ISO 3166
   type: STRING
-  mode: REQUIRED
+  mode: NULLABLE
 - name: code
-  description: ISO 3166 alpha-2 country code
   type: STRING
-  mode: REQUIRED
+  mode: NULLABLE
 - name: code_3
-  description: ISO 3166 alpha-3 country code
   type: STRING
-  mode: REQUIRED
+  mode: NULLABLE
 - name: region_name
-  description: Region name. These are based on the UN Statistics Division standard
-    country or area codes for statistical use (M49), but with the "Americas" region
-    split into "North America" and "South America".
   type: STRING
-  mode: REQUIRED
+  mode: NULLABLE
 - name: subregion_name
-  description: Sub-region name. These are based on UN Statistics Division standard
-    country or area codes for statistical use (M49), but with the "Latin America and the
-    Caribbean" and "Sub-Saharan Africa" sub-regions split into more specific
-    sub-regions.
   type: STRING
-  mode: REQUIRED
+  mode: NULLABLE
 - name: pocket_available_on_newtab
-  description: Whether Pocket is available on the newtab page in this country. Note
-    that Pocket might only be available in certain locales/languages within a country.
-  type: BOOL
-  mode: REQUIRED
+  type: BOOLEAN
+  mode: NULLABLE
 - name: mozilla_vpn_available
-  description: Whether Mozilla VPN is available in this country.
-  type: BOOL
-  mode: REQUIRED
+  type: BOOLEAN
+  mode: NULLABLE
 - name: sponsored_tiles_available_on_newtab
-  description: Whether sponsored tiles are available on the newtab page in this country.
-    Note that Pocket might only be available in certain locales/languages within a
-    country.
-  type: BOOL
-  mode: REQUIRED
+  type: BOOLEAN
+  mode: NULLABLE
 - name: ads_value_tier
-  description: Lowercase label detailing the monetary value tier that Mozilla Ads
-    assign to that region based on market size and our existing products, e.g., tier
-    1, tier 2, etc.
   type: STRING
-  mode: REQUIRED
+  mode: NULLABLE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/country_names_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/country_names_v1/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/country_names_v1/schema.yaml 2024-06-06 19:47:16.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/country_names_v1/schema.yaml 2024-06-06 19:56:13.000000000 +0000
@@ -1,10 +1,7 @@
 fields:
 - name: name
-  description: An alias for a country's name (including misspellings and alternate
-    encodings).
   type: STRING
-  mode: REQUIRED
+  mode: NULLABLE
 - name: code
-  description: ISO 3166 alpha-2 country code
   type: STRING
-  mode: REQUIRED
+  mode: NULLABLE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/data_incidents_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/data_incidents_v1/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/data_incidents_v1/schema.yaml 2024-06-06 19:47:16.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/data_incidents_v1/schema.yaml 2024-06-06 19:56:13.000000000 +0000
@@ -1,22 +1,22 @@
 fields:
-- mode: NULLABLE
-  name: start_date
+- name: start_date
   type: DATE
-- mode: NULLABLE
-  name: end_date
+  mode: NULLABLE
+- name: end_date
   type: DATE
-- mode: NULLABLE
-  name: incident
+  mode: NULLABLE
+- name: incident
   type: STRING
-- mode: NULLABLE
-  name: description
+  mode: NULLABLE
+- name: description
   type: STRING
-- mode: NULLABLE
-  name: bug
+  mode: NULLABLE
+- name: bug
   type: STRING
-- mode: NULLABLE
-  name: product
+  mode: NULLABLE
+- name: product
   type: STRING
-- mode: NULLABLE
-  name: version
+  mode: NULLABLE
+- name: version
   type: STRING
+  mode: NULLABLE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/iana_tls_cipher_suites/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/iana_tls_cipher_suites/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/iana_tls_cipher_suites/schema.yaml 2024-06-06 19:47:16.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/iana_tls_cipher_suites/schema.yaml 2024-06-06 19:56:13.000000000 +0000
@@ -1,27 +1,16 @@
 fields:
-- mode: NULLABLE
-  description: Hex value assigned to the TLS cipher, in format like "0x00,0x84"; note
-    some values are ranges or contain wildcards
-  name: value
+- name: value
   type: STRING
--
mode: NULLABLE - description: Human-readable name of the TLS cipher - name: description + mode: NULLABLE +- name: description type: STRING -- mode: NULLABLE - description: Any TLS cipher suite that is specified for use with DTLS MUST define - limits on the use of the associated AEAD function that preserves margins for both - confidentiality and integrity, as specified in [RFC-ietf-tls-dtls13-43] - name: dtls_ok + mode: NULLABLE +- name: dtls_ok type: BOOLEAN -- mode: NULLABLE - description: Whether the TLS cipher is recommended by the IETF. If an item is not - marked as "recommended", it does not necessarily mean that it is flawed; rather, - it indicates that the item either has not been through the IETF consensus process, - has limited applicability, or is intended only for specific use cases - name: recommended + mode: NULLABLE +- name: recommended type: BOOLEAN -- mode: NULLABLE - description: RFCs or associated reference material for the TLS cipher - name: reference + mode: NULLABLE +- name: reference type: STRING + mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/language_codes_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/language_codes_v1/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/language_codes_v1/schema.yaml 2024-06-06 19:47:16.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/language_codes_v1/schema.yaml 2024-06-06 19:56:14.000000000 +0000 @@ -1,17 +1,13 @@ fields: - name: code_3 - description: ISO 639 alpha-3 language code. type: STRING - mode: REQUIRED + mode: NULLABLE - name: code_2 - description: ISO 639 alpha-2 language code (if any). type: STRING mode: NULLABLE - name: name - description: Language name. type: STRING - mode: REQUIRED + mode: NULLABLE - name: other_names - description: Other names for the language (if any). 
type: STRING mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/monitoring_distinct_docids_notes_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/monitoring_distinct_docids_notes_v1/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/monitoring_distinct_docids_notes_v1/schema.yaml 2024-06-06 19:47:16.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/monitoring_distinct_docids_notes_v1/schema.yaml 2024-06-06 19:56:13.000000000 +0000 @@ -1,19 +1,19 @@ fields: -- mode: NULLABLE - name: start_date +- name: start_date type: DATE -- mode: NULLABLE - name: end_date + mode: NULLABLE +- name: end_date type: DATE -- mode: NULLABLE - name: document_namespace + mode: NULLABLE +- name: document_namespace type: STRING -- mode: NULLABLE - name: document_type + mode: NULLABLE +- name: document_type type: STRING -- mode: NULLABLE - name: notes + mode: NULLABLE +- name: notes type: STRING -- mode: NULLABLE - name: bug + mode: NULLABLE +- name: bug type: STRING + mode: NULLABLE diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/monitoring_missing_columns_notes_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/monitoring_missing_columns_notes_v1/schema.yaml --- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/static/monitoring_missing_columns_notes_v1/schema.yaml 2024-06-06 19:47:16.000000000 +0000 +++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/static/monitoring_missing_columns_notes_v1/schema.yaml 2024-06-06 19:56:13.000000000 +0000 @@ -1,25 +1,25 @@ fields: -- mode: NULLABLE - name: start_date +- name: start_date type: DATE -- mode: NULLABLE - name: end_date + mode: NULLABLE +- name: end_date type: DATE -- mode: NULLABLE - name: document_namespace + mode: NULLABLE +- name: document_namespace type: STRING -- 
mode: NULLABLE - name: document_type + mode: NULLABLE +- name: document_type type: STR ```

⚠️ Only part of the diff is displayed.

Link to full diff

edugfilho commented 1 month ago

The tests have been fixed, and I checked the new code against a number of instances from real data using expressions such as `assert.equals(<original_udf_in_js>(input), <new_udf_in_sql>(input))`.
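
For anyone who wants to try the same kind of old-versus-new comparison locally, here is a minimal sketch in JavaScript. `castJsonOriginal` is the body of the old `glam.histogram_cast_json` UDF as it appears in the diff. `castJsonCandidate` and `findMismatches` are hypothetical names, and the candidate is only a JS stand-in for the rewrite, not the SQL that was actually run in BigQuery.

```javascript
// Body of the original JS UDF glam.histogram_cast_json, copied from the diff.
function castJsonOriginal(histogram) {
  let obj = {};
  histogram.map(r => {
    obj[r.key] = parseFloat(r.value.toFixed(4));
  });
  return JSON.stringify(obj);
}

// Hypothetical stand-in for the rewritten UDF (an assumption, not the SQL itself):
// sort buckets by numeric key, round to 4 decimals, and concatenate by hand.
function castJsonCandidate(histogram) {
  const parts = [...histogram]
    .sort((a, b) => Number(a.key) - Number(b.key))
    .map(r => `"${r.key}":${Math.round(r.value * 1e4) / 1e4}`);
  return "{" + parts.join(",") + "}";
}

// Run both implementations on every input and collect the disagreements.
function findMismatches(original, candidate, inputs) {
  return inputs
    .map(input => ({ input, expected: original(input), actual: candidate(input) }))
    .filter(r => r.expected !== r.actual);
}

// Sample inputs taken from the UDF tests shown in the diff.
const sampleHistograms = [
  [{ key: "0", value: 0.111111 }, { key: "1", value: 2.0 / 3 }, { key: "2", value: 0 }],
  [
    { key: "0", value: 0.111111 },
    { key: "1", value: 2.0 / 3 },
    { key: "10", value: 100 },
    { key: "2", value: 0 },
  ],
];

const mismatches = findMismatches(castJsonOriginal, castJsonCandidate, sampleHistograms);
console.log(mismatches.length === 0 ? "all inputs agree" : mismatches);
```

An empty mismatch list only says the two versions agree on the inputs that were tried, so the choice of samples matters as much as the harness.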

dataops-ci-bot commented 1 month ago

Integration report for "Merge branch 'main' into glam-js-to-sql"

sql.diff

```diff
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql
--- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql 2024-06-07 13:12:26.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_cast_json/udf.sql 2024-06-07 13:12:35.000000000 +0000
@@ -1,17 +1,27 @@
--- udf_js_flatten
+/*
+Casts an ARRAY> histogram to a JSON string.
+This implementation uses String concatenation instead of
+BigQuery native JSON (TO_JSON / JSON_OBJECT) functions to
+preserve order.
+https://cloud.google.com/bigquery/docs/reference/standard-sql/json_functions#json_object
+Order is important for GLAM histograms so other UDFs that
+operate on them, such as glam.percentile, can work correctly.
+*/
 CREATE OR REPLACE FUNCTION glam.histogram_cast_json(
   histogram ARRAY>
 )
-RETURNS STRING DETERMINISTIC
-LANGUAGE js
-AS
-  '''
-    let obj = {};
-    histogram.map(r => {
-      obj[r.key] = parseFloat(r.value.toFixed(4));
-    });
-    return JSON.stringify(obj);
-''';
+RETURNS STRING AS (
+  (
+    SELECT
+      CONCAT(
+        '{',
+        STRING_AGG(CONCAT('"', key, '":', ROUND(value, 4)) ORDER BY CAST(key AS FLOAT64)),
+        '}'
+      )
+    FROM
+      UNNEST(histogram)
+  )
+);
 
 SELECT
   assert.equals(
@@ -19,4 +29,16 @@
     glam.histogram_cast_json(
       ARRAY>[("0", 0.111111), ("1", 2.0 / 3), ("2", 0)]
     )
+  ),
+  assert.equals(
+    '{"0":0.1111,"1":0.6667,"2":0,"10":100}',
+    glam.histogram_cast_json(
+      ARRAY>[
+        ("0", 0.111111),
+        ("1", 2.0 / 3),
+        ("10", 100),
+        ("2", 0)
+      ]
   )
+  ),
+  assert.equals('{}', glam.histogram_cast_json(ARRAY>[])),
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql
--- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql 2024-06-07 13:12:26.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_functional_buckets/udf.sql 2024-06-07 13:12:35.000000000 +0000
@@ -4,34 +4,29 @@
   buckets_per_magnitude INT64,
   range_max INT64
 )
-RETURNS ARRAY DETERMINISTIC
-LANGUAGE js
-AS
-  '''
-  function sample_to_bucket_index(sample) {
-    // Get the index of the sample
-    // https://github.com/mozilla/glean/blob/main/glean-core/src/histogram/functional.rs
-    let exponent = Math.pow(log_base, 1.0/buckets_per_magnitude);
-    return Math.ceil(Math.log(sample + 1) / Math.log(exponent));
-  }
-
-  let buckets = new Set([0]);
-  for (let index = 0; index < sample_to_bucket_index(range_max); index++) {
-
-    // Avoid re-using the exponent due to floating point issues when carrying
-    // the `pow` operation e.g. `let exponent = ...; Math.pow(exponent, index)`.
-    let bucket = Math.floor(Math.pow(log_base, index/buckets_per_magnitude));
-
-    // NOTE: the sample_to_bucket_index implementation overshoots the true index,
-    // so we break out early if we hit the max bucket range.
-    if (bucket > range_max) {
-      break;
-    }
-    buckets.add(bucket);
-  }
-
-  return [...buckets]
-''';
+RETURNS ARRAY AS (
+  (
+    WITH bucket_indexes AS (
+      -- Generate all bucket indexes
+      -- https://github.com/mozilla/glean/blob/main/glean-core/src/histogram/functional.rs
+      SELECT
+        GENERATE_ARRAY(0, CEIL(LOG(range_max + 1, log_base) * buckets_per_magnitude)) AS indexes
+    ),
+    buckets AS (
+      SELECT
+        FLOOR(POW(log_base, (idx) / buckets_per_magnitude)) AS bucket
+      FROM
+        bucket_indexes,
+        UNNEST(indexes) AS idx
+      WHERE
+        FLOOR(POW(log_base, (idx) / buckets_per_magnitude)) <= range_max
+    )
+    SELECT
+      ARRAY_CONCAT([0.0], ARRAY_AGG(DISTINCT(bucket) ORDER BY bucket))
+    FROM
+      buckets
+  )
+);
 
 SELECT
   -- First 50 keys of a timing distribution
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql
--- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql 2024-06-07 13:12:26.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_linear_buckets/udf.sql 2024-06-07 13:12:35.000000000 +0000
@@ -4,17 +4,17 @@
   max FLOAT64,
   nBuckets FLOAT64
 )
-RETURNS ARRAY DETERMINISTIC
-LANGUAGE js
-AS
-  '''
-  let result = [0];
-  for (let i = 1; i < Math.min(nBuckets, max, 10000); i++) {
-    let linearRange = (min * (nBuckets - 1 - i) + max * (i - 1)) / (nBuckets - 2);
-    result.push(Math.round(linearRange));
-  }
-  return result;
-''';
+RETURNS ARRAY AS (
+  ARRAY_CONCAT(
+    [0.0],
+    ARRAY(
+      SELECT
+        ROUND((min * (nBuckets - 1 - i) + max * (i - 1)) / (nBuckets - 2))
+      FROM
+        UNNEST(GENERATE_ARRAY(1, LEAST(nBuckets - 1, max, 10000))) AS i
+    )
+  )
+);
 
 SELECT
   -- Buckets of CONTENT_FRAME_TIME_VSYNC
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql
--- /tmp/workspace/main-generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql 2024-06-07 13:12:26.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/mozfun/glam/histogram_generate_scalar_buckets/udf.sql 2024-06-07 13:12:35.000000000 +0000
@@ -4,21 +4,23 @@
   max_bucket FLOAT64,
   num_buckets INT64
 )
-RETURNS ARRAY DETERMINISTIC
-LANGUAGE js
-AS
-  '''
-  let bucket_size = (max_bucket - min_bucket) / num_buckets;
-  let buckets = new Set();
-  for (let bucket = min_bucket; bucket < max_bucket; bucket += bucket_size) {
-    buckets.add(Math.pow(2, bucket).toFixed(2));
-  }
-  return Array.from(buckets);
-''';
+RETURNS ARRAY AS (
+  IF(
+    min_bucket >= max_bucket,
+    [],
+    ARRAY(
+      SELECT
+        ROUND(POW(2, (max_bucket - min_bucket) / num_buckets * val), 2)
+      FROM
+        UNNEST(GENERATE_ARRAY(0, num_buckets - 1)) AS val
+    )
+  )
+);
 
 SELECT
   assert.array_equals([1, 2, 4, 8], glam.histogram_generate_scalar_buckets(0, LOG(16, 2), 4)),
   assert.array_equals(
     [1, 1.9, 3.62, 6.9, 13.13],
     glam.histogram_generate_scalar_buckets(0, LOG(25, 2), 5)
-  )
+  ),
+  assert.array_equals([], glam.histogram_generate_scalar_buckets(10, 10, 100))
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/mozfun/glam/percentile/udf.sql /tmp/workspace/generated-sql/sql/mozfun/glam/percentile/udf.sql
--- /tmp/workspace/main-generated-sql/sql/mozfun/glam/percentile/udf.sql 2024-06-07 13:12:26.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/mozfun/glam/percentile/udf.sql 2024-06-07 13:12:35.000000000 +0000
@@ -1,37 +1,41 @@
 -- udf_js.glean_percentile
 CREATE OR REPLACE FUNCTION glam.percentile(
-  percentile FLOAT64,
+  pct FLOAT64,
   histogram ARRAY>,
   type STRING
 )
-RETURNS FLOAT64 DETERMINISTIC
-LANGUAGE js
-AS
-  '''
-  if (percentile < 0 || percentile > 100) {
-    throw "percentile must be a value between 0 and 100";
-  }
-
-  let values = histogram.map(bucket => bucket.value);
-  let total = values.reduce((a, b) => a + b);
-  let normalized = values.map(value => value / total);
-
-  // Find the index into the cumulative distribution function that corresponds
-  // to the percentile. This undershoots the true value of the percentile.
-  let acc = 0;
-  let index = null;
-  for (let i = 0; i < normalized.length; i++) {
-    acc += normalized[i];
-    index = i;
-    if (acc >= percentile / 100) {
-      break;
-    }
-  }
-
-  // NOTE: we do not perform geometric or linear interpolation, but this would
-  // be the place to implement it.
-  return histogram[index].key;
-''';
+RETURNS FLOAT64 AS (
+  (
+    WITH check AS (
+      SELECT
+        IF(
+          pct >= 0
+          AND pct <= 100,
+          TRUE,
+          ERROR('percentile must be a value between 0 and 100')
+        ) pct_ok
+    ),
+    keyed_cum_sum AS (
+      SELECT
+        key,
+        SUM(value) OVER (ORDER BY CAST(key AS FLOAT64)) / SUM(value) OVER () AS cum_sum
+      FROM
+        UNNEST(histogram)
+    )
+    SELECT
+      CAST(key AS FLOAT64)
+    FROM
+      keyed_cum_sum,
+      check
+    WHERE
+      check.pct_ok
+      AND cum_sum >= pct / 100
+    ORDER BY
+      cum_sum
+    LIMIT
+      1
+  )
+);
 
 SELECT
   assert.equals(
@@ -41,6 +45,30 @@
       ARRAY>[("0", 1), ("2", 2), ("3", 1)],
       "timing_distribution"
     )
+  ),
+  assert.equals(
+    3,
+    glam.percentile(
+      100.0,
+      ARRAY>[("0", 1), ("2", 2), ("3", 1)],
+      "timing_distribution"
+    )
+  ),
+  assert.equals(
+    0,
+    glam.percentile(
+      0.0,
+      ARRAY>[("0", 1), ("2", 2), ("3", 1)],
+      "timing_distribution"
+    )
+  ),
+  assert.equals(
+    2,
+    glam.percentile(
+      2.0,
+      ARRAY>[("0", 1), ("2", 2), ("10", 10), ("11", 100)],
+      "timing_distribution"
+    )
   );
 
 #xfail
```
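
The comment added to `glam.histogram_cast_json` says bucket order matters for UDFs such as `glam.percentile`, and both rewrites order by `CAST(key AS FLOAT64)`. The sketch below is a hypothetical JS model of the cumulative-sum lookup (the function and comparator names are made up for illustration). It shows how the last percentile test case would come out if the string keys were sorted as text instead of as numbers.

```javascript
// Hypothetical JS model of the cumulative-sum lookup: sort the buckets with
// the given key comparator, then return the first key whose cumulative share
// of the total reaches pct / 100.
function percentileKey(pct, histogram, compareKeys) {
  const sorted = [...histogram].sort((a, b) => compareKeys(a.key, b.key));
  const total = sorted.reduce((sum, bucket) => sum + bucket.value, 0);
  let acc = 0;
  for (const bucket of sorted) {
    acc += bucket.value;
    if (acc / total >= pct / 100) {
      return Number(bucket.key);
    }
  }
  return null; // only reached for an empty histogram
}

// Compare keys as numbers, as CAST(key AS FLOAT64) does in the SQL.
const numericOrder = (a, b) => Number(a) - Number(b);
// Compare keys as plain strings, which puts "10" and "11" before "2".
const stringOrder = (a, b) => (a < b ? -1 : a > b ? 1 : 0);

// Histogram from the last glam.percentile test case in the diff.
const histogram = [
  { key: "0", value: 1 },
  { key: "2", value: 2 },
  { key: "10", value: 10 },
  { key: "11", value: 100 },
];

console.log(percentileKey(2.0, histogram, numericOrder)); // 2
console.log(percentileKey(2.0, histogram, stringOrder)); // 10
```

With numeric ordering the cumulative share first reaches 2% at key 2 (3 of 113), which matches the expected value in the test. With text ordering key 10 is visited second and already pushes the share to 11 of 113, so the lookup stops too early.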

Link to full diff
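
As a rough illustration of how the loop in the old `histogram_generate_functional_buckets` maps onto the set-based rewrite, here is a sketch in JavaScript. `functionalBucketsLoop` wraps the original UDF body from the diff. `functionalBucketsSetBased` is a hypothetical JS model of the new query's steps (generate indexes, filter by `range_max`, de-duplicate, sort, prepend 0) and is an assumption about how those steps translate, not the SQL itself.

```javascript
// Loop from the original JS UDF, wrapped in a function with camelCase parameters.
function functionalBucketsLoop(logBase, bucketsPerMagnitude, rangeMax) {
  const sampleToBucketIndex = sample => {
    const exponent = Math.pow(logBase, 1.0 / bucketsPerMagnitude);
    return Math.ceil(Math.log(sample + 1) / Math.log(exponent));
  };
  const buckets = new Set([0]);
  for (let index = 0; index < sampleToBucketIndex(rangeMax); index++) {
    const bucket = Math.floor(Math.pow(logBase, index / bucketsPerMagnitude));
    // The index bound overshoots, so stop once the bucket exceeds the range.
    if (bucket > rangeMax) {
      break;
    }
    buckets.add(bucket);
  }
  return [...buckets];
}

// Hypothetical model of the set-based version: build every candidate index up
// front, keep buckets within range, de-duplicate, sort, and prepend 0.
function functionalBucketsSetBased(logBase, bucketsPerMagnitude, rangeMax) {
  const maxIndex = Math.ceil(
    (Math.log(rangeMax + 1) / Math.log(logBase)) * bucketsPerMagnitude
  );
  const indexes = Array.from({ length: maxIndex + 1 }, (_, i) => i);
  const kept = indexes
    .map(idx => Math.floor(Math.pow(logBase, idx / bucketsPerMagnitude)))
    .filter(bucket => bucket <= rangeMax);
  const distinctSorted = [...new Set(kept)].sort((a, b) => a - b);
  return [0, ...distinctSorted];
}

console.log(functionalBucketsLoop(2, 2, 16)); // [0, 1, 2, 4, 5, 8, 11, 16]
console.log(functionalBucketsSetBased(2, 2, 16)); // [0, 1, 2, 4, 5, 8, 11, 16]
```

The set-based form trades the early `break` for a filter, so it evaluates at most a handful of extra indexes past `rangeMax` and then discards them.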

dataops-ci-bot commented 2 weeks ago

Integration report for "Merge branch 'main' into glam-js-to-sql"

sql.diff


Link to full diff