mozilla / gcp-ingestion

Documentation and implementation of telemetry ingestion on Google Cloud Platform
https://mozilla.github.io/gcp-ingestion/
Mozilla Public License 2.0

DENG-1352 - Add support for parsing FoG pings in contextual services reporter #2466

Closed relud closed 11 months ago

relud commented 1 year ago

For verification, three batch jobs were run against a sample of prod data for a full day with reportingEnabled=false and logReportingUrls=true. The first was without this code change and only read contextual_services messages. The second included this code change and also only read contextual_services messages. The third included this code change, read contextual_services and firefox_desktop messages, and had limitLegacyDesktopVersion=true.

By comparing the first two jobs, I verified that this change will have no impact until the new doctypes are sent to it and we update the configuration. This means that this PR is safe to merge and that publishing the new doctypes to the pubsub topic is safe. Once both of those are done, we can deploy the configuration changes to switch to the new logic.

By comparing the number of SendRequest "errors" in the second and third jobs and extracting aggregate impression counts, I was able to validate that the impact of this change aligns with our expectation of a slight increase (~1.3%) due to better delivery guarantees in firefox_desktop than in contextual_services.

Other common errors changed by ~1.3% or less, which is within expected limits. For less common errors: RejectedMessageException: Firefox version does not match doctype went away as expected, because that is the intended impact of limitLegacyDesktopVersion=true. InvalidUrlException: Could not parse reporting URL saw a 28% increase from ~9k to ~11.5k, which is large in relative terms but small overall.
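The relative changes quoted above are plain arithmetic on the approximate counts. A minimal sketch, assuming only the rounded figures given in this comment (~9k and ~11.5k); the class and method names are illustrative and not part of the pipeline:

```java
// Illustrative helper for checking the quoted percentages; not part of gcp-ingestion.
final class RelativeChange {
  private RelativeChange() {}

  /** Returns the percent change from a baseline count to a new count. */
  static double percentChange(double before, double after) {
    if (before == 0) {
      // A zero baseline has no meaningful relative change.
      throw new IllegalArgumentException("baseline must be non-zero");
    }
    return (after - before) / before * 100.0;
  }
}
```

With the rounded InvalidUrlException counts, `RelativeChange.percentChange(9_000, 11_500)` evaluates to about 27.8, which is consistent with the ~28% figure.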

One noteworthy change from validation is that RejectedMessageException: Matches heuristic from CONSVC-1764 now applies across firefox_desktop.top_sites, where before it only applied to contextual_services.topsites_click, and this resulted in a tenfold increase in occurrence (from ~2k to ~20k). Inspecting this more closely, I found that the number of impacted top sites clicks increased by 1.4%, while the rest is attributed to newly rejecting top sites impressions.

The other noteworthy change from validation is that I saw significant performance degradation from restricting user_agent_version in VerifyMetadata, sometimes causing pipeline failures. As such, I moved this check into FilterByDocType, where the messages can be dropped entirely instead of being sent to the error output, which resolved the issue.
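The difference between the two placements can be sketched in plain Java. This is a simplified illustration under stated assumptions, not the code in this PR: the real transforms operate on Beam collections, and the `Message` model, the `versionMismatch` rule, and the cutoff value below are all hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, in-memory stand-in for the pipeline; not the actual Beam transforms.
final class VersionFilterSketch {
  private VersionFilterSketch() {}

  /** Minimal message model: namespace, document type, and browser major version. */
  record Message(String namespace, String docType, int userAgentVersion) {}

  /** Output of the verify-style approach: passing messages plus one error per rejection. */
  record VerifyResult(List<Message> passed, List<String> errors) {}

  // Arbitrary cutoff for illustration only; the real value is not stated in this discussion.
  static final int HYPOTHETICAL_CUTOFF = 100;

  /** Assumed rule: legacy contextual_services messages from newer browsers are unwanted. */
  static boolean versionMismatch(Message m) {
    return m.namespace().equals("contextual_services")
        && m.userAgentVersion() >= HYPOTHETICAL_CUTOFF;
  }

  /** Verify-style: every rejected message produces a record on the error output. */
  static VerifyResult verifyStyle(List<Message> input) {
    List<Message> passed = new ArrayList<>();
    List<String> errors = new ArrayList<>();
    for (Message m : input) {
      if (versionMismatch(m)) {
        errors.add("RejectedMessageException: Firefox version does not match doctype");
      } else {
        passed.add(m);
      }
    }
    return new VerifyResult(passed, errors);
  }

  /** Filter-style: rejected messages are dropped and nothing is written for them. */
  static List<Message> filterStyle(List<Message> input) {
    return input.stream().filter(m -> !versionMismatch(m)).toList();
  }
}
```

Both approaches pass the same messages downstream; the filter-style version simply produces no error record for each rejected message, which matches the "dropped entirely" behavior described above.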

codecov-commenter commented 1 year ago

Codecov Report

Attention: 21 lines in your changes are missing coverage. Please review.

Comparison is base (7dabc47) 85.41% compared to head (a948bee) 85.24%.

Additional details and impacted files

```diff
@@             Coverage Diff              @@
##               main    #2466      +/-   ##
============================================
- Coverage     85.41%   85.24%   -0.17%
+ Complexity      907      906       -1
============================================
  Files           123      123
  Lines          5204     5239      +35
  Branches        521      531      +10
============================================
+ Hits           4445     4466      +21
- Misses          593      599       +6
- Partials        166      174       +8
```

| [Flag](https://app.codecov.io/gh/mozilla/gcp-ingestion/pull/2466/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mozilla) | Coverage Δ | |
|---|---|---|
| [ingestion_beam](https://app.codecov.io/gh/mozilla/gcp-ingestion/pull/2466/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mozilla) | `82.72% <76.40%> (-0.20%)` | :arrow_down: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mozilla#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Files](https://app.codecov.io/gh/mozilla/gcp-ingestion/pull/2466?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mozilla) | Coverage Δ | |
|---|---|---|
| [...ualservices/ContextualServicesReporterOptions.java](https://app.codecov.io/gh/mozilla/gcp-ingestion/pull/2466?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mozilla#diff-aW5nZXN0aW9uLWJlYW0vc3JjL21haW4vamF2YS9jb20vbW96aWxsYS90ZWxlbWV0cnkvY29udGV4dHVhbHNlcnZpY2VzL0NvbnRleHR1YWxTZXJ2aWNlc1JlcG9ydGVyT3B0aW9ucy5qYXZh) | `0.00% <ø> (ø)` | |
| [.../mozilla/telemetry/ContextualServicesReporter.java](https://app.codecov.io/gh/mozilla/gcp-ingestion/pull/2466?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mozilla#diff-aW5nZXN0aW9uLWJlYW0vc3JjL21haW4vamF2YS9jb20vbW96aWxsYS90ZWxlbWV0cnkvQ29udGV4dHVhbFNlcnZpY2VzUmVwb3J0ZXIuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...a/telemetry/contextualservices/VerifyMetadata.java](https://app.codecov.io/gh/mozilla/gcp-ingestion/pull/2466?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mozilla#diff-aW5nZXN0aW9uLWJlYW0vc3JjL21haW4vamF2YS9jb20vbW96aWxsYS90ZWxlbWV0cnkvY29udGV4dHVhbHNlcnZpY2VzL1ZlcmlmeU1ldGFkYXRhLmphdmE=) | `90.90% <40.00%> (+3.95%)` | :arrow_up: |
| [...elemetry/contextualservices/ParseReportingUrl.java](https://app.codecov.io/gh/mozilla/gcp-ingestion/pull/2466?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mozilla#diff-aW5nZXN0aW9uLWJlYW0vc3JjL21haW4vamF2YS9jb20vbW96aWxsYS90ZWxlbWV0cnkvY29udGV4dHVhbHNlcnZpY2VzL1BhcnNlUmVwb3J0aW5nVXJsLmphdmE=) | `80.24% <91.66%> (+0.78%)` | :arrow_up: |
| [.../telemetry/contextualservices/FilterByDocType.java](https://app.codecov.io/gh/mozilla/gcp-ingestion/pull/2466?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mozilla#diff-aW5nZXN0aW9uLWJlYW0vc3JjL21haW4vamF2YS9jb20vbW96aWxsYS90ZWxlbWV0cnkvY29udGV4dHVhbHNlcnZpY2VzL0ZpbHRlckJ5RG9jVHlwZS5qYXZh) | `68.00% <64.70%> (-13.82%)` | :arrow_down: |

... and [1 file with indirect coverage changes](https://app.codecov.io/gh/mozilla/gcp-ingestion/pull/2466/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mozilla)


quiiver commented 1 year ago

flagging @chelseatroy and @ncloudioj for additional 👀

relud commented 11 months ago

@whd fyi: merging this now because validation testing showed that it has no impact without config changes. i will be filing a dsre ticket to request the config changes before the change freeze, in sync with merging a related bigquery-etl pr and a communication to downstream consumers that internal changes are occurring.

whd commented 11 months ago

> merging this now because validation testing showed that it has no impact without config changes

This has been deployed to prod in case you want to triple check it's a no-op without config changes.