mozilla-services / mozilla-pipeline-schemas

Schemas for Mozilla's data ingestion pipeline and data lake outputs
https://protosaur.dev/mps-deploys/

Schema errors for untrustedModules #226

Closed whd closed 5 years ago

whd commented 5 years ago

/CC @CarlCorcoran @fbertsch

Now that we're properly applying schemas to the untrustedModules ping, it looks like the pings aren't conforming in about 14% of cases:

Ingestion Data for the Last Hour
================================
percent_error    : 14.2613
max_percent_error: 1

graph: https://pipeline-cep.prod.mozaws.net/dashboard_output/graphs/analysis.moz_telemetry_doctype_monitor_untrustedModules.ingestion_error.html

Diagnostic (count/error)
========================
31      schema: untrustedModules version: 4 validation error: SchemaURI: #/properties/payload/properties/combinedStacks/properties/stacks/items/items/items/0 Keyword: minimum DocumentURI: #/payload/combinedStacks/stacks/0/2/0
31      schema: untrustedModules version: 4 validation error: SchemaURI: #/properties/payload/properties/combinedStacks/properties/stacks/items/items/items/0 Keyword: minimum DocumentURI: #/payload/combinedStacks/stacks/0/1/0
26      schema: untrustedModules version: 4 validation error: SchemaURI: #/properties/payload/properties/combinedStacks/properties/stacks/items/items/items/0 Keyword: minimum DocumentURI: #/payload/combinedStacks/stacks/2/19/0
[...]

Since DecodeErrorDetail isn't available via the email, here's an example:

{"minimum":{"actual":-1,"expected":0,"instanceRef":"#/payload/combinedStacks/stacks/2/19/0","schemaRef":"#/properties/payload/properties/combinedStacks/properties/stacks/items/items/items/0"}}

Because it's somewhat difficult to pull this from an actual error message (it requires gzip), here's a potential stacks value containing an entry at combinedStacks/properties/stacks/items/items/items/0 that looks like it would cause the error:

[[[0,21035],[1,13385],[-1,18446744073709552000]],[]]

(the -1 is less than 0).
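
To make the failure concrete, here is a minimal Python sketch of the check the validator is effectively performing on this value. It is not the pipeline's actual validator: the function name `find_minimum_violations` and the constant `MIN_MODULE_INDEX` are made up for illustration, and the path format simply mirrors the DocumentURI strings in the report above.

```python
from typing import Iterator

# Assumption: the minimum enforced on the first element of each frame,
# taken from the "expected": 0 field in the error detail above.
MIN_MODULE_INDEX = 0


def find_minimum_violations(stacks: list, minimum: int = MIN_MODULE_INDEX) -> Iterator[str]:
    """Yield a DocumentURI-style path for each frame whose module index is below `minimum`."""
    for stack_idx, stack in enumerate(stacks):
        for frame_idx, frame in enumerate(stack):
            module_index = frame[0]  # first element of the [module index, offset] pair
            if module_index < minimum:
                yield f"#/payload/combinedStacks/stacks/{stack_idx}/{frame_idx}/0"


# The example value quoted above.
stacks = [[[0, 21035], [1, 13385], [-1, 18446744073709552000]], []]

violations = list(find_minimum_violations(stacks))
print(violations)  # ['#/payload/combinedStacks/stacks/0/2/0']
```

Calling the same function with `minimum=-1` yields no violations for this value.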

Judging from the description of this value, it doesn't look like it should ever be -1 (unless it's an index counting backwards), which would mean the schema is correct, but I don't have much context here. I've decided not to roll back the schema changes, but we may need to reprocess some pings once this is sorted out.

whd commented 5 years ago

In case it's more helpful:

"payload": {
    "combinedStacks": {
        "memoryMap": [
            [
                "mozglue.pdb",
                "AA23EBDA96F2433BFAB2CB85597C378F1"
            ],
            [
                "mbae64.pdb",
                "5DA02C492F9F412984B4384EF5B3CFDC1"
            ]
        ],
        "stacks": [
            [
                [
                    0,
                    21035
                ],
                [
                    1,
                    13385
                ],
                [
                    -1,
                    18446744073709552000
                ]
            ],
            []
        ]
    },

 ...
 }

CarlCorcoran commented 5 years ago

Aha, yes: the module index can indeed be -1, indicating that the address doesn't land in any known module. The schema should be corrected to allow a minimum of -1 for this. Shall I do this via a pull request?

https://github.com/mozilla-services/mozilla-pipeline-schemas/blob/dev/templates/telemetry/untrustedModules/untrustedModules.4.schema.json#L50
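
As a rough illustration of what the -1 index means for anyone consuming these pings, here is a small Python sketch that resolves each frame's module index against memoryMap and treats -1 as "no known module". The helper name `module_name` and the constant `UNKNOWN_MODULE` are hypothetical, and the sketch assumes each frame is a [module index, offset] pair as in the example payload earlier in this thread.

```python
from typing import Optional

# Assumption for illustration: -1 is the sentinel for an address that
# does not land in any known module, as described in the comment above.
UNKNOWN_MODULE = -1


def module_name(memory_map: list, module_index: int) -> Optional[str]:
    """Return the module's name from memoryMap, or None for the -1 sentinel."""
    if module_index == UNKNOWN_MODULE:
        return None
    return memory_map[module_index][0]  # each entry is [name, debug id]


# Values copied from the example payload earlier in the thread.
memory_map = [
    ["mozglue.pdb", "AA23EBDA96F2433BFAB2CB85597C378F1"],
    ["mbae64.pdb", "5DA02C492F9F412984B4384EF5B3CFDC1"],
]
stack = [[0, 21035], [1, 13385], [-1, 18446744073709552000]]

resolved = [module_name(memory_map, index) for index, _offset in stack]
print(resolved)  # ['mozglue.pdb', 'mbae64.pdb', None]
```

Handling the sentinel explicitly matters in Python in particular, because `memory_map[-1]` would otherwise silently return the last module instead of failing.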

whd commented 5 years ago

Yes, file a PR and I'll deploy it today/tomorrow (even though technically there is a change freeze starting today).

CarlCorcoran commented 5 years ago

PR sent: https://github.com/mozilla-services/mozilla-pipeline-schemas/pull/227

whd commented 5 years ago

#227 did the trick: https://pipeline-cep.prod.mozaws.net/dashboard_output/graphs/analysis.moz_telemetry_doctype_monitor_untrustedModules.ingestion_error.html