ConsumerDataStandardsAustralia / standards-maintenance

This repository houses the interactions, consultations and work management to support the maintenance of baselined components of the Consumer Data Right API Standards and Information Security profile.
Get Metrics V5 error metrics documentation #655

Open nils-work opened 2 months ago

nils-work commented 2 months ago

Description

To make the requirements clear, the field descriptions in the ErrorMetricsV2 schema could state that errors are to be reported against each respective error code in the 4xx and 5xx series, in place of the example additionalProperties, property1 and property2 fields that currently appear.

Intention and Value of Change

Ensure compliant Get Metrics responses are provided to allow detailed analysis of ecosystem performance.

Area Affected

Get Metrics endpoint > ErrorMetricsV2 schema

Change Proposed

Make the following changes to the documentation only, to provide clarity of the existing requirement. No changes to the endpoint version or structure are proposed.

Name: »» additionalProperties
Existing description: Number of errors for a specific HTTP error code. Note that the property name must be 3 digits represent the HTTP error code the error is for
Proposed description: This is a placeholder field to be substituted with each respective HTTP error code in the 4xx and 5xx range recorded by the Data Holder. It is represented by property1 and property2 in the Non-normative Examples section. Note that the property name MUST be the three-digit HTTP error code as per the adjacent 500 example. All possible property names have not been defined as the range is expected to vary across participants. Examples would include, but are not limited to: 400, 401, 403, 404, 405, 406, 415, 422, 429, 500, 503, 504.

Name: »» 500
Existing description: Number of errors for HTTP error code 500. Note that this field is an example of a single entry due to the lack of OAS support [for the] JSON Schema patternProperties syntax. See the additionalProperties field in this schema for the generic property structure for error code counts
Proposed description: Reflecting the description provided in the adjacent additionalProperties field, this is an example demonstrating the structure for reporting the number of calls resulting in HTTP error code 500. Each error code recorded by the Data Holder in the 4xx and 5xx range MUST be provided in this format against the respective property name.
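As an illustration of the naming rule described above, the following is a minimal sketch of a check that every property name in an error count object is a three-digit code in the 4xx or 5xx range. The function name, the regular expression and the sample payload are hypothetical and are not part of the standards.

```python
import re

# Assumption: a valid property name is exactly three digits, starting with 4 or 5.
ERROR_CODE_KEY = re.compile(r"[45][0-9]{2}")


def invalid_error_keys(error_counts):
    """Return the property names that are not three-digit 4xx/5xx codes."""
    return [key for key in error_counts if not ERROR_CODE_KEY.fullmatch(str(key))]


# Hypothetical payload fragment: error counts keyed by HTTP error code.
sample = {"400": 12, "404": 3, "500": 1, "property1": 0, "200": 7}
print(invalid_error_keys(sample))
```

A placeholder name such as property1, or a code outside the 4xx and 5xx range, would be flagged by this check.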

perlboy commented 2 months ago

The published OpenAPI specification does not specify these additional error codes, and the reason provided isn't justification (all the codes can be listed and specified as optional). On this basis, this is not a non-breaking change; it will require an update to the API specification and an associated FDO.

cuctran-greatsouthernbank commented 4 days ago

Hi @nils-work, when we implemented ErrorMetricsV2, our interpretation was that we were required to report server-side errors only. This is stated in the description of the aggregate error metric and, by extension, we took it to apply to the unauthenticated and authenticated error metrics as well.

If 4xx errors are required to be reported, then I suggest we also change the description of the Aggregate error metric. That would become a breaking change for GSB.

nils-work commented 4 days ago

Hi @cuctran-greatsouthernbank

The aggregate property is a continuation of the error reporting that was available prior to Get Metrics v4, which was expected to capture server-side (5xx) codes only. The earlier structure was retained as aggregate in v4 to facilitate the reporting transition for the ACCC. It is unrelated to the change proposed in this issue and will remain unaffected.

The breakdown in v4 by unauthenticated and authenticated, and per code (in the 4xx and 5xx range), was introduced to provide greater insight.
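To make the distinction concrete, here is a rough sketch of how the two views could be derived from the same set of responses: a single aggregate number covering 5xx codes only, and a per-code count covering the 4xx and 5xx range. The function name and sample data are hypothetical and only illustrate the difference described above.

```python
from collections import Counter


def summarise_errors(status_codes):
    """Split observed HTTP status codes into the two views discussed above."""
    # Per-code breakdown: every 4xx and 5xx response, keyed by its three-digit code.
    per_code = Counter(str(code) for code in status_codes if 400 <= code <= 599)
    # Aggregate: server-side (5xx) errors only, reported as a single number.
    aggregate = sum(count for code, count in per_code.items() if code.startswith("5"))
    return aggregate, dict(per_code)


# Hypothetical set of responses; 200s are ignored by both views.
aggregate, per_code = summarise_errors([200, 404, 404, 422, 500, 503, 200])
print(aggregate)  # 2
print(per_code)   # {'404': 2, '422': 1, '500': 1, '503': 1}
```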

The detail from the Decision that introduced these fields in v4 stated:

  • The ErrorMetrics model will be changed from a number to an object with the following:
    • Two fields containing objects named authenticated and unauthenticated to separate the errors for authenticated vs unauthenticated APIs
    • Each of these objects will contain objects per period containing a series of fields with the label of each field being a HTTP Status Code (e.g. 422, 500, etc) and the value being a number indicating the number of errors for the period
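The structure set out in the Decision can be sketched as follows. This is an illustrative sketch only: the function name, the input record format and the "currentDay" period label are assumptions made for the example, and the published schema should be consulted for the actual field names.

```python
from collections import defaultdict


def build_error_metrics(records, period="currentDay"):
    """Group error responses by API tier, period and HTTP status code.

    `records` is an iterable of (api_is_authenticated, status_code) pairs.
    The `period` label is an assumption made for illustration.
    """
    metrics = {
        "authenticated": {period: defaultdict(int)},
        "unauthenticated": {period: defaultdict(int)},
    }
    for api_is_authenticated, status_code in records:
        if not 400 <= status_code <= 599:
            continue  # only 4xx and 5xx responses are counted as errors
        tier = "authenticated" if api_is_authenticated else "unauthenticated"
        metrics[tier][period][str(status_code)] += 1
    # Convert the defaultdicts to plain dicts so the result serialises cleanly.
    return {
        tier: {label: dict(counts) for label, counts in periods.items()}
        for tier, periods in metrics.items()
    }


# Hypothetical responses: (API requires authentication, HTTP status code).
records = [(True, 422), (True, 500), (False, 404), (True, 422), (False, 200)]
print(build_error_metrics(records))
```

Each status code becomes a field label and its value is the number of errors for the period, matching the shape the Decision describes.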

Guidance on error reporting, including some of this detail, is available in the article Errors.