belgif / rest-guide

REST Guidelines of Belgian government institutions
https://www.belgif.be/specification/rest/api-guide/
Apache License 2.0
24 stars 3 forks source link

Standardization of cross-parameter validation issue types #113

Open jpraet opened 1 year ago

jpraet commented 1 year ago

Standardize some general validation issues for constraints that cannot be enforced through schema validation (related to #108):

pvdbosch commented 1 year ago

Match for id between PUT path and body is normally a programmatic error in client. Structure like this should suffice:

PUT /address/456

{ "id": "123", }

What if id from path param is different from body?

{
  "type": "urn:problem-type:belgif:badRequest",
  "href": "https://www.belgif.be/specification/rest/api-guide/problems/badRequest.html",
  "status": 400,
  "title": "Bad Request",
  "detail": "The input message is incorrect",
  "issues": [
    {
      "type": "urn:problem-type:belgif:input-validation:shouldMatchPathParameter",
      "detail": "id in body should match value 456 of path parameter",
      "in": "body",
      "name": "id",
      "value": "123",
    }
 ]
}
pvdbosch commented 1 year ago

mutuallyExclusiveParameters use case: when a search is possible on either a ("alternate" non-primary) ID or a name, but not on both at the same time.

GET /employers?enterpriseNumber=1234&name=Smals

{
  "type": "urn:problem-type:belgif:badRequest",
  "href": "https://www.belgif.be/specification/rest/api-guide/problems/badRequest.html",
  "status": 400,
  "title": "Bad Request",
  "detail": "The input message is incorrect",
  "issues": [
    {
      "type": "urn:problem-type:belgif:input-validation:mutuallyExclusiveParameters",
      "detail": "Search allowed on either companyId or name, but not both",
      "in": "query",
      "name": "name",
      "value": "Smals"
    }
 ]
}

alternatives: 1) just include one of the params as conflicting, and explain in detail description (as above) 2) include two issues, one for each param (but duplicates one actual issue to two issues then) 3) add "relatedTo" property with parameter array to InputValidationIssue model, like below

    {
      "type": "urn:problem-type:belgif:input-validation:mutuallyExclusiveParameters",
      "detail": "Search allowed on either companyId or name, but not both",
      "in": "query",
      "name": "name",
      "value": "Smals",
      "relatedTo": ["in": "query", "name": "companyId", "value": "1234" ] 
    }

Program logic has to (arbitrarily?) decide which one is "main" parameter or "relatedTo".

pvdbosch commented 1 year ago

use case incompleteParameterGroup: when searching on alternate id, both query params alternateIdType and alternateIdValue need to be present. other use case: if searching a person on house number, then the street and municipality is also required

GET /organizations?alternateIdType=companyId

{
  "type": "urn:problem-type:belgif:badRequest",
  "href": "https://www.belgif.be/specification/rest/api-guide/problems/badRequest.html",
  "status": 400,
  "title": "Bad Request",
  "detail": "The input message is incorrect",
  "issues": [
    {
      "type": "urn:problem-type:belgif:input-validation:incompleteParameterGroup",
      "detail": "When alternateIdType is used, alternateIdValue query param should be present as well",
      "in": "query",
      "name": "alternateIdType"
    }
 ]
}

1) put everything only in detail description, refer to only one of parameters in issue structure 2) Add additional property "relatedTo": ["in": "query", "name": "alternateIdValue"] ; logic has to decide which one is in "main" param to put in issue, and which one in relatedTo 3) serialize an object into query parameters, but this has some limitations (TODO: research query serialization limitations - related to #56)

pvdbosch commented 1 year ago

@wsalembi questioned practice of using key/value properties during this discussion; I opened #124 to add this discussion to backlog.

pvdbosch commented 1 year ago

@jpraet , I changed the issue opener, removing some parts I split off to new separate issues:

pvdbosch commented 1 year ago

Discussed on WG:

If there are many cases to standardize, use a common prefix like urn:problem-type:belgif:input-validation:invalidCombination: or urn:problem-type:belgif:input-validation:dependencyViolation:.

The name "parameter" in OpenAPI isn't used for fields within the body part, so its use in the name is not appropriate. In Swagger/OpenAPI 2.0, the entire payload body was considered a single parameter. Best to rename recently added "unknownParameter" validation issue then.

Use cases and naming could be inspired on https://medium.com/isa-group/inter-parameter-dependencies-in-rest-apis-4664e901c124. Alternatively, some issue types might be reformulated,

Naming ideas:

1) mutuallyExclusiveParameters (when a choice should be made between mutually exclusive parameters) "zeroOrOne" description: "Maximum one of following fields should be present: companyId, name"

2) exactlyOne (exactlyOne is clearer than onlyOne) description: "Exactly one of following fields should be present: companyId, name"

3) incompleteParameterGroup or "allOrNone"

4) mismatchedParameters (when the value of 2 parameters should match, e.g. resource id in path and body for a full update) - or "shouldBeEqual"

5) missingRequiredParameter or atLeastOne

TODO: is it possible to reduce the number of above restrictions by replacing them by a combination of separate ones? TODO: research in latest JSON schema standard what's the kind of validations supported in latest version, and try to align with it.

jpraet commented 10 months ago

Has there been any progress on this? I recall from last WG meeting that @wsalembi and @pvdbosch were going to do some further analysis and come up with a proposal. Would be nice to see a draft of this that we could review in preparation for next WG.

pvdbosch commented 10 months ago

no progress yet, I'll keep you posted. I expect to work on it +/- two weeks before the next WG.

pvdbosch commented 10 months ago

Overview of relevant JSON Schema keywords:

If we'd model the above use cases using these, we'd get following below. Of course, these keywords can't actually be used across multiple query parameters and a body parameter.

  1. zeroOrOne

    oneOf:
    - required: [propertyOne]
    - required: [propertyTwo]
    - required: [propertyThree]
    - not:
      anyOf: [propertyOne, propertyTwo, propertyThree]
  2. exactlyOne

oneOf:
- required: [propertyOne]
- required: [propertyTwo]
  1. allOrNone
oneOf:
- required: [propertyOne, propertyTwo, propertyThree]
- not:
      anyOf: [propertyOne, propertyTwo, propertyThree]
  1. shouldBeEqual -> don't see a way to do this

  2. atLeastOne

anyOf:
- required: [propertyOne]
- required: [propertyTwo]

Next, I'll try to propose issue types, aligned with this where possible.

update: fixed some errors: not: required -> not: anyOf

pvdbosch commented 10 months ago

Proposal:

  1. urn:problem-type:belgif:input-validation:zeroOrOneOfRequired
  2. urn:problem-type:belgif:input-validation:exactlyOneOfRequired (exactlyOneOf is clearer to me than just 'oneOf')
  3. urn:problem-type:belgif:input-validation:allOfOrNoneRequired
  4. urn:problem-type:belgif:input-validation:mustBeEqual
  5. urn:problem-type:belgif:input-validation:anyOfRequired

update: edited based on @jpraet 's feedback

While theoretically you could express "exactlyOneOfRequired" as applying both "zeroOrOneRequired" and "anyOfRequired", it seems more expressive to me, or would be confusing or renaming would be inconsistent with JSON schema naming:

jpraet commented 10 months ago

The *Required suffix is a bit confusing to me because, as I understand it, it means something else here than "required" in openapi spec? e.g. by zeroOrOneRequired you mean there must be zero or one of a given set of inputs, correct? but then it is inconsistent for allOrNone as I would expect that one to be called allOrNoneRequired.

Could we drop the *Required suffix?

And zerorOrOne also seems inconsistent with exactlyOneOf? Either zeroOrOne should be renamed to zeroOrOneOf, or exactlyOneOf to exactlyOne?

There's also something that bothers me slightly and that's that existing issue types like urn:problem-type:belgif:input-validation:schemaViolation and urn:problem-type:belgif:input-validation:unknownParameter and proposals in https://github.com/belgif/rest-guide/issues/126 are named after "what went wrong", whereas these are named after "what was expected". It's a subtle difference, I don't know if I'm making myself clear? e.g. for urn:problem-type:belgif:input-validation:mustBeEqual, a naming style that would be more consistent with the preexisting issue types is urn:problem-type:belgif:input-validation:notEqual or urn:problem-type:belgif:input-validation:mismatch. But I think such naming style is difficult for the other issue types in this proposal.

pvdbosch commented 10 months ago

but then it is inconsistent for allOrNone as I would expect that one to be called allOrNoneRequired.

existing issue types [and problem types] are named after "what went wrong", whereas these are named after "what was expected". But I think such naming style is difficult for the other issue types in this proposal.

I forgot to add "Required" for allOrNone, added it now.

Still TBD if "Required" suffix is necessary. I actually added it to avoid misinterpretation that the issue type is "what was expected". For the existing problem types, it's quite clear that they are to be interpreted as "what went wrong".

So I'd suggest to either keep the suffix or try to come up with a unambiguous "what went wrong" naming style.

Either zeroOrOne should be renamed to zeroOrOneOf, or exactlyOneOf to exactlyOne?

Right. Renamed it to "zeroOrOneOf". TBD if "Of" suffix is necessary (alignment with OpenAPI)?

pvdbosch commented 10 months ago

A half attempt at alternative "what went wrong" naming style:

pvdbosch commented 9 months ago

suffix: Expected instead of Required is preferred

| zeroOrOneOfExpected | moreThanOne/tooMany/mutuallyExclusive | | anyOfExpected | missingOneOf | | exactlyOneOfExpected | missingOneOf or moreThanOne/tooMany/mutuallyExclusive (dependent on input) | | zeroOrAllOfExpected | incompleteGroup | | expectedEqual | mismatch |

@JDMKSZ : preference for combined exactlyOneOfExpected (or similar) instead of split into two; easier for client and server logic.

@VirginieHayot / @wsalembi : worthwhile to standardize? Wouldn't free text suffice bc mainly used for developers, not UI/end users? @JDMKSZ : can help to make development faster, not having to formulate own message

@wsalembi : simplified issue types proposal: (1) missing (2) unexpected (3) conflict

jpraet commented 9 months ago

@pvdbosch @wsalembi @VirginieHayot @JDMKSZ

Let me give this one more shot and try to summarize and clarify the standpoint of CBSS in this matter.

Cross-parameter input validations are omnipresent in REST APIs. For query parameters, and in some cases for body properties, these cross-parameter validations cannot be enforced through the OpenAPI schema. So it is up to the API developer to perform these validations and return a proper problem type.

Therefore, for the most common categories of cross-parameter input validations, a standardization is desirable for the issue type codes to use (https://github.com/belgif/rest-guide/issues/113), as well as for the way to represent multiple parameter references in the issue payload (https://github.com/belgif/rest-guide/issues/108). For us, these two tickets are intertwined. After all, if it is decided to standardize the issue type codes, then these will also need to be documented with examples.

Standardization in this area allows for a common reusable InputValidator to be implemented, so the API developer is not burdened with the tedious task of performing these validations and constructing the problem payload (which issue type, title, detail message to use? how to represent the parameters?).

As an example, let's say we have an API with "name", "alternateId.type" and "alternateId.value" query parameters:

new InputValidator()
        .mutuallyExclusive(queryParam("alternateId.value", alternateIdValue), queryParam("name", name))
        .inputGroup(queryParam("alternateId.type", alternateIdType), queryParam("alternateId.value", alternateIdValue))
        .ssin(queryParam("ssin", ssin))
        .validate();

Not let's say we invoke /api?name=test,alternateId.value=12345

The validation library would throw the following problem (returning multiple issues in one go):

{
  "type": "urn:problem-type:belgif:badRequest",
  "href": "https://www.belgif.be/specification/rest/api-guide/problems/badRequest.html",
  "title": "Bad Request",
  "status": 400,
  "detail": "The input message is incorrect",
  "issues": [
    {
      "type": "urn:problem-type:belgif:input-validation:mutuallyExclusive",
      "title": "Mutually exclusive inputs",
      "detail": "The inputs [alternateId.value, name] are mutually exclusive",
      "inputs": [
        {
          "in": "query",
          "name": "alternateId.value",
          "value": "12345"
        },
        {
          "in": "query",
          "name": "name",
          "value": "test"
        }
      ]
    },
    {
      "type": "urn:problem-type:belgif:input-validation:incompleteGroup",
      "title": "Incomplete input group",
      "detail": "The input group [alternateId.type, alternateId.value] is incomplete",
      "inputs": [
        {
          "in": "query",
          "name": "alternateId.type",
          "value": null
        },
        {
          "in": "query",
          "name": "alternateId.value",
          "value": "12345"
        }
      ]
    }
  ]
}

The validation library has access to all this information, so why not expose it in a structured manner to API consumers? Perhaps most API consumers will not interpret the problems in this much detail. But that is their choice and not ours to make?

Some of you are suggesting that returning only a "detail" message is sufficient. But then why do we even have "in"/"name"/"value" properties on InputValidationIssue at all? Why return "in"/"name"/"value" properties for a single-parameter input validation error, but not for a cross-parameter validation error?

Returning a problem that references multiple invalid parameters was possible in openapi-problem v1.0.0, but in v1.2.0 support for a problem with multiple input validation issues was added (also an important feature) at the expense of this pre-existing functionality.

To Smals, eHealth and other workgroup members: do you have any real-world examples of the problem payloads you currently return in your REST APIs when performing cross-parameter input validations? Both for format validations (that should perhaps only occur during development) as for business issues (where the API consumer is expected to handle them in production too). What do you put in the issue type? And do you fill in the "in"/"name"/"value"?

jpraet commented 6 months ago

CBSS uses the following issue types for cross-parameter validation based on the proposal in https://github.com/belgif/rest-guide/issues/113#issuecomment-1737436710:

urn:problem-type:cbss:input-validation:exactlyOneOfExpected urn:problem-type:cbss:input-validation:anyOfExpected urn:problem-type:cbss:input-validation:zeroOrExactlyOneOfExpected (instead of zeroOrOneOfExcepted, for consistency) urn:problem-type:cbss:input-validation:zeroOrAllOfExpected urn:problem-type:cbss:input-validation:equalExpected (instead of expectedEqual, for consistency)

And we return an extension of the Belgif InputValidationIssue with an inputs[] array for these issue types:

{
  "type": "urn:problem-type:belgif:badRequest",
  "href": "https://www.belgif.be/specification/rest/api-guide/problems/badRequest.html",
  "title": "Bad Request",
  "status": 400,
  "detail": "The input message is incorrect",
  "issues": [
    {
      "type": "urn:problem-type:cbss:input-validation:exactlyOneOfExpected",
      "title": "Exactly one of these inputs must be present",
      "detail": "Exactly one of these inputs must be present: sector, enterpriseNumber",
      "inputs": [
          {
            "in": "query",
            "name": "sector",
            "value": null
          },
          {
            "in": "query",
            "name": "enterpriseNumber",
            "value": null
          }
        ]
    }
  ]
}
jpraet commented 1 month ago

As these urn:problem-type:cbss:input-validation: cross-parameter issue types are now also present in the Belgif rest-problem library, it would be good if they could be standardized under urn:problem-type:belgif:input-validation: