python-openapi / openapi-core

Openapi-core is a Python library that adds client-side and server-side support for the OpenAPI v3.0 and OpenAPI v3.1 specification.
BSD 3-Clause "New" or "Revised" License
302 stars 132 forks source link

UnmarshallerError if specifying `format` with `type: [foo, 'null']` #483

Closed stephenfin closed 1 year ago

stephenfin commented 1 year ago

I'm still working on minimal reproducer to prove this out so apologies for the lack of a one in advance.

I've got a schema like this:

openapi: '3.1.0'
# ...
components:
  schemas:
    PatchList:
      type: 'object'
      properties:
        # ...
        pull_url:
          title: 'Pull URL'
          type:
            - 'string'
            - 'null'
          format: 'uri'
          maxLength: 255
        # ...

If my server returns a response with pull_url set to null, then the call to openapi_core.validation.response.validators.ResponseValidator.validate fails with the following traceback:

Traceback (most recent call last):
  File "/dev/patchwork/patchwork/tests/api/test_relation.py", line 267, in test_remove_one_patch_from_relation_good
    resp = self.client.patch(
  File "/dev/patchwork/patchwork/tests/api/utils.py", line 241, in patch
    validator.validate_data(
  File "/dev/patchwork/patchwork/tests/api/validator.py", line 141, in validate_data
    result.raise_for_errors()
  File "/dev/openapi-core/openapi_core/validation/datatypes.py", line 12, in raise_for_errors
    raise error
  File "/dev/openapi-core/openapi_core/validation/response/validators.py", line 245, in validate
    data = self._get_data(response, operation_response)
  File "/dev/openapi-core/openapi_core/validation/response/validators.py", line 88, in _get_data
    data = self._unmarshal(schema, casted)
  File "/dev/openapi-core/openapi_core/validation/validators.py", line 76, in _unmarshal
    return unmarshaller(value)
  File "/dev/openapi-core/openapi_core/unmarshalling/schemas/unmarshallers.py", line 91, in __call__
    return self.unmarshal(value)
  File "/dev/openapi-core/openapi_core/unmarshalling/schemas/unmarshallers.py", line 434, in unmarshal
    return unmarshaller.unmarshal(value)
  File "/dev/openapi-core/openapi_core/unmarshalling/schemas/unmarshallers.py", line 316, in unmarshal
    properties = self.format(value)
  File "/dev/openapi-core/openapi_core/unmarshalling/schemas/unmarshallers.py", line 325, in format
    return self._unmarshal_properties(formatted, schema_only=schema_only)
  File "/dev/openapi-core/openapi_core/unmarshalling/schemas/unmarshallers.py", line 354, in _unmarshal_properties
    all_of_properties = self._clone(all_of_schema).format(
  File "/dev/openapi-core/openapi_core/unmarshalling/schemas/unmarshallers.py", line 325, in format
    return self._unmarshal_properties(formatted, schema_only=schema_only)
  File "/dev/openapi-core/openapi_core/unmarshalling/schemas/unmarshallers.py", line 373, in _unmarshal_properties
    properties[prop_name] = self.unmarshallers_factory.create(
  File "/dev/openapi-core/openapi_core/unmarshalling/schemas/unmarshallers.py", line 91, in __call__
    return self.unmarshal(value)
  File "/dev/openapi-core/openapi_core/unmarshalling/schemas/unmarshallers.py", line 433, in unmarshal
    unmarshaller = self._get_best_unmarshaller(value)
  File "/dev/openapi-core/openapi_core/unmarshalling/schemas/unmarshallers.py", line 430, in _get_best_unmarshaller
    raise UnmarshallerError("Unmarshaller not found for type(s)")
openapi_core.unmarshalling.schemas.exceptions.UnmarshallerError: Unmarshaller not found for type(s)

If I comment out format, things work as expected. I suspect it's attempting to apply a formatter for the null values. As far as I can tell, this should work though.

stephenfin commented 1 year ago

This post and this SO answer suggests I could use oneOf to do this, but it seems we're not able to do so here:

openapi: '3.1.0'
# ...
components:
  schemas:
    PatchList:
      type: 'object'
      properties:
        # ...
        pull_url:
          title: 'Pull URL'
          oneOf:
            - type: 'string'
              format: 'uri'
              maxLength: 255
            - type: 'null'
        # ...

This yields the exact same exception.

stephenfin commented 1 year ago

Things do, however, work if I specify type twice:

openapi: '3.1.0'
# ...
components:
  schemas:
    PatchList:
      type: 'object'
      properties:
        # ...
        pull_url:
          title: 'Pull URL'
          type:
            - type: 'string'
            - type: 'null'
          oneOf:
            - type: 'string'
              format: 'uri'
              maxLength: 255
            - type: 'null'
        # ...

This shouldn't be necessary though, right?

p1c2u commented 1 year ago

Hi @stephenfin

thanks for reporting the issue.

The first example doesn't work because uri is string type format only so if you define multiple types it will fail with others (in this case null). This works as expected (but still not ideal).

The second example here is null is not considered as valid if type is not defined in main schema. Here we have the issue.

The third one is just a workaround for the second example issue which (as you noticed) shouldn't be necessary.

stephenfin commented 1 year ago

@p1c2u Thanks for confirming that the latter two examples are what we'd expect: the OpenAPI spec is seriously lacking when it comes to discussions on nullable attributes :disappointed: Is there anything I can do to help? I did try to address this but it seems there have been significant changes in the 0.17 beta and my test suite has yet to be adapted for this so I was kind of flying blind.

stephenfin commented 1 year ago

For what it's worth, the first example is rather common too, even if it's technically correct. For example, the OpenAPI 3.1 spec for GitHub's own API contains things like this:

---
openapi: 3.1.0
# ...
paths:
  # ...
  "/authorizations":
    # ...
    post:
      requestBody:
        required: false
        content:
          application/json:
            schema:
              type: object
              properties:
                scopes:
                  description: A list of scopes that this authorization is in.
                  type:
                  - array
                  - 'null'
                  items:
                    type: string
                  examples:
                  - public_repo
                  - user
                # ...

In this case, items only applies for type: array, not type: null, so this doesn't seem valid either. It's common though. Given the spec is silent on this matter, I wonder if this is something we want to choose to accept also?

stephenfin commented 1 year ago

Hmm, I actually think JSONSchema allows the first example (henceforth option A). From the JSON Schema Core spec:

7.6.1. Assertions and Instance Primitive Types

Most assertions only constrain values within a certain primitive type. When the type of the instance is not of the type targeted by the keyword, the instance is considered to conform to the assertion. For example, the "maxLength" keyword from the companion validation vocabulary [json-schema-validation]: will only restrict certain strings (that are too long) from being valid. If the instance is a number, boolean, null, array, or object, then it is valid against this assertion. This behavior allows keywords to be used more easily with instances that can be of multiple primitive types. The companion validation vocabulary also includes a "type" keyword which can independently restrict the instance to one or more primitive types. This allows for a concise expression of use cases such as a function that might return either a string of a certain length or a null value:

{
    "type": ["string", "null"],
    "maxLength": 255
}

If "maxLength" also restricted the instance type to be a string, then this would be substantially more cumbersome to express because the example as written would not actually allow null values. Each keyword is evaluated separately unless explicitly specified otherwise, so if "maxLength" restricted the instance to strings, then including "null" in "type" would not have any useful effect.

If this is true, I guess we should support it somehow?

EDIT: Also, from the JSON Schema Validation spec:

7.1. Foreword

Structural validation alone may be insufficient to allow an application to correctly utilize certain values. The "format" annotation keyword is defined to allow schema authors to convey semantic information for a fixed subset of values which are accurately described by authoritative resources, be they RFCs or other external specifications.

The value of this keyword is called a format attribute. It MUST be a string. A format attribute can generally only validate a given set of instance types. If the type of the instance to validate is not in this set, validation for this format attribute and instance SHOULD succeed. All format attributes defined in this section apply to strings, but a format attribute can be specified to apply to any instance types defined in the data model defined in the core JSON Schema. [json-schema] Note that the "type" keyword in this specification defines an "integer" type which is not part of the data model. Therefore a format attribute can be limited to numbers, but not specifically to integers. However, a numeric format can be used alongside the "type" keyword with a value of "integer", or could be explicitly defined to always pass if the number is not an integer, which produces essentially the same behavior as only applying to integers.

(emphasis mine)

p1c2u commented 1 year ago

@stephenfin 0.17 should be mostly compatible except validation errors. Can you tell which changes do you have in mind?

First option is definitely correct the problem is current implementation of format validators and unmarshallers which don't allow to handle multiple types with format correctly. I started the refactor of this part.

stephenfin commented 1 year ago

@stephenfin 0.17 should be mostly compatible except validation errors. Can you tell which changes do you have in mind?

I'm using the Django validators and seeing the following exceptions when installing the openapi-core-0.17.0a1 pre-release (pip install --pre --upgrade openapi-core):

Traceback (most recent call last):
  File "/usr/lib64/python3.8/unittest/case.py", line 60, in testPartExecutor
    yield
  File "/usr/lib64/python3.8/unittest/case.py", line 676, in run
    self._callTestMethod(testMethod)
  File "/usr/lib64/python3.8/unittest/case.py", line 633, in _callTestMethod
    method()
  File "/dev/patchwork/patchwork/tests/api/utils.py", line 104, in wrapper
    func(self, *args, **kwargs)
  File "/dev/patchwork/patchwork/tests/api/test_comment.py", line 189, in test_update_authorized
    resp = self._test_update(person=person)
  File "/dev/patchwork/patchwork/tests/api/test_comment.py", line 171, in _test_update
    return self.client.patch(
  File "/dev/patchwork/patchwork/tests/api/utils.py", line 84, in client_wrapper
    resp = orig_func(path, data, *orig_args, **orig_kwargs)
  File "/dev/patchwork/patchwork/tests/api/utils.py", line 241, in patch
    validator.validate_data(
  File "/dev/patchwork/patchwork/tests/api/validator.py", line 128, in validate_data
    validator = RequestValidator(schema_unmarshallers_factory)
  File "/dev/patchwork/.tox/py38-django41/lib/python3.8/site-packages/openapi_core/validation/request/validators.py", line 86, in __init__
    super().__init__(
  File "/dev/patchwork/.tox/py38-django41/lib/python3.8/site-packages/openapi_core/validation/validators.py", line 66, in __init__
    raise NotImplementedError(
NotImplementedError: schema_unmarshallers_factory is not assigned

I suspect I just need to update how I'm creating my validators. No big deal.

First option is definitely correct the problem is current implementation of format validators and unmarshallers which don't allow to handle multiple types with format correctly. I started the refactor of this part.

Excellent. Happy to test this when it's available! 🙋

p1c2u commented 1 year ago

Oh I see you use low level integration. You must pass spec at first argument to RequestValidator/ResponseValidator but probably I will make it backward compatible in the future pre-rereleases.

stephenfin commented 1 year ago

Spot on. Thank you as always. This is an excellent collection of tools 😊