ahx / openapi_first

openapi_first is a Ruby gem for request / response validation and contract-testing against an OpenAPI API description. It makes APIFirst easy and reliable.
MIT License
122 stars 15 forks source link

Cannot validate responses that use `oneOf` unions with `discriminator`. #285

Open moberegger opened 4 months ago

moberegger commented 4 months ago

Ran into an issue with oneOf unions, but only when using a discriminator property. I am not sure if this is an openapi_first issue or a json_schemer issue, so I'd though I'd try here first.

Given a simple YAML configuration like

---
openapi: 3.1.0
info:
  title: API V1
  version: v1
paths:
  "/pets":
    get:
      summary: Pets
      responses:
        '200':
          description: successful
          content:
            application/json:
              schema:
                items:
                  oneOf:
                    - '$ref': '#/components/schemas/Cat'
                    - '$ref': '#/components/schemas/Dog'
                  discriminator: 
                    propertyName: petType
                type: array
components:
  schemas:
    Cat:
      type: object
      properties:
        id: 
          type: integer
        petType: 
          type: string
          enum:
            - cat
        meow:
          type: string
    Dog:
      type: object
      properties:
        id: 
          type: integer
        petType: 
          type: string
          enum:
            - dog
        bark:
          type: string

and using the ResponseValidation middleware

config.middleware.use OpenapiFirst::Middlewares::ResponseValidation, spec: 'swagger/v1/swagger.yaml', raise_error: true

and responding with some simple hardcoded data

class PetsController < ApplicationController
  def index
    # Hardcode a response
    render json: [
      { id: 1, petType: 'cat', meow: 'Prrr' },
      { id: 2, petType: 'dog', bark: 'Woof' }
    ], status: :ok
  end
end

The following error occurs

{
    "status": 500,
    "error": "Internal Server Error",
    "exception": "#<KeyError: key not found: \"$ref\">",
    "traces": {
        "Application Trace": [],
        "Framework Trace": [
            {
                "exception_object_id": 9400,
                "id": 0,
                "trace": "json_schemer (2.3.0) lib/json_schemer/openapi31/vocab/base.rb:59:in `fetch'"
            },
            {
                "exception_object_id": 9400,
                "id": 1,
                "trace": "json_schemer (2.3.0) lib/json_schemer/openapi31/vocab/base.rb:59:in `block in subschemas_by_property_value'"
            },
            {
                "exception_object_id": 9400,
                "id": 2,
                "trace": "json_schemer (2.3.0) lib/json_schemer/openapi31/vocab/base.rb:58:in `each'"
            },
            {
                "exception_object_id": 9400,
                "id": 3,
                "trace": "json_schemer (2.3.0) lib/json_schemer/openapi31/vocab/base.rb:58:in `subschemas_by_property_value'"
            },
            {
                "exception_object_id": 9400,
                "id": 4,
                "trace": "json_schemer (2.3.0) lib/json_schemer/openapi31/vocab/base.rb:110:in `validate'"
            },
            {
                "exception_object_id": 9400,
                "id": 5,
                "trace": "json_schemer (2.3.0) lib/json_schemer/schema.rb:141:in `block in validate_instance'"
            },
            {
                "exception_object_id": 9400,
                "id": 6,
                "trace": "json_schemer (2.3.0) lib/json_schemer/schema.rb:140:in `each'"
            },
            {
                "exception_object_id": 9400,
                "id": 7,
                "trace": "json_schemer (2.3.0) lib/json_schemer/schema.rb:140:in `validate_instance'"
            },
            {
                "exception_object_id": 9400,
                "id": 8,
                "trace": "json_schemer (2.3.0) lib/json_schemer/draft202012/vocab/applicator.rb:184:in `block in validate'"
            },
            {
                "exception_object_id": 9400,
                "id": 9,
                "trace": "json_schemer (2.3.0) lib/json_schemer/draft202012/vocab/applicator.rb:183:in `map'"
            },
            {
                "exception_object_id": 9400,
                "id": 10,
                "trace": "json_schemer (2.3.0) lib/json_schemer/draft202012/vocab/applicator.rb:183:in `with_index'"
            },
            {
                "exception_object_id": 9400,
                "id": 11,
                "trace": "json_schemer (2.3.0) lib/json_schemer/draft202012/vocab/applicator.rb:183:in `validate'"
            },
            {
                "exception_object_id": 9400,
                "id": 12,
                "trace": "json_schemer (2.3.0) lib/json_schemer/schema.rb:141:in `block in validate_instance'"
            },
            {
                "exception_object_id": 9400,
                "id": 13,
                "trace": "json_schemer (2.3.0) lib/json_schemer/schema.rb:140:in `each'"
            },
            {
                "exception_object_id": 9400,
                "id": 14,
                "trace": "json_schemer (2.3.0) lib/json_schemer/schema.rb:140:in `validate_instance'"
            },
            {
                "exception_object_id": 9400,
                "id": 15,
                "trace": "json_schemer (2.3.0) lib/json_schemer/schema.rb:106:in `validate'"
            },
            {
                "exception_object_id": 9400,
                "id": 16,
                "trace": "openapi_first (2.1.0) lib/openapi_first/schema.rb:29:in `validate'"
            },
            {
                "exception_object_id": 9400,
                "id": 17,
                "trace": "openapi_first (2.1.0) lib/openapi_first/validators/response_body.rb:25:in `call'"
            },
            {
                "exception_object_id": 9400,
                "id": 18,
                "trace": "openapi_first (2.1.0) lib/openapi_first/response_validator.rb:22:in `block (2 levels) in call'"
            },
            {
                "exception_object_id": 9400,
                "id": 19,
                "trace": "openapi_first (2.1.0) lib/openapi_first/response_validator.rb:22:in `each'"
            },
            {
                "exception_object_id": 9400,
                "id": 20,
                "trace": "openapi_first (2.1.0) lib/openapi_first/response_validator.rb:22:in `block in call'"
            },
            {
                "exception_object_id": 9400,
                "id": 21,
                "trace": "openapi_first (2.1.0) lib/openapi_first/response_validator.rb:21:in `catch'"
            },
            {
                "exception_object_id": 9400,
                "id": 22,
                "trace": "openapi_first (2.1.0) lib/openapi_first/response_validator.rb:21:in `call'"
            },
            {
                "exception_object_id": 9400,
                "id": 23,
                "trace": "openapi_first (2.1.0) lib/openapi_first/response.rb:29:in `validate'"
            },
            {
                "exception_object_id": 9400,
                "id": 24,
                "trace": "openapi_first (2.1.0) lib/openapi_first/definition.rb:79:in `validate_response'"
            },
            {
                "exception_object_id": 9400,
                "id": 25,
                "trace": "openapi_first (2.1.0) lib/openapi_first/middlewares/response_validation.rb:28:in `call'"
            },
            {
                "exception_object_id": 9400,
                "id": 26,
                "trace": "openapi_first (2.1.0) lib/openapi_first/middlewares/request_validation.rb:36:in `call'"
            },

            # Snipped for brevity
        ]
    }
}

If I remove the discriminator property and just have

    Pet:
      oneOf:
        - '$ref': '#/components/schemas/Cat'
        - '$ref': '#/components/schemas/Dog'

Everything works.

[
    {
        "id": 1,
        "petType": "cat",
        "meow": "Prrr"
    },
    {
        "id": 2,
        "petType": "dog",
        "bark": "Woof"
    }
]

I dug into the openapi_first and json_schemer source a bit and what I think is happening is that json_schemer expects those $refs to be there, but openapi_first dereferences them out when creating the schemas. I can see a response schema being created with the spec doc loaded as

{"items"=>
  {"oneOf"=>
    [{"type"=>"object", "properties"=>{"id"=>{"type"=>"integer"}, "petType"=>{"type"=>"string", "enum"=>["cat"]}, "meow"=>{"type"=>"string"}}},
     {"type"=>"object", "properties"=>{"id"=>{"type"=>"integer"}, "petType"=>{"type"=>"string", "enum"=>["dog"]}, "bark"=>{"type"=>"string"}}}],
   "discriminator"=>{"propertyName"=>"petType"}},
 "type"=>"array"}

The presence of the discriminator seems to trigger json_schemer to look for the $refs. The tests seem to indicate that json_schemer supports oneOf unions, but they suggest that it needs to be aware of the entire schema (or at least the refs), where as openapi_first takes a modular approach and creates separate request and response schemas for each path.

Again, don't know if this is an openapi_first problem or a json_schemer problem. It does seem like json_schemer expects the entire specification to be given to it, which is not how openapi_first is currently using it. That being said, this is the only issue I have run into thus far in a project that is making heavy use of $refs, so perhaps it's a json_schemer bug with discriminators because that shouldn't assume that $refs are being used.

Perhaps instead of dereferencing, a ref map is created that can then be provided to json_schemer

schema = {
  '$id' => 'http://example.com/schema',
  'allOf' => [
    { '$ref' => 'schema/one' },
    { '$ref' => 'schema/two' }
  ]
}
refs = {
  URI('http://example.com/schema/one') => {
    'type' => 'integer'
  },
  URI('http://example.com/schema/two') => {
    'minimum' => 11
  }
}
schemer = JSONSchemer.schema(schema, :ref_resolver => refs.to_proc)
moberegger commented 4 months ago

For posterity, I have also created an issue in json_schemer.

ahx commented 4 months ago

Thanks for the report, @moberegger.

openapi_first dereferences them out when creating the schemas

That is correct. I will have to take a deeper look if thatโ€™s an issue here. Iโ€™ll have a closer look and will get back to you.

UPDATE: This is the issue exaclty. By resolving all $refs during load openapi_first basically breaks discriminator. The OAS says : "When using the discriminator, inline schemas will not be considered."

Marking this as a bug.

ahx commented 4 months ago

Hey @moberegger I don't have a solution for this, but maybe a workaround: Maybe you can make use of one of the JSON schema conditionals instead of using discriminator?

There is a discussion about replacing / removing discriminator in OpenAPI 4.0

moberegger commented 3 months ago

Maybe you can make use of one of the JSON schema conditionals instead of using discriminator?

Haven't looked into this yet, so unsure if this would work; I am not too familiar with json-schema so still need to wrap my head around it. Unfortunately, for my current situation it's not an option for legacy tech debt reasons that are not at all your problem ๐Ÿ˜†. We currently have another soon-to-be-deprecated pipeline generating the schemas and getting that to be "json schema aware" isn't tenable. Our schema is also 3.0, which from my reading doesn't have full compatibility with json-schema.

I'm currently migrating our project to something green field (which your library has been an immense help with, by the way), and once that is done we'll have full control over the schema. Plan is to update the spec to 3.1 once it's all said and done, and then at that point I'll kick tires on your json-schema suggestion. Again, none of this is your problem! ๐Ÿ˜†

For the time being, I made a patch to openapi_first that removes the discriminator from the schema before creating the validator.

module Schema
      def initialize(schema, **kwargs)
        super(clean_unions!(schema), **kwargs)
      end

      private

      DISCRIMINATOR = 'discriminator'
      UNIONS = [ALL_OF = 'allOf', ANY_OF = 'anyOf', ONE_OF = 'oneOf'].freeze

      def clean_unions!(schema)
        schema.delete(DISCRIMINATOR) if UNIONS.any? { |key| schema.key?(key) }

        schema.each do |_, value|
          if value.is_a?(Hash)
            clean_unions!(value)
          elsif value.is_a?(Array)
            value.each { |element| clean_unions!(element) if element.is_a?(Hash) }
          end
        end
      end
    end

It's a bit of an aggressive approach, but has solved my problem for the time being.

I'd be more than happy to make a PR to openapi_first if a strategy similar to the above is an acceptable solution that you'd be happy with, but I don't know if this would cause problems for you elsewhere. Given your comments in https://github.com/davishmcclurg/json_schemer/issues/192#issuecomment-2263774926, maybe this helps keep openapi_first compatible?

ahx commented 3 months ago

I am happy to hear you found a solution to work around this issue ๐Ÿ”จ ๐Ÿ‘๐Ÿผ. Also appreciate the positive feedback.

I have not thought about it a lot, but I see two approaches solving this issue:

  1. Stop resolving all $refs on load and enable JSON Schema validation (via json_schemer) to resolve $refs by giving it a context of the the whole document not just a single schema instance
  2. Fix this discriminator case specifically, either during ref-resolving or by adapting the validation logic somehow. So something a long the lines you did, but the goal would be to keep validation intact somehow.

I feel like option 1 would be better, but I also don't know how to do that, yet.

moberegger commented 3 months ago

So something a long the lines you did, but the goal would be to keep validation intact somehow.

I just realized I didn't explain what the patch did ๐Ÿ˜†. The patch I provided just "tricks" json_schemer into not looking for the $refs and instead just has it validate across all the inline union types. There is definitely a tradeoff doing it this way, though; since it can't use the discriminator as a hint, I believe that the validator has to do more work since it has to validate against all the types rather than just the one the discriminator points to. I think this also results in more errors when validation fails, because you'll get multiple errors against each type instead of errors targeted to the type the discriminator points to. So it's still validating... but doing so in a much less efficient and intuitive way.

I do like your first option. In fact, that's how I thought openapi_first was working before reading through the source before creating this issue! Wish I could be of more help here, but I'm still rather new to the Openapi specification so still trying to get up to speed on that.

Sincerely appreciate your responsiveness on this. If I learn anything new or come up with another idea, I'll be sure to let you know. Least I can do.

ahx commented 1 month ago

This should be fixed on main via https://github.com/ahx/openapi_first/pull/294. I am keeping this open until the fix has been released.

Can you give this a try @moberegger ? Thanks again for the report!

moberegger commented 1 month ago

Tried it out and still see the unions being dereferenced. There are a couple differences between my setup and your integration test, though. One difference is that my union is nested inside other schemas and is not the top level schema response for the endpoint. The other difference is that we're using file references (ex: $ref: ./Dog.yml) which may be impacting how it gets loaded (I noticed that this process has also changed on main, so still familiarizing myself with it).

I'll try to dig in and see what is causing the difference in behaviour. It might be something I'm doing on my side that is causing the problem. If I'm able to figure out the root cause, I'll report back with a minimal reproduction/example.

ahx commented 1 month ago

The other difference is that we're using file references (ex: $ref: ./Dog.yml) which may be impacting how it gets loaded

Yes. And this should not work without using mapping, because JSON Schema cannot deduce "dog" from "schemas/dog.yaml". I just added another integration test to reproduce an issue with that and fixed it. It now uses the ref_resolver option of json_schemer to resolve the $ref (PR). My plan is to use json_schemer's ref_resolver option in more places and remove some code related to that.

If I'm able to figure out the root cause, I'll report back with a minimal reproduction/example.

Just a quick feedback on main or a reproducible example would be super great. I would like to add integration tests for these cases, solve them and then make the code easier to follow. โ€ฆand have main change less frequently.