ahx / openapi_first

openapi_first is a Ruby gem for request / response validation and contract-testing against an OpenAPI API description. It makes APIFirst easy and reliable.
MIT License

Self-referencing schemas result in `SystemStackError` #291

Open moberegger opened 2 months ago

moberegger commented 2 months ago

When providing openapi_first a spec containing a schema like

components:
  schemas:
    MySelfRef:
      type: object
      properties:
        foo:
          type: string
        bar:
          $ref: '#/components/schemas/MySelfRef'

it hits a stack level too deep (SystemStackError) error.

I have a general idea of what is happening. openapi_first is successfully loading the schema, but in doing so it creates self referencing hashes as it de-$references everything. So it basically ends up doing something like this:

my_schema = {
  type: 'object',
  required: ['foo'],
  properties: {
    foo: {
      type: 'string',
    },
    bar: {
      '$ref': '#/components/schemas/BoomRef',
    },
  },
}
my_schema[:properties][:bar][:$ref] = my_schema

This schema gets passed off to json_schemer to build a new JSONSchemer::Schema which works until something is executed on it. For example:

boom = JSONSchemer.schema(my_schema) # No error, creates a JSONSchemer::Schema
# Later on something runs...
boom.valid_schema? # Results in SystemStackError

This happens because openapi_first de-references everything to keep the entire schema local. It doesn't know that '#/components/schemas/BoomRef' is self-referencing, so it de-references it, picks the schema itself up from the file_cache, and creates a self-referencing hash. This causes an error when trying to do any recursive work on it; in the example above, json_schemer attempts a deep_stringify_keys that recurses until the stack error hits.
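The failure mode can be reproduced without either gem. The following is a minimal sketch: `naive_deep_stringify_keys` and `stringify_outcome` are hypothetical helpers written for illustration, not json_schemer's actual implementation. It only shows that a recursive walk with no cycle detection never terminates on a hash that contains itself.

```ruby
# Hypothetical stand-in for a recursive key-stringifying helper.
# It has no cycle detection, so it recurses forever on a cyclic hash.
def naive_deep_stringify_keys(value)
  case value
  when Hash
    value.each_with_object({}) do |(key, child), result|
      result[key.to_s] = naive_deep_stringify_keys(child)
    end
  when Array
    value.map { |item| naive_deep_stringify_keys(item) }
  else
    value
  end
end

# Runs the walk and reports whether it finished or blew the stack.
def stringify_outcome(schema)
  naive_deep_stringify_keys(schema)
  :finished
rescue SystemStackError
  :stack_overflow
end

cyclic = { type: 'object', properties: { bar: {} } }
cyclic[:properties][:bar][:$ref] = cyclic # the hash now contains itself

puts stringify_outcome(cyclic)
puts stringify_outcome({ type: 'string' })
```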

There are a few ways to write self-referencing schemas like this in OpenAPI. One way is

my_schema = {
  type: 'object',
  required: ['foo'],
  properties: {
    foo: {
      type: 'string',
    },
    bar: {
      '$ref': '#',
    },
  },
}

but during dereferencing openapi_first sees that '#' and tries to load it from a file, which results in an error.

Another way is

my_schema = {
  type: 'object',
  required: ['foo'],
  '$dynamicAnchor': 'node',
  properties: {
    foo: {
      type: 'string',
    },
    bar: {
      '$dynamicRef': 'node',
    },
  },
}

which seems to work, but requires changes to the schema.

It would be nice if openapi_first could handle that first schema. I think what we could do is compare object_ids during the de-referencing process, and if we encounter a schema whose object_id matches the object_id of the thing it is attempting to make a $ref to, we simply swap that value out with a '#'.
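A rough sketch of that identity check, assuming the de-referenced schema is a plain Ruby hash. `replace_self_references` is a hypothetical helper, not part of openapi_first, and it only catches references back to the root object; a cycle between two nested schemas would need a set of visited ancestors instead.

```ruby
# Hypothetical helper: returns a copy of `node` in which every occurrence of
# the root object itself is replaced with the string '#'.
def replace_self_references(node, root = node, top: true)
  # equal? compares object identity (the same thing as comparing object_id),
  # not structural equality, so it is safe on cyclic hashes.
  return '#' if !top && node.equal?(root)

  case node
  when Hash
    node.transform_values { |child| replace_self_references(child, root, top: false) }
  when Array
    node.map { |item| replace_self_references(item, root, top: false) }
  else
    node
  end
end

my_schema = {
  type: 'object',
  properties: {
    foo: { type: 'string' },
    bar: {},
  },
}
my_schema[:properties][:bar][:$ref] = my_schema # self-referencing hash

safe_schema = replace_self_references(my_schema)
p safe_schema[:properties][:bar]
```

The result no longer contains itself, so recursive helpers can walk it.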

Additionally, it would be nice if openapi_first could handle the second schema example. I think what we could do here is modify Dereferencer slightly so that when it encounters a bare '#' it skips trying to resolve it and instead just uses that as the $ref value.
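As an illustration only, the skip could look something like the sketch below. This is not the actual Dereferencer code: `dereference_keeping_root_refs` and the `loader` lambda are hypothetical names, and the loader stands in for whatever openapi_first uses to load a referenced file.

```ruby
# Hypothetical walker: resolves every '$ref' through `loader`, except a bare
# '#', which is left untouched for the JSON Schema validator to resolve.
def dereference_keeping_root_refs(node, loader)
  case node
  when Hash
    ref = node['$ref']
    if ref == '#'
      node # keep the root reference as it is
    elsif ref.is_a?(String)
      dereference_keeping_root_refs(loader.call(ref), loader)
    else
      node.transform_values { |child| dereference_keeping_root_refs(child, loader) }
    end
  when Array
    node.map { |item| dereference_keeping_root_refs(item, loader) }
  else
    node
  end
end

# Stand-in for a file cache; 'foo.yaml' is an invented file name.
documents = { 'foo.yaml' => { 'type' => 'string' } }
loader = ->(ref) { documents.fetch(ref) }

schema = {
  'type' => 'object',
  'properties' => {
    'foo' => { '$ref' => 'foo.yaml' },
    'bar' => { '$ref' => '#' },
  },
}

p dereference_keeping_root_refs(schema, loader)
```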

Do you have any thoughts on this? I think I have a pretty good idea of how it works, so I could try submitting a PR to accommodate those if you find either solution acceptable.

ahx commented 2 months ago

Hi. Thanks for the report. In general, and also related to https://github.com/ahx/openapi_first/issues/285, I would like to change from "dereference everything" to "dereference everything outside JSON Schema schemas" and let json_schemer handle $refs inside JSON Schemas (local and across files) correctly via its .openapi interface. Do you think that makes sense and could work? I have not fully wrapped my head around that, so I think your option sounds good as well. A PR would be very much appreciated. If you don't get a fix to work, just a failing test would also be much appreciated.