voxpupuli / json-schema

Ruby JSON Schema Validator
MIT License

Unpredictable Failure to Locate `definitions` Section at Runtime #429

Open AdrianTP opened 5 years ago

AdrianTP commented 5 years ago

We are intermittently receiving the following error:

JSON::Schema::SchemaError: The fragment '/definitions/<definition_name>' does not exist on schema file:<path/to/schema/file>.json#

The schema file has the definitions section in question, and it is being referenced appropriately ({ "$ref": "#/definitions/<definition_name>" }). The schema file contains a few external references, but I don't see how that would make a difference. I am aware of #229.
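A quick way to confirm that a local reference such as "#/definitions/<definition_name>" really resolves in the parsed schema, independently of the validator, is to walk the pointer by hand. The following is only a minimal sketch: `resolve_local_ref` is a hypothetical helper, not part of json-schema, and it ignores URI percent-encoding and external references.

```ruby
require "json"

# Hypothetical helper (not part of json-schema): follows a local "#/..."
# reference through an already-parsed schema and returns the node it points
# to, or nil if any step along the way is missing.
def resolve_local_ref(schema, ref)
  return nil unless ref.start_with?("#")

  pointer = ref.delete_prefix("#")
  return schema if pointer.empty?

  pointer.split("/").drop(1).reduce(schema) do |node, token|
    # Undo JSON Pointer escaping (RFC 6901): "~1" is "/", "~0" is "~".
    key = token.gsub("~1", "/").gsub("~0", "~")
    case node
    when Hash then node[key]
    when Array then key.match?(/\A\d+\z/) ? node[key.to_i] : nil
    else return nil
    end
  end
end

# Tiny stand-in schema, loosely modelled on the pared-down example in this issue.
schema = JSON.parse(<<~JSON)
  {
    "definitions": { "range_block": { "type": "object" } },
    "properties": { "range1": { "$ref": "#/definitions/range_block" } }
  }
JSON

p resolve_local_ref(schema, "#/definitions/range_block")
p resolve_local_ref(schema, "#/definitions/missing")
```

Logging the result of such a check immediately before validation would show whether the hash handed to the validator still contains the definition at the moment the error is raised.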

I have spent some time debugging and have not been able to find the cause, nor can I reproduce the issue on my local machine; it only happens in production. When I debugged it, I saw that the definition was present in memory at the time the schema was loaded. My best guess is some kind of memory issue or a race condition in the JSON Schema library itself that only appears at scale. We currently see this a few times an hour, with no discernible pattern in timing, no apparent clustering, and nothing else that would give me a clear picture of a possible cause.
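One way to test the race-condition guess above would be to serialize all validation calls within a process and see whether the errors stop. The sketch below assumes the failures come from threads sharing state inside the library, which nothing in this thread confirms. `VALIDATION_LOCK` and `validate_serialized` are hypothetical names, and a stub stands in for the real validator so the example runs without the gem.

```ruby
# Hypothetical process-wide lock; json-schema does not provide this.
VALIDATION_LOCK = Mutex.new

# Calls the given validator while holding the lock, so that no two threads in
# this process are validating at the same time.
def validate_serialized(schema, data, validator)
  VALIDATION_LOCK.synchronize { validator.call(schema, data) }
end

# Stand-in validator that reports missing required keys. In the application
# this would be JSON::Validator.method(:fully_validate).
stub_validator = ->(schema, data) { schema.fetch("required", []) - data.keys }

schema = { "required" => ["identifier", "ranges"] }
data = { "identifier" => "abc" }

threads = 8.times.map do
  Thread.new { validate_serialized(schema, data, stub_validator) }
end
results = threads.map(&:value)
```

A lock like this only covers threads inside one process and costs throughput, so it is better suited to narrowing down the cause than to serving as a permanent fix.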

Notes:

I am not comfortable posting the actual schema file here because it contains some proprietary information, but here's a pared-down example:

lib/schemas/client1_schema.json (modified to mask proprietary information):

{
  "id": "file:lib/schemas/client1_schema.json#",
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "Client 1 API - Validation",
  "description": "A request object for validation",
  "definitions": {
    "range_block": {
      "type": "object",
      "properties": {
        "average": {
          "type": "number",
          "minimum": 0
        },
        "top_range": {
          "type": "number",
          "minimum": 0
        },
        "bottom_range": {
          "type": "number",
          "minimum": 0
        }
      }
    }
  },
  "type": "object",
  "required": [
    "identifier",
    "ranges"
  ],
  "properties": {
    "identifier": {
      "$ref": "common/identifier.json"
    },
    "ranges": {
      "type": "object",
      "properties": {
        "range1": {
          "$ref": "#/definitions/range_block"
        },
        "range2": {
          "$ref": "#/definitions/range_block"
        }
      }
    }
  }
}

lib/schemas/common/identifier.json (modified to mask proprietary information):

{
  "id": "file:lib/schemas/common/identifier.json#",
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "Identifier",
  "description": "An identifier",
  "type": "string",
  "minLength": 1,
  "maxLength": 64
}

We are calling it like so (modified to mask proprietary information):

class Request
  # left out initialize and other irrelevant information

  def invalid_fields # the entry point
    return @invalid_fields if @invalid_fields
    return [] unless request.post?
    return [] unless json_schema

    @raw_request_json = URI.decode_www_form(request.query_string).to_h.with_indifferent_access
    @invalid_fields = JSON::Validator.fully_validate(json_schema, raw_request_json)
  end

  def json_schema
    return @json_schema if @json_schema
    return {} unless File.exist?(schema_filename)

    raw_schema_string = File.read(schema_filename)
    @json_schema = JSON.parse(raw_schema_string)

    @json_schema
  end

  def schema_filename
    @schema_filename ||= "lib/schemas/client1_schema.json"
  end
wasaylor commented 4 years ago

Having the same issue.