raml-org / raml-js-parser-2

(deprecated)

Resolution of jsonschema $ref's #481

Closed coltonlw closed 7 years ago

coltonlw commented 7 years ago

Does raml-js-parser-2 provide an interface for resolving jsonschema $ref's? Specifically I would like to resolve references against the local filesystem, relative to the path of the referring schema. I see this is implemented in the api-workbench and I see code in raml-js-parser-2 to deal with references, but I'm not sure which function I could call to resolve them.

coltonlw commented 7 years ago

I see the Project class of jsyaml2lowlevel has some methods that look like they do what I want - keeping track of resolution scope as it relates to the location of the raml document and referring documents. For example, the resolve() method of the Project class: https://github.com/raml-org/raml-js-parser-2/blob/master/src/raml1/jsyaml/jsyaml2lowLevel.ts#L761

I see also the unit method of Project class calls startDownloadingReferencesAsync from https://github.com/raml-org/raml-js-parser-2/blob/master/src/util/schemaAsync.ts

If I have a BodyLike object with a schema, https://raml-org.github.io/raml-js-parser-2/interfaces/_src_raml1_artifacts_raml08parserapi_.bodylike.html, how can I resolve the $ref's in this schema?

coltonlw commented 7 years ago

I am looking to implement this behavior in https://github.com/cybertk/abao , so that it parses references the same as the api-workbench, specifically $ref's that are relative paths to the local filesystem. Abao is an important part of the raml ecosystem and having it parse $ref's the same as the api-workbench would be a big step forward, especially since the PR to abao for this will also upgrade abao to using raml-1-parser, paving the way for RAML 1.0 support in the tool. Any help here very much appreciated

cascer1 commented 7 years ago

I would also love for this feature to be implemented. It's very complicated to parse JSON schemas based on the parser output, since I don't know their original location and therefore cannot resolve relative $refs.

Is there anything I can do to help development of this feature?

sichvoge commented 7 years ago

My question would be around how the parser should handle references by default. Should it resolve them immediately, or give you some options to do that? What should that look like from your perspective, and how would you envision using it?

sichvoge commented 7 years ago

Obviously, it isn't the job of a parser to do this kind of thing. So I'd really like to understand the use case for resolving the $refs inside the parser versus retrieving the schema and using any dereference module available for JSON schemas.

cascer1 commented 7 years ago

Currently, when I ask for types or schemas I get a name, a displayName and a "type" containing the actual JSON schema:

{
  "ErrorResponse": {
    "name": "ErrorResponse",
    "displayName": "ErrorResponse",
    "type": [
      "{\n  \"$schema\" : \"http://json-schema.org/draft-04/schema#\",\n  \"title\" : \"class ErrorResponse\",\n  \"type\" : \"object\",\n  \"properties\" : {\n    \"errorId\" : {\n      \"type\" : \"string\"\n    },\n    \"errors\" : {\n      \"type\" : \"array\",\n      \"items\" : {\n        \"$ref\" : \"definitions.json#/definitions/APIError\"\n      },\n      \"minItems\" : 1,\n      \"uniqueItems\" : false\n    }\n  },\n  \"required\" : [ \"errorId\", \"errors\" ]\n}"
    ],
    "required": true,
    "__METADATA__": {
      "primitiveValuesMeta": {
        "displayName": {
          "calculated": true
        },
        "required": {
          "insertedAsDefault": true
        }
      }
    }
  }
}

As you can see, the actual schema contains a $ref entry, but I have no idea where the 'definitions.json' file is located, since I don't know from where this was included. Because of this I am unable to dereference it myself.
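To make the problem concrete, here is a minimal sketch of pulling the $ref values out of the schema string in "type". The collectRefs helper is hypothetical (it is not part of raml-js-parser-2), and the sample object is a shortened copy of the output above. It shows that the references can be listed, but nothing in the output says which directory they are relative to.

```typescript
// Hypothetical helper (not part of raml-js-parser-2): walk a parsed JSON
// schema and collect every "$ref" string it contains.
function collectRefs(node: unknown, found: string[] = []): string[] {
  if (Array.isArray(node)) {
    node.forEach((item) => collectRefs(item, found));
  } else if (node !== null && typeof node === "object") {
    const obj = node as Record<string, unknown>;
    Object.keys(obj).forEach((key) => {
      const value = obj[key];
      if (key === "$ref" && typeof value === "string") {
        found.push(value);
      } else {
        collectRefs(value, found);
      }
    });
  }
  return found;
}

// Shape mirrors the parser output above; the schema text is shortened.
const parserOutput = {
  ErrorResponse: {
    name: "ErrorResponse",
    type: [
      JSON.stringify({
        type: "object",
        properties: {
          errors: {
            type: "array",
            items: { $ref: "definitions.json#/definitions/APIError" },
          },
        },
      }),
    ],
  },
};

// The schema arrives as a string, so it has to be parsed before walking it.
const refs = collectRefs(JSON.parse(parserOutput.ErrorResponse.type[0]));
console.log(refs);
```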

A more complex example:

in my spec.raml file, I have the following (snipped):

types:
  CreatePaymentRequest: !include payment/CreatePaymentRequest.json

payment/CreatePaymentRequest.json

The output from the parser is as follows:

{
  "CreatePaymentRequest": {
    "name": "CreatePaymentRequest",
    "displayName": "CreatePaymentRequest",
    "type": [
      "{\n  \"$schema\" : \"http://json-schema.org/draft-04/schema#\",\n  \"title\" : \"class CreatePaymentRequest\",\n  \"allOf\" : [ {\n    \"type\" : \"object\",\n    \"properties\" : {\n      \"encryptedCustomerInput\" : {\n        \"type\" : \"string\"\n      },\n      \"fraudFields\" : {\n        \"$ref\" : \"../definitions.json#/definitions/FraudFields\"\n      },\n      \"order\" : {\n        \"$ref\" : \"definitions.json#/definitions/Order\"\n      }\n    },\n    \"required\" : [ \"order\" ]\n  }, {\n    \"oneOf\" : [ {\n      \"properties\" : {\n        \"cashPaymentMethodSpecificInput\" : {\n          \"$ref\" : \"definitions.json#/definitions/CashPaymentMethodSpecificInput\"\n        }\n      },\n      \"required\" : [ \"cashPaymentMethodSpecificInput\" ]\n    }, {\n      \"properties\" : {\n        \"directDebitPaymentMethodSpecificInput\" : {\n          \"$ref\" : \"definitions.json#/definitions/NonSepaDirectDebitPaymentMethodSpecificInput\"\n        }\n      },\n      \"required\" : [ \"directDebitPaymentMethodSpecificInput\" ]\n    }, {\n      \"properties\" : {\n        \"sepaDirectDebitPaymentMethodSpecificInput\" : {\n          \"$ref\" : \"definitions.json#/definitions/SepaDirectDebitPaymentMethodSpecificInput\"\n        }\n      },\n      \"required\" : [ \"sepaDirectDebitPaymentMethodSpecificInput\" ]\n    }, {\n      \"properties\" : {\n        \"redirectPaymentMethodSpecificInput\" : {\n          \"$ref\" : \"definitions.json#/definitions/RedirectPaymentMethodSpecificInput\"\n        }\n      },\n      \"required\" : [ \"redirectPaymentMethodSpecificInput\" ]\n    }, {\n      \"properties\" : {\n        \"cardPaymentMethodSpecificInput\" : {\n          \"$ref\" : \"definitions.json#/definitions/CardPaymentMethodSpecificInput\"\n        }\n      },\n      \"required\" : [ \"cardPaymentMethodSpecificInput\" ]\n    }, {\n      \"properties\" : {\n        \"invoicePaymentMethodSpecificInput\" : {\n          \"$ref\" : \"definitions.json#/definitions/InvoicePaymentMethodSpecificInput\"\n        }\n      },\n      \"required\" : [ \"invoicePaymentMethodSpecificInput\" ]\n    }, {\n      \"properties\" : {\n        \"bankTransferPaymentMethodSpecificInput\" : {\n          \"$ref\" : \"definitions.json#/definitions/BankTransferPaymentMethodSpecificInput\"\n        }\n      },\n      \"required\" : [ \"bankTransferPaymentMethodSpecificInput\" ]\n    }, {\n    } ]\n  } ]\n}"
    ],
    "required": true,
    "__METADATA__": {
      "primitiveValuesMeta": {
        "displayName": {
          "calculated": true
        },
        "required": {
          "insertedAsDefault": true
        }
      }
    }
  }
}

As you can see, the output from the parser contains $refs to both definitions.json and ../definitions.json. Is there a way for me to find out what these paths are relative to? If there is, I'd happily use that to dereference the JSON schemas myself, but as far as I can tell, the parser output does not provide me with such information.

I would like it if the parser did one of the following:

  1. dereference the JSON schemas, either automatically or after calling a function such as dereference()
  2. provide me with the path of the original JSON schema so that I can dereference it myself

For me, it would obviously be less work if the parser already dereferenced the schema for me. I understand if that would be out of scope, but providing me with the path would still be helpful.

coltonlw commented 7 years ago

@sichvoge The use case for involving the RAML parser in $ref resolution is so that when a jsonschema is loaded with "!include", the directory of the "!include" path is used as the initial resolution scope. It would be a convenience that saves the caller from inspecting the RAML AST, getting the directory, and passing it to the ref resolver. The ref resolver also needs to track the resolution scope and resolve nested $refs relative to the referring document, and not the initial scope of the !include directory. This is the behavior I see in the api-workbench. I haven't been able to nail down how exactly they're resolving $refs, but I think it involves the functions in raml-js-parser-2 I linked to in my earlier comment. https://github.com/mulesoft/api-workbench/issues/314
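The scope tracking described here can be sketched roughly as follows. This is an illustration only, not the api-workbench or raml-js-parser-2 implementation: every name (dirOf, joinPath, resolveAll) is hypothetical, the file system is replaced by an in-memory map, and all paths are assumed to be relative, POSIX-style paths.

```typescript
// Directory part of a POSIX-style path ("a/b/c.json" -> "a/b").
function dirOf(path: string): string {
  const i = path.lastIndexOf("/");
  return i < 0 ? "" : path.slice(0, i);
}

// Join a directory and a relative path, collapsing "." and ".." segments.
function joinPath(dir: string, rel: string): string {
  const out: string[] = [];
  const combined = dir === "" ? rel : dir + "/" + rel;
  combined.split("/").forEach((seg) => {
    if (seg === "" || seg === ".") return;
    if (seg === ".." && out.length > 0 && out[out.length - 1] !== "..") out.pop();
    else out.push(seg);
  });
  return out.join("/");
}

// In-memory stand-in for the file system: path -> parsed JSON document.
type Files = Record<string, unknown>;

// Visit every file-based $ref reachable from docPath. Each $ref is resolved
// against the directory of the document that contains it, so the scope
// changes as the walk moves from one file to the next.
function resolveAll(files: Files, docPath: string, seen: string[] = []): string[] {
  if (seen.indexOf(docPath) >= 0) return seen; // guard against reference cycles
  seen.push(docPath);
  const scope = dirOf(docPath);
  const visit = (node: unknown): void => {
    if (Array.isArray(node)) {
      node.forEach((item) => visit(item));
    } else if (node !== null && typeof node === "object") {
      const obj = node as Record<string, unknown>;
      Object.keys(obj).forEach((key) => {
        const value = obj[key];
        if (key === "$ref" && typeof value === "string") {
          const file = value.split("#")[0];
          // An empty file part means the $ref points into the same document.
          if (file !== "") resolveAll(files, joinPath(scope, file), seen);
        } else {
          visit(value);
        }
      });
    }
  };
  visit(files[docPath]);
  return seen;
}

// Made-up layout: the !include target sits in schemas/payment/.
const files: Files = {
  "schemas/payment/Request.json": {
    properties: { order: { $ref: "../definitions.json#/definitions/Order" } },
  },
  "schemas/definitions.json": {
    definitions: { Order: { properties: { amount: { $ref: "common/money.json#" } } } },
  },
  "schemas/common/money.json": { type: "object" },
};

const visited = resolveAll(files, "schemas/payment/Request.json");
console.log(visited);
```

Note that "common/money.json" is looked up under schemas/ (the directory of definitions.json, the referring document) and not under schemas/payment/ (the initial !include scope).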

coltonlw commented 7 years ago

Looked into this some more, here's what I see:

raml-js-parser-2 does include utilities for working with jsonschemas, including ref resolving. see util/schemaAsync https://github.com/raml-org/raml-js-parser-2/blob/master/src/util/schemaAsync.ts#L23 https://github.com/raml-org/raml-js-parser-2/blob/master/src/util/schemaAsync.ts#L11

This calls raml-definition-system.getSchemaUtils, which is just a passthrough that returns raml typesystem schemaUtil https://github.com/raml-org/raml-definition-system/blob/master/src/definitionSystem.ts#L5 https://github.com/raml-org/typesystem-ts/blob/master/src/schemaUtil.ts

After figuring out that schemaAsync from raml-js-parser-2 knows how to call code that resolves refs, I found that schemaAsync is required in jsyaml2lowLevel and called in two places:

https://github.com/raml-org/raml-js-parser-2/blob/master/src/raml1/jsyaml/jsyaml2lowLevel.ts#L14 https://github.com/raml-org/raml-js-parser-2/blob/master/src/raml1/jsyaml/jsyaml2lowLevel.ts#L1115 https://github.com/raml-org/raml-js-parser-2/blob/master/src/raml1/jsyaml/jsyaml2lowLevel.ts#L2209

I still could be missing something but I found these functions interesting and hopefully they help

sichvoge commented 7 years ago

@ddenisenko can you add more clarification here, and maybe an explicit example of how you dereference JSON refs? What's our opinion on the parser dereferencing refs in JSON schemas versus providing enough information so you can put something on top of the parser output?

I think @cascer1 got a very valid example.

ddenisenko commented 7 years ago

@cascer1 this sounds reasonable. At the least, adding a file path for the schema could provide a way to dereference a schema.

@coltonlw, @cascer1 We'll see if we can expose the dereferencing or provide some useful instructions.

sichvoge commented 7 years ago

@ddenisenko can you create the necessary issues for that, please? We can put them into one of our sprints at some point.

cascer1 commented 7 years ago

Is there an indication of when I can expect the path to the schema (or fully dereferenced schema) in the parser?

dreamflyer commented 7 years ago

A schemaPath property has been added for types that were loaded from JSON or XML schema files.

Example part of parser output:

{
  "MyType": {
    "name": "MyType",
    "displayName": "MyType",
    "schemaPath": "subdir/scheme.json",
    "type": [
      "{\n \"$schema\":\"http://json-schema.org/draft-04/schema\",\n \"type\": \"object\",\n \"required\":[\"parentName\"],\n \"properties\":{\n \"parentName\": {\"type\": \"string\"},\n \"child\": {\"$ref\": \"subdir/scheme.json#\"}\n }\n}\n"
    ],
    "required": true,
    "__METADATA__": {
      "primitiveValuesMeta": {
        "displayName": {
          "calculated": true
        },
        "required": {
          "insertedAsDefault": true
        }
      }
    }
  }
}

Also, a "dereference(schemaPath, jsonReference)" function was added in /raml-js-parser-2/src/schema.ts.

Example:

schema.dereference("/my/favorite/schema.json", "../another/schema.json#/definitions/myType");

output is: "/my/another/schema.json#/definitions/myType"
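For readers who want to see what this kind of resolution involves, below is a rough stand-alone sketch that reproduces the example above. It is a hypothetical re-implementation named dereferenceSketch, not the code in schema.ts, and the real function may treat edge cases (URLs, Windows paths, references with no file part) differently.

```typescript
// Hypothetical re-implementation for illustration; not the code in schema.ts.
// Resolves the file part of a JSON reference against the directory of the
// schema that contains it, and keeps the "#..." fragment unchanged.
function dereferenceSketch(schemaPath: string, jsonReference: string): string {
  const hashIndex = jsonReference.indexOf("#");
  const refFile = hashIndex < 0 ? jsonReference : jsonReference.slice(0, hashIndex);
  const fragment = hashIndex < 0 ? "" : jsonReference.slice(hashIndex);

  // Assumption: a reference with no file part points into the schema itself.
  if (refFile === "") return schemaPath + fragment;

  // An absolute reference ignores the schema's directory; so does any
  // reference when the schema path has no directory part at all.
  const slash = schemaPath.lastIndexOf("/");
  const combined =
    refFile.charAt(0) === "/" || slash < 0
      ? refFile
      : schemaPath.slice(0, slash) + "/" + refFile;

  // Collapse "." and ".." segments.
  const isAbsolute = combined.charAt(0) === "/";
  const out: string[] = [];
  combined.split("/").forEach((seg) => {
    if (seg === "" || seg === ".") return;
    if (seg === "..") {
      if (out.length > 0 && out[out.length - 1] !== "..") out.pop();
      else if (!isAbsolute) out.push("..");
    } else {
      out.push(seg);
    }
  });
  return (isAbsolute ? "/" : "") + out.join("/") + fragment;
}

// Same inputs as the example above.
const resolved = dereferenceSketch(
  "/my/favorite/schema.json",
  "../another/schema.json#/definitions/myType"
);
console.log(resolved); // "/my/another/schema.json#/definitions/myType"
```

With the schemaPath value from the parser output, a caller could pass each $ref found in the schema text through a function like this to learn which file it points to.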