openmhealth / schemas

A repository of Open mHealth schemas.
Apache License 2.0

schemas are missing id definition #6

Open chicco785 opened 7 years ago

chicco785 commented 7 years ago

To be correctly URI-referenceable from other schemas, every schema should have an id.

For example, including the id "http://www.openmhealth.org/schema/omh/body-weight-1.0.json" in the body-weight-1.0.json schema would allow mass-unit-value-1.x.json to be resolved correctly, so validators (e.g. http://www.jsonschemavalidator.net) would consider the schema valid when it is referenced remotely from other schemas:

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "id": "http://www.openmhealth.org/schema/omh/body-weight-1.0.json",
  "description": "This schema represents a person's body weight, either a single body weight measurement, or for the result of aggregating several measurements made over time (see Numeric descriptor schema for a list of aggregate measures)",
  "type": "object",
  "references": [
    {
      "description": "The SNOMED code represents Body weight measure (observable entity)",
      "url": "http://purl.bioontology.org/ontology/SNOMEDCT/363808001"
    }
  ],
  "definitions": {
    "mass_unit_value": {
      "$ref": "mass-unit-value-1.x.json"
    },
    "time_frame": {
      "$ref": "time-frame-1.x.json"
    },
    "descriptive_statistic": {
      "$ref": "descriptive-statistic-1.x.json"
    }
  },
  "properties": {
    "body_weight": {
      "$ref": "#/definitions/mass_unit_value"
    },
    "effective_time_frame": {
      "$ref": "#/definitions/time_frame"
    },
    "descriptive_statistic": {
      "$ref": "#/definitions/descriptive_statistic"
    }
  },
  "required": [
    "body_weight"
  ]
}
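
To illustrate why the id matters, here is a minimal Python sketch of how a validator would typically resolve the relative $ref values above, assuming it treats the id as the base URI and applies standard relative-reference resolution. The helper name `resolve_ref` is hypothetical and not part of any validator's API.

```python
from urllib.parse import urljoin


def resolve_ref(base_id, ref):
    """Resolve a $ref against the schema's id, used as the base URI."""
    return urljoin(base_id, ref)


base_id = "http://www.openmhealth.org/schema/omh/body-weight-1.0.json"

# Relative file references become absolute URIs next to the base schema;
# fragment-only references stay inside the base schema itself.
for ref in ("mass-unit-value-1.x.json",
            "time-frame-1.x.json",
            "#/definitions/time_frame"):
    print(ref, "->", resolve_ref(base_id, ref))

# Without an id there is no base, so the reference stays relative and a
# remote validator has nothing to resolve it against.
print(resolve_ref("", "mass-unit-value-1.x.json"))
```

With the id in place, each sibling schema gets an unambiguous absolute location; without it, the validator can only guess where the referenced file lives.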

emersonf commented 7 years ago

Thanks for filing this @chicco785.

We've looked into adding IDs. At first glance, adding the property causes validation to fail on work-in-progress schemas which may not yet be pushed to external repos. The validator we use hasn't been updated in a few years, and doesn't allow turning off using the $id as a base URL so we'll need to dig deeper to find a solution that doesn't break our workflow.

chicco785 commented 7 years ago

Hi @emersonf, I think you can solve the work-in-progress problem by passing the schemas directly to the validator, so that no URI dereferencing is needed. I am using ajv in our continuous integration, based on Travis CI and GitHub. It works well and is well maintained.

emersonf commented 7 years ago

@chicco785 the WIP issue occurs when WIP schema A contains a $ref to WIP schema B, which happens quite often. If A contains an $id, our validator tries to find $id/B, even though B hasn't been released, and fails. Our options are to remove $id whenever A is being worked on, or to use a different validator.

Thanks for the pointer to ajv. Do you know if it can be configured to ignore $id at runtime? That would solve the issue. That said, our validation tooling is in Java (you can take a look, it's part of this repo), so it might be difficult to integrate.

chicco785 commented 7 years ago

@emersonf, I think you can take two approaches with ajv: 1 - ask it to ignore missing referenced schemas (--missing-refs=true/ignore/fail); 2 - load the referenced schema manually with the -r option. Suppose you want to validate schema http://my-schema/a, which contains a reference to http://my-schema/b, and you pass the file whose id is http://my-schema/b with the -r option. In this case, before trying to retrieve a schema from http://my-schema/b, ajv checks the ids of the schemas passed as files. This will allow you to validate your WIP schemas without giving up on schema references.
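
The second approach can be pictured as a lookup in a local registry before any network access. The Python sketch below illustrates that idea only; it is not ajv's implementation, and `build_registry`, `lookup`, and `MissingRefError` are hypothetical names.

```python
from urllib.parse import urldefrag, urljoin


class MissingRefError(Exception):
    """Raised when a reference is neither pre-loaded nor fetchable."""


def build_registry(schemas):
    """Index pre-loaded schemas by their id."""
    return {schema["id"]: schema for schema in schemas if "id" in schema}


def lookup(registry, base_id, ref, fetch=None):
    """Resolve ref against base_id, preferring pre-loaded schemas.

    `fetch` is an optional callable taking a URI; it is only used
    when the target schema is not in the registry.
    """
    uri, _fragment = urldefrag(urljoin(base_id, ref))
    if uri in registry:
        return registry[uri]
    if fetch is not None:
        return fetch(uri)
    raise MissingRefError(uri)


# Two work-in-progress schemas, neither published yet.
schema_a = {"id": "http://my-schema/a", "properties": {"b": {"$ref": "http://my-schema/b"}}}
schema_b = {"id": "http://my-schema/b", "type": "string"}

registry = build_registry([schema_a, schema_b])
resolved = lookup(registry, schema_a["id"], schema_a["properties"]["b"]["$ref"])
print(resolved is schema_b)
```

Because the lookup is keyed by id, schema A can keep its id and its reference to B while both are still unpublished.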

chicco785 commented 7 years ago

I had a look at https://github.com/openmhealth/schemas/tree/master/test-data-validator, but I am not sure you are doing anything more than validating schemas against JSON files. If that's the only thing that code does, you can probably replace it completely with a Node.js script, without changing the structure of your test-data folder at all (though of course this should be tested). The question is whether and how the test code is integrated with other tools. If you are interested only in the validation report, that should be quite trivial.

In my experience, it is also very nice to integrate with Travis CI, including for pull requests: e.g. a pull request can only be accepted if all tests pass.