biocompute-objects / BCO_Documentation

Repository for documentation to support the IEEE 2791-2020 standard. Please see our home page for communications/publications:
http://biocomputeobject.org/
BSD 3-Clause "New" or "Revised" License

[WIP] add json schema #34

Closed corburn closed 6 years ago

corburn commented 6 years ago

Following up on #5, this change introduces a draft BCO JSON schema.

The following script was used to test the schema against HCV1a.json:

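import argparse
import json
import os

import jsonref
from jsonschema import validate


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('json', type=argparse.FileType('r'), help="json to validate")
    parser.add_argument('schema', type=argparse.FileType('r'), help="root json schema to validate against")
    args = parser.parse_args()

    # Relative $ref values are resolved against the schema's directory, so the
    # schema path should be absolute (hence $PWD in the command below).
    base_uri = 'file://{}/'.format(os.path.dirname(args.schema.name))

    data = json.load(args.json)
    # jsonref replaces each $ref with the document it points to.
    schema = jsonref.load(args.schema, base_uri=base_uri, jsonschema=True)

    # validate() raises jsonschema.ValidationError if the data does not
    # conform to the schema and returns None otherwise.
    validate(data, schema)


if __name__ == "__main__":
    main()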

The following commands were used to run the script:

cd BCO_specification/
python -m venv env
source env/bin/activate
pip install jsonschema jsonref
python validate.py HCV1a.json $PWD/schemas/biocomputeobject.json
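
The `$PWD` prefix in the last command gives the script an absolute schema path, which the `file://` base URI depends on. As a minimal sketch, assuming only the standard library, the base URI could instead be derived from any path, relative or absolute. The helper name `schema_base_uri` is hypothetical and not part of the script above.

```python
from pathlib import Path


def schema_base_uri(schema_path):
    """Return a file:// URI, ending in a slash, for the schema's directory."""
    # resolve() makes the path absolute; as_uri() requires an absolute path.
    directory = Path(schema_path).resolve().parent
    uri = directory.as_uri()
    return uri if uri.endswith("/") else uri + "/"


if __name__ == "__main__":
    # A relative path works without prefixing $PWD on the command line.
    print(schema_base_uri("schemas/biocomputeobject.json"))
```

The returned string could be passed as `base_uri` to `jsonref.load` in place of the formatted string, though that substitution was not tested against this repository.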

The following definition for extension_domain failed to trigger validation errors when tested with the jsonschema script above. Either the library does not support propertyNames (test cases for it are sparse) or the definition itself is invalid.

"extension_domain": {
  "type": "object",
  "propertyNames": {
    "$comment": "Need to 'anyOf' these or else the enum and pattern conflict",
    "$comment": "https://github.com/json-schema-org/json-schema-spec/issues/214#issue-197927069",
    "anyOf": [
      {"enum": ["fhir_extension", "github_extension"]},
      {"pattern": "*_extension"}
    ]
  }
},
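
Two details of the definition above may be worth checking independently of the validator. This is a standard-library sketch only, not a diagnosis of the jsonschema behaviour: the repeated `$comment` key collapses to a single entry when parsed, and `*_extension` is a glob-style string that Python's `re` module refuses to compile. The replacement pattern `^.*_extension$` is an assumption about the intended meaning (names ending in `_extension`), and `names_not_matching` is a hypothetical stand-in for what `propertyNames` with `pattern` is meant to enforce.

```python
import json
import re

# Duplicate keys parse without error, but the last one wins, so only one
# "$comment" survives json.loads.
snippet = '{"$comment": "first", "$comment": "second"}'
parsed = json.loads(snippet)


def is_valid_regex(pattern):
    """Return True if Python's re module can compile the pattern."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


GLOB_STYLE = "*_extension"      # pattern as written in the draft definition
REGEX_STYLE = "^.*_extension$"  # assumed intent: names ending in _extension


def names_not_matching(obj, pattern):
    """List the keys of obj that the regular expression does not match."""
    compiled = re.compile(pattern)
    return [name for name in obj if not compiled.search(name)]


if __name__ == "__main__":
    print(parsed)
    print(is_valid_regex(GLOB_STYLE), is_valid_regex(REGEX_STYLE))
    print(names_not_matching({"fhir_extension": {}, "bogus": {}}, REGEX_STYLE))
```

Since `propertyNames` was introduced in JSON Schema draft-06, it may also be worth confirming which drafts the installed jsonschema release implements; a validator that predates the keyword would be expected to ignore it silently.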

corburn commented 6 years ago

@stain do you have recommendations for how to implement the extension_domain?

Currently I'm looking at propertyNames and haven't yet looked into the Hyper Schema or relative JSON pointers.


Commit f74abc2d8bb57a350222c2bbdf887e8928e99925 closed #21 by discarding validity constraints:

The extension_domain section is not evaluated by checks for BCO validity or computational correctness

mr-c commented 6 years ago

Here is a visualization of the schema so far: [image: screenshot from 2018-11-16 10-31-55]

corburn commented 6 years ago

[image: screen shot 2018-11-16 at 10 21 26 am]

HadleyKing commented 6 years ago

@corburn How long would it take you to finish a version of the schema without waiting for all of the discussion issues to be resolved? @rykahsay would like to implement a BCO tool he has started, based on your schema work so far, but he does not want to duplicate work or go in a different direction. He suggested that we define the remaining domains using what is in the spec so he can get his tool working (he will put it in a repo here once it is done). Alternatively, he said he can pull what you have and use that, if that would be easiest.

corburn commented 6 years ago

@HadleyKing the schema is current to the extent I can make it at this time.

mr-c commented 5 years ago

@corburn @HadleyKing Shall we transfer the remaining unchecked boxes to separate issues so they don't get lost?