brain-bican / models

BICAN data models
https://brain-bican.github.io/models/

A yaml version of the JSON schema CCN2 #2

Closed neurovium closed 5 months ago

neurovium commented 1 year ago

I restructured the JSON file, since the original was not written in a structured fashion.

djarecka commented 1 year ago

@neurovium - LinkML has its own syntax, and not all keywords from the JSON file can be used when writing a LinkML schema, e.g. allow_additional, minimum, enum, etc. You can test the schema by running one of the LinkML converters, e.g. gen-json-schema to recreate the JSON schema (these tools are not perfect yet, but they will definitely catch most of the issues).

neurovium commented 1 year ago

@djarecka, interesting. The YAML file was valid (tested here: https://onlineyamltools.com/validate-yaml and here: https://jsonformatter.org/yaml-to-jsonschema), but LinkML evidently has some peculiarities of its own. I changed minimum to minimum_value, and enum is now under the enumeration section. I still have no solution for expressing JSON's additionalProperties in LinkML, though. Even though general YAML-to-JSON converters recognize allow_additional as an equivalent flag, LinkML does not. I noticed that you also didn't include it in your version of the LinkML YAML. I am still trying to find a solution for this.
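The renames described in this comment can be sketched in code. This is only an illustration, not part of the pull request: property_to_slot is a hypothetical helper, the naming rule for generated enums is an assumption, and the type table covers just the basic JSON Schema types. It maps minimum/maximum to minimum_value/maximum_value and moves an inline enum into a separate enums mapping that the slot then references through range.

```python
# Hypothetical helper, for illustration only: translate one JSON Schema
# property into a LinkML-style slot dictionary.
JSON_TO_LINKML_RANGE = {
    "string": "string",
    "integer": "integer",
    "number": "float",
    "boolean": "boolean",
}


def property_to_slot(name, prop, enums):
    """Return a slot dict for `prop`; any inline enum is added to `enums`."""
    slot = {}
    if "description" in prop:
        slot["description"] = prop["description"]

    json_type = prop.get("type")
    if json_type == "array":
        # JSON Schema arrays become multivalued slots.
        slot["multivalued"] = True
        json_type = prop.get("items", {}).get("type", "string")

    if "enum" in prop:
        # Assumed naming rule: rank -> RankEnum.
        enum_name = name.title().replace("_", "") + "Enum"
        enums[enum_name] = {"permissible_values": {v: {} for v in prop["enum"]}}
        slot["range"] = enum_name
    else:
        slot["range"] = JSON_TO_LINKML_RANGE.get(json_type, "string")

    # JSON Schema's minimum/maximum are spelled differently in LinkML.
    if "minimum" in prop:
        slot["minimum_value"] = prop["minimum"]
    if "maximum" in prop:
        slot["maximum_value"] = prop["maximum"]
    return slot


enums = {}
score = property_to_slot(
    "similarity_score", {"type": "number", "minimum": 0, "maximum": 1}, enums
)
rank = property_to_slot(
    "rank", {"type": "string", "enum": ["leaf_node", "family", "genus"]}, enums
)
print(score)
print(rank, enums)
```

The resulting dictionaries would still need to be written out as YAML and run through gen-json-schema, which remains the real test of whether LinkML accepts them.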

P.S. Here is the output of "gen-json-schema ccn2_schema.yaml":

INFO:root:Importing linkml:types as /home/nima/anaconda3/lib/python3.9/site-packages/linkml_runtime/linkml_model/model/schema/types from source ccn2_schema.yaml
{
  "$defs": {
    "CellSetAccessionToCellMapping": {
      "additionalProperties": false,
      "description": "",
      "properties": {
        "cell_accessions": { "description": "List of cell set accession identifiers.", "items": { "type": "string" }, "type": "array" },
        "sample": { "description": "Cell sample identifier.", "type": "string" }
      },
      "required": [ "cell_accessions", "sample" ],
      "title": "CellSetAccessionToCellMapping",
      "type": "object"
    },
    "CrossTaxonomyMapping": {
      "additionalProperties": false,
      "description": "",
      "properties": {
        "cell_set_accession": { "description": "Primary identifier of the cell set. This field should be programmatically assigned, not edited.", "type": "string" },
        "cell_type_name": { "description": "The primary name/symbol to be used for the (provisional) cell type defined by this cell set. This is left optional, but is strongly encouraged for every node that is linked.", "type": "string" },
        "evidence_comment": { "description": "A free text description of the evidence supporting this mapping. If a similarity_score is include, please also include details of how this was calculated.", "type": "string" },
        "mapped_cell_set_accession": { "description": "The accession (ID) of a cell set in a second taxonomy that this cell set maps to.", "type": "string" },
        "mapped_cell_type_name": { "description": "The name of the cell type corresponding to the mapped_cell_set_accession.", "type": "string" },
        "provenance": { "description": "ORCID of the person doing the mapping using the syntax ORCID:0123-4567-890. Optionally include supporting publications using DOIs of the form doi:10.1126/journal.abj6641.", "type": "string" },
        "similarity_score": { "description": "A score recording the similarity between mapped nodes.", "maximum": 1, "minimum": 0, "type": "number" }
      },
      "required": [ "cell_set_accession", "cell_type_name", "evidence_comment", "mapped_cell_set_accession", "mapped_cell_type_name" ],
      "title": "CrossTaxonomyMapping",
      "type": "object"
    },
    "LocationMapping": {
      "additionalProperties": false,
      "description": "",
      "properties": {
        "cell_set_accession": { "description": "Primary identifier of the cell set. This field should be programmatically assigned, not edited.", "type": "string" },
        "cell_type_name": { "description": "The primary name/symbol to be used for the (provisional) cell type defined by this cell set. This is left optional, but is strongly encouraged for every node that is linked.", "type": "string" },
        "evidence_comment": { "description": "A free text description of the evidence supporting this mapping. If a similarity_score is include, please also include details of how this was calculated.", "type": "string" },
        "location_ontology_term_id": { "description": "The ID of an ontology term that refers to a brain region that this cell type is located in. Ideally this should be the ID of a term defined as a region in a standard atlas.", "type": "string" },
        "location_ontology_term_name": { "description": "Name of the term whose ID is recorded in the ontology_term_id field.", "type": "string" },
        "provenance": { "description": "ORCID of the person doing the mapping using the syntax ORCID:0123-4567-890. Optionally include supporting publications using DOIs of the form doi:10.1126/journal.abj6641.", "type": "string" },
        "supporting_data": { "description": "A link to data supporting this location mapping.", "type": "string" }
      },
      "required": [ "cell_set_accession", "cell_type_name", "location_ontology_term_id", "location_ontology_term_name", "provenance" ],
      "title": "LocationMapping",
      "type": "object"
    },
    "RankEnum": {
      "description": "",
      "enum": [ "leaf_node", "family", "genus" ],
      "title": "RankEnum",
      "type": "string"
    },
    "Taxonomy": {
      "additionalProperties": false,
      "description": "",
      "properties": {
        "cell_set_accession": { "description": "Primary identifier of the cell set. This field should be programmatically assigned, not edited.", "type": "string" },
        "cell_type_name": { "description": "The primary name/symbol to be used for the (provisional) cell type defined by this cell set. This is left optional, but is strongly encouraged for every node that is linked.", "type": "string" },
        "classification_comment": { "description": "A free text comment describing the evidence for this classification.", "type": "string" },
        "classification_provenance": { "description": "Either the DOI(s) of a supporting publication (in the form the form doi:10.1126/journal.abj6641) or the editor's ORCID (in the form: ORCID:01243-234-678). Multiple entries should be separated by a '|'.", "type": "string" },
        "classifying_ontology_term_id": { "description": "The ID of an ontology term that classifies the cell type defined by this node.", "type": "string" },
        "classifying_ontology_term_name": { "description": "The name of the ontology term in the classification_id column", "type": "string" },
        "description": { "description": "Optional free text description of the cluster. This could be particularly useful for describing the properties of cells clustered from techniques that provide data on morphology, function and connectivity, e.g. patch-seq & epi-retro-seq.", "type": "string" },
        "parent_cell_set_accession": { "description": "The cell set accession of the parent cell set in the taxonomy. This field should be programmatically assigned, not edited.", "type": "string" },
        "rank": { "$ref": "#/$defs/RankEnum", "description": "Algorithmically generated hierarchical taxonomies can be complex, with many nodes between root and leaf and branches of variable depth. To simplify this for display and discussion it can be useful to assign nodes to a 3 level hierarchy, with leaf nodes at the bottom." },
        "synonym_provenance": { "description": "Each entry in the synonyms field should have a corresponding entry here, either the DOI of a supporting publication (in the form the form doi:10.1126/journal.abj6641) or the editor's ORCID (in the form: ORCID:01243-234-678). Multiple entries should be separated by a '|'.", "type": "string" },
        "synonyms": { "description": "A list of alternative names for this cell type. Separate entries with a '|'. Do not use terms with a scope that is much narrower or broader than the cell type being described.", "type": "string" }
      },
      "required": [ "cell_set_accession", "classifying_ontology_term_name", "classification_provenance", "parent_cell_set_accession" ],
      "title": "Taxonomy",
      "type": "object"
    }
  },
  "$id": "CCN2",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "additionalProperties": true,
  "metamodel_version": "1.7.0",
  "title": "CCN2",
  "type": "object",
  "version": null
}
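To make the generated constraints concrete, here is a minimal sketch of what the CrossTaxonomyMapping definition in the output above implies for a record. The excerpt is copied from that output with the description fields omitted; check_instance is a hypothetical helper that covers only required, additionalProperties, minimum and maximum, and the sample record is invented for illustration. In practice a full JSON Schema validator, such as the jsonschema Python package, would be the better tool.

```python
import json

# CrossTaxonomyMapping as emitted by gen-json-schema above, with the
# "description" fields omitted for brevity.
CROSS_TAXONOMY_MAPPING = json.loads("""
{
  "additionalProperties": false,
  "properties": {
    "cell_set_accession": {"type": "string"},
    "cell_type_name": {"type": "string"},
    "evidence_comment": {"type": "string"},
    "mapped_cell_set_accession": {"type": "string"},
    "mapped_cell_type_name": {"type": "string"},
    "provenance": {"type": "string"},
    "similarity_score": {"maximum": 1, "minimum": 0, "type": "number"}
  },
  "required": ["cell_set_accession", "cell_type_name", "evidence_comment",
               "mapped_cell_set_accession", "mapped_cell_type_name"]
}
""")


def check_instance(instance, class_schema):
    """Return a list of problems found in `instance` (partial check only)."""
    problems = []
    props = class_schema.get("properties", {})

    for key in class_schema.get("required", []):
        if key not in instance:
            problems.append(f"missing required field: {key}")

    # additionalProperties: false means unknown keys are rejected.
    if class_schema.get("additionalProperties") is False:
        for key in instance:
            if key not in props:
                problems.append(f"unexpected field: {key}")

    for key, value in instance.items():
        rules = props.get(key, {})
        if not isinstance(value, (int, float)):
            continue
        if "minimum" in rules and value < rules["minimum"]:
            problems.append(f"{key} below minimum {rules['minimum']}")
        if "maximum" in rules and value > rules["maximum"]:
            problems.append(f"{key} above maximum {rules['maximum']}")
    return problems


# Invented record: all required fields present, one out-of-range score,
# one field the schema does not define.
record = {
    "cell_set_accession": "A",
    "cell_type_name": "B",
    "evidence_comment": "C",
    "mapped_cell_set_accession": "D",
    "mapped_cell_type_name": "E",
    "similarity_score": 1.5,
    "notes": "not in the schema",
}
print(check_instance(record, CROSS_TAXONOMY_MAPPING))
```

Note that the per-class definitions in the output carry "additionalProperties": false, while the top-level object carries "additionalProperties": true.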

djarecka commented 1 year ago

A YAML validator only checks the file against general format rules; it has no knowledge of how you will use the file. Each piece of software that relies on the content of the YAML file needs its own rules to be able to read and understand that content.
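The distinction can be illustrated with a small sketch. The dictionary below stands in for what any YAML parser would happily return for a well-formed file; the consumer-side check is a hypothetical helper, and KNOWN_SLOT_KEYS is only an illustrative subset of the slot keywords LinkML understands, not the full metamodel.

```python
# Illustrative subset of slot-level keywords; the real LinkML metamodel
# defines many more.
KNOWN_SLOT_KEYS = {
    "description",
    "range",
    "required",
    "multivalued",
    "identifier",
    "pattern",
    "minimum_value",
    "maximum_value",
}


def unknown_slot_keys(schema):
    """Map each slot name to the keys this (assumed) consumer does not know."""
    report = {}
    for slot_name, slot in schema.get("slots", {}).items():
        extra = sorted(set(slot or {}) - KNOWN_SLOT_KEYS)
        if extra:
            report[slot_name] = extra
    return report


# Well-formed as YAML, but "minimum" and "maximum" are JSON Schema
# spellings, not LinkML ones.
parsed = {
    "slots": {
        "similarity_score": {"range": "float", "minimum": 0, "maximum": 1},
        "rank": {"range": "RankEnum"},
    }
}
print(unknown_slot_keys(parsed))  # {'similarity_score': ['maximum', 'minimum']}
```

A generic YAML validator accepts this document because the syntax is fine; only a tool that knows the expected vocabulary can flag the two keys.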

neurovium commented 1 year ago

@djarecka, the current version, https://github.com/neurovium/models/blob/patch-1/ccn2_schema/ccn2_schema.yaml (same as patch 1 in the pull request), does meet the LinkML requirements, including those of "gen-json-schema".