To integrate schemasheets and linkml into your processes, you can follow this procedure:
Learn the schemasheets syntax, which in turn builds on linkml syntax, and port the contents of the current HTAN.model.csv file into a schemasheets-conformant CSV.
For example, an HTAN Assay would become a linkml class, and something like Assay Type would be a linkml slot on that class. Both can be represented conveniently in schemasheets syntax. In the same way that you enforce valid values in the HTAN schema, you can constrain the values an assay type can take using linkml enums.
Another example is the Patient type. That would be a linkml class, and a property (slot) on that class would be something like gender.
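To make the class/slot/enum mapping concrete, here is a minimal sketch that writes two small schemasheets-style sheets with Python's csv module. It assumes the layout described in the schemasheets documentation: a free-form header row, followed by a descriptor row that starts with ">" and maps each column to a linkml element. The column names, descriptions, and enum values are placeholders I made up for illustration and are not taken from HTAN.model.csv; please check the descriptor names against the schemasheets docs before relying on them.

```python
import csv
import io

# Row 1: human-readable column names (hypothetical choices).
# Row 2: descriptor row, starting with ">", mapping columns to linkml elements.
SCHEMA_ROWS = [
    ["record", "field", "range", "desc"],
    ["> class", "slot", "range", "description"],
    ["Assay", "", "", "An HTAN assay"],
    ["Assay", "assay_type", "AssayTypeEnum", "Type of assay performed"],
    ["Patient", "", "", "A patient"],
    ["Patient", "gender", "string", "Gender of the patient"],
]

# Separate sheet for the enum that constrains assay_type.
# The permissible values below are placeholders, not real HTAN values.
ENUM_ROWS = [
    ["valueset", "value", "desc"],
    ["> enum", "permissible_value", "description"],
    ["AssayTypeEnum", "", "Valid values for assay_type"],
    ["AssayTypeEnum", "ExampleAssayA", "Placeholder value"],
    ["AssayTypeEnum", "ExampleAssayB", "Placeholder value"],
]


def to_tsv(rows):
    """Serialize a list of rows as tab-separated text."""
    buf = io.StringIO()
    csv.writer(buf, delimiter="\t", lineterminator="\n").writerows(rows)
    return buf.getvalue()


schema_tsv = to_tsv(SCHEMA_ROWS)
enum_tsv = to_tsv(ENUM_ROWS)
print(schema_tsv)
print(enum_tsv)
```

A row with only the class column filled in declares the class itself, while a row with both class and slot filled in attaches that slot to the class.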
Pro: using schemasheets and linkml gives you support from developers across the linkml community to help with schema maintenance. These tools are already used in other NIH- and DOE-funded projects such as CCDH, NMDC, GA4GH, biolink, etc.
linkml has a number of generators baked into it, one of which is a JSON-LD generator. So essentially, you would be able to generate HTAN.model.jsonld using linkml.
In that workflow, the schemasheets-conformant CSV is first converted to HTAN.yaml, the linkml YAML data model, which is then converted to HTAN.model.jsonld. The first conversion would be taken care of by the schemasheets CLI commands, and the second, i.e., generation of JSON-LD, by the linkml generator gen-jsonld.
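The two-step conversion can be sketched as a small Python wrapper around the two CLIs. This is only an outline under a few assumptions: that sheets2linkml (from schemasheets) and gen-jsonld (from linkml) are installed and on the PATH, that both write their result to standard output, and that the input sheet is called HTAN.schemasheets.tsv, which is a file name I made up. The function defaults to a dry run that only returns the commands, so nothing is executed unless you ask for it.

```python
import subprocess
from pathlib import Path


def build_commands(sheet, yaml_out="HTAN.yaml", jsonld_out="HTAN.model.jsonld"):
    """Return (command, output_path) pairs for the two conversion steps."""
    return [
        # Step 1: schemasheets sheet -> linkml YAML model
        (["sheets2linkml", str(sheet)], Path(yaml_out)),
        # Step 2: linkml YAML model -> JSON-LD
        (["gen-jsonld", str(yaml_out)], Path(jsonld_out)),
    ]


def run_pipeline(sheet, dry_run=True):
    """Run both steps in order, redirecting each command's stdout to its output file.

    With dry_run=True (the default) nothing is executed and the planned
    steps are simply returned.
    """
    steps = build_commands(sheet)
    if not dry_run:
        for cmd, out_path in steps:
            with open(out_path, "w") as out:
                subprocess.run(cmd, stdout=out, check=True)
    return steps


# "HTAN.schemasheets.tsv" is a hypothetical input file name.
steps = run_pipeline("HTAN.schemasheets.tsv")
for cmd, out_path in steps:
    print(" ".join(cmd), ">", out_path)
```

Calling run_pipeline("HTAN.schemasheets.tsv", dry_run=False) would actually invoke the tools; check=True makes the wrapper stop at the first step that fails.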
Pro: this would let you take advantage of the other generators baked into the linkml generator framework, such as:
Python dataclasses, which would allow the HTAN schema to be published to PyPI as a package that users can import and use in their scripts
A JSON Schema generator, which can be invoked by running gen-json-schema
Various other artefacts such as Excel, ShEx, SHACL, etc.
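To give a feel for what a Python binding of the schema would offer to script authors, here is a hand-written approximation using only the standard library. It is not the output of linkml's Python generator (gen-python), whose generated classes build on linkml-runtime and look different; the class name, slot name, and enum values are the same illustrative placeholders as before and are not taken from the HTAN model.

```python
from dataclasses import dataclass
from enum import Enum


class AssayTypeEnum(Enum):
    # Placeholder values, not taken from the HTAN model.
    EXAMPLE_ASSAY_A = "ExampleAssayA"
    EXAMPLE_ASSAY_B = "ExampleAssayB"


@dataclass
class Assay:
    assay_type: AssayTypeEnum

    def __post_init__(self):
        # Coerce plain strings to the enum; invalid values raise ValueError.
        self.assay_type = AssayTypeEnum(self.assay_type)


# A valid value is accepted and converted to the enum member.
ok = Assay(assay_type="ExampleAssayA")

# A value outside the enum is rejected at construction time.
try:
    Assay(assay_type="NotAValidAssay")
    rejected = False
except ValueError:
    rejected = True

print(ok, rejected)
```

The point of the sketch is that the enum constraint travels with the class, so invalid values fail early in user scripts rather than at submission time.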
Note: Since I'm still subscribed to this repo, I receive email updates every now and then. I was seeing activity here and I thought I should create an issue to bring to your attention some tools that can simplify some of the processes. Please don't feel any pressure or obligation to jump on it with priority. I was just creating an issue since it might prove to be beneficial at some point in the future.
Thanks @sujaypatil96! I know this has also been raised with @milen-sage and the schematic team in schematic:#631, so I'll let this conversation carry on there.
Related: https://github.com/Sage-Bionetworks/schematic/issues/631
CC: @milen-sage @adamjtaylor @cmungall