Right now we're just using `File` for everything; however, it would be good to include more information, such as `format`. The trouble is that we don't have any ontological mappings for QIIME 2 types yet.
Something that seems super applicable from the spec:
> Reasoning about format compatibility must be done by checking that an input file format is the same, `owl:equivalentClass` or `rdfs:subClassOf` the format required by the input parameter.
> `owl:equivalentClass` is transitive with `rdfs:subClassOf`, e.g. if `<B> owl:equivalentClass <C>` and `<B> owl:subclassOf <A>` then infer `<C> owl:subclassOf <A>`.
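To make sure I'm reading that rule correctly, here is a minimal sketch of the check as I understand it. It is not how any CWL runner actually implements it. The function name, the `ex:` identifiers, and the plain lists of pairs standing in for an ontology are all assumptions for illustration.

```python
from collections import deque


def is_compatible(actual, required, subclass_of, equivalent):
    """Return True if `actual` is the same as, equivalent to, or a
    (transitive) subclass of `required`.

    `subclass_of` is an iterable of (child, parent) pairs and `equivalent`
    is an iterable of (a, b) pairs. Both are hypothetical stand-ins for
    triples that would really come from an RDF/OWL ontology.
    """
    neighbours = {}

    def link(src, dst):
        neighbours.setdefault(src, set()).add(dst)

    for child, parent in subclass_of:
        link(child, parent)  # subclass edges are only followed upwards
    for a, b in equivalent:
        link(a, b)  # equivalence is symmetric
        link(b, a)

    # Breadth-first search from the file's format towards the required one.
    seen = {actual}
    queue = deque([actual])
    while queue:
        current = queue.popleft()
        if current == required:
            return True
        for nxt in neighbours.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

With the example from the spec (`B` equivalent to `C`, `B` a subclass of `A`), a file tagged `ex:C` would satisfy an input requiring `ex:A`, but not the other way around.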
`rdfs:subClassOf` would be an ideal mechanism for describing inputs; however, it isn't clear to me how a given input file would be identified as a member of a format. Is this the job of the CWL runner to determine? QIIME 2 artifacts are very easy to introspect, so assuming an ontology in RDF/OWL format did exist, how would we perform identification of some arbitrary file?
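For context on what "easy to introspect" means, here is a rough sketch. It assumes the usual artifact layout: a zip archive with a single top-level directory holding a `metadata.yaml` that has flat `uuid`, `type`, and `format` keys. The function name is made up, and the line-based parsing is a shortcut to avoid a YAML dependency, not how QIIME 2 itself reads the file.

```python
import zipfile


def read_artifact_metadata(path):
    """Return the top-level keys of <root>/metadata.yaml inside a .qza/.qzv.

    Assumes the archive has one top-level directory containing
    metadata.yaml, and that the keys of interest are simple
    `key: value` lines.
    """
    with zipfile.ZipFile(path) as archive:
        for name in archive.namelist():
            parts = name.split("/")
            if len(parts) == 2 and parts[1] == "metadata.yaml":
                text = archive.read(name).decode("utf-8")
                break
        else:
            raise ValueError("no top-level metadata.yaml found in %r" % path)

    metadata = {}
    for line in text.splitlines():
        # Skip indented (nested) lines; split on the first colon only.
        key, sep, value = line.partition(":")
        if sep and not line.startswith(" "):
            metadata[key.strip()] = value.strip().strip("'\"")
    return metadata
```

The `type` value (e.g. `FeatureTable[Frequency]`) is what would need to map onto an ontology term. That mapping is the part that doesn't exist yet.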
Relatedly, while composing q2cwl tools should work pretty well as long as everything stays within .qza/.qzv files (which is what our end-users will expect), we also need to support exporting/importing (via more generated tools) for composition with other CWL tools. Ideally, formats will exist that make verifying workflows possible and discoverable.
I think what this might ultimately look like is: q2cwl-generated tools for some plugin accept an EDAM data type (with subclassing mechanisms to describe the semantic subtyping), and q2cwl import/export tools convert that data type to an EDAM format, which will hopefully play well with everything else that is similarly well-described.
x-ref: https://github.com/edamontology/edamontology/issues/365
cc @mr-c @matuskalas @johnbradley