qiime2 / q2cwl

Prototype interface for automatically generating CWL tools from QIIME 2 actions
BSD 3-Clause "New" or "Revised" License

Handle input/output files more idiomatically. #4

Open ebolyen opened 5 years ago

ebolyen commented 5 years ago

Right now we're just using File for everything, but it would be good to include more information, like format. The trouble is that we don't have any ontological mappings for QIIME 2 types yet.

x-ref: https://github.com/edamontology/edamontology/issues/365

Something that seems super applicable from the spec:

Reasoning about format compatibility must be done by checking that an input file format is the same, owl:equivalentClass or rdfs:subClassOf the format required by the input parameter. owl:equivalentClass is transitive with rdfs:subClassOf, e.g. if <B> owl:equivalentClass <C> and <B> owl:subclassOf <A> then infer <C> owl:subclassOf <A>.
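
To make the quoted rule concrete, here is a minimal sketch of the compatibility check in Python. It is not how any particular CWL runner implements it; the function name, the `ex:` identifiers, and the plain lists of pairs standing in for an RDF/OWL ontology are all assumptions made for illustration.

```python
from collections import deque


def is_compatible(input_format, required_format, subclass_of, equivalent):
    """Return True if input_format satisfies required_format.

    subclass_of: iterable of (child, parent) pairs (rdfs:subClassOf).
    equivalent:  iterable of (a, b) pairs (owl:equivalentClass).
    Both are hypothetical stand-ins for a real ontology.
    """
    # Equivalence is symmetric, so index it in both directions.
    equiv = {}
    for a, b in equivalent:
        equiv.setdefault(a, set()).add(b)
        equiv.setdefault(b, set()).add(a)

    # Subclassing is directed: we may only walk from child to parent.
    parents = {}
    for child, parent in subclass_of:
        parents.setdefault(child, set()).add(parent)

    # Breadth-first search upward from the input format.
    seen = {input_format}
    queue = deque([input_format])
    while queue:
        current = queue.popleft()
        if current == required_format:
            return True
        for nxt in equiv.get(current, set()) | parents.get(current, set()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# The example from the quoted text, with made-up identifiers:
# B equivalentClass C and B subClassOf A, so C is acceptable where A is required.
subclass_of = [("ex:B", "ex:A")]
equivalent = [("ex:B", "ex:C")]
print(is_compatible("ex:C", "ex:A", subclass_of, equivalent))
print(is_compatible("ex:A", "ex:C", subclass_of, equivalent))
```

The search only ever moves from a format to its equivalents or its parents, so a more general format is never accepted where a more specific one is required.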

subClassOf would be an ideal mechanism for describing inputs, but it isn't clear to me how a given input file would be identified as a member of a format. Is this the job of the CWL runner to determine? QIIME 2 artifacts are very easy to introspect, so assuming an ontology in RDF/OWL format did exist, how would we perform identification of some arbitrary file?
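
As a rough illustration of the "easy to introspect" point: a .qza/.qzv file is a zip archive whose single top-level directory contains a metadata.yaml recording the artifact's uuid, type, and format. The sketch below reads those fields using only the standard library. The function name is hypothetical, and the line-based parsing assumes the file is a flat list of `key: value` pairs; a real tool would use a YAML parser or the QIIME 2 API.

```python
import zipfile


def read_artifact_metadata(archive):
    """Return the uuid/type/format fields from a .qza/.qzv archive.

    `archive` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        # Only consider <root>/metadata.yaml, not nested files such as
        # <root>/provenance/metadata.yaml.
        candidates = [
            name for name in zf.namelist()
            if name.count("/") == 1 and name.endswith("/metadata.yaml")
        ]
        if not candidates:
            raise ValueError("no top-level metadata.yaml found")
        text = zf.read(candidates[0]).decode("utf-8")

    metadata = {}
    for line in text.splitlines():
        # Naive "key: value" split; assumes a flat YAML mapping.
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in ("uuid", "type", "format"):
            metadata[key] = value.strip().strip("'\"")
    return metadata
```

Calling `read_artifact_metadata("table.qza")` on some artifact (the filename is made up) would give a dict with the semantic type and directory format as strings. Mapping those strings to ontology identifiers is the part that does not exist yet.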


Relatedly, while composing q2cwl tools should work pretty well with .qza/.qzv files (which is what our end-users will expect), we also need to support exporting/importing (via more generated tools) for composition with other CWL tools. Ideally, formats will exist that make verifying workflows possible and discoverable.

I think ultimately this might look like the following: q2cwl-generated tools for some plugin will accept an EDAM data type (with subclassing mechanisms to describe the semantic subtyping), and q2cwl import/export tools will convert that data type to an EDAM format, which will hopefully play well with everything else that is similarly well described.

cc @mr-c @matuskalas @johnbradley

mr-c commented 5 years ago

Users (perhaps assisted by software) identify the format of Files upon submission: http://www.commonwl.org/user_guide/16-file-formats/index.html
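
In practice this means the job input object carries a `format` field alongside the file path. Below is a minimal sketch that builds such an input object in Python and prints it as JSON. The input name `table`, the filename, and the `https://example.org/...` identifier are all made up, since no published ontology term for QIIME 2 types is referenced in this thread.

```python
import json


def file_input(path, format_iri):
    """Build a CWL File input object that declares its own format."""
    return {"class": "File", "path": path, "format": format_iri}


# Hypothetical job: one input named "table" pointing at a QIIME 2 artifact.
job = {
    "table": file_input(
        "table.qza",
        "https://example.org/q2-formats#FeatureTableFrequency",  # placeholder IRI
    )
}

print(json.dumps(job, indent=2))
```

The "assisted by software" part could be a small helper that inspects the artifact and fills in `format` automatically, so users do not have to look up identifiers by hand.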