ResearchObject / workflow-run-crate

Workflow Run RO-Crate profile
https://www.researchobject.org/workflow-run-crate/
Apache License 2.0

cwlprov_to_crate: support for nested workflows #22

Closed simleo closed 2 years ago

simleo commented 2 years ago

Workflows can run other workflows as subworkflows. In this case CWLProv outputs separate provenance documents, but such runs are not yet supported in cwlprov_to_crate. First, we need to add the capability to parse the provenance metadata in this scenario. Then there's the issue of adding subworkflow metadata to the RO-Crate. In the relationship graph, subworkflows need to appear in the same place as tool wrappers (i.e., as what's run by a step). Their type should be the same as the main workflow's, minus File, since they are stored as sections in packed.cwl:

["SoftwareSourceCode", "ComputationalWorkflow", "HowTo"]

Then we'd need to recursively convert all subworkflows as we did for the main one.
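
For illustration only, here is a minimal Python sketch of what such a recursive conversion could look like. It is not the code that ended up in cwlprov_to_crate. It assumes a packed document with a `$graph` list, a main workflow with id `#main`, and steps whose `run` field is a reference into the graph. The function names, the toy input, and the choice of `step`, `hasPart` and `workExample` to link entities are assumptions made for this example.

```python
# Types from the comment above; the main workflow additionally gets "File".
SUBWORKFLOW_TYPES = ["SoftwareSourceCode", "ComputationalWorkflow", "HowTo"]
MAIN_WORKFLOW_TYPES = ["File"] + SUBWORKFLOW_TYPES


def convert_process(proc_id, processes, entities, is_main=False):
    """Add an entity for a process (tool or workflow) and return its @id.

    Hypothetical helper: workflows are converted recursively, so that
    subworkflows end up where tool wrappers would be (what a step runs).
    """
    proc = processes[proc_id]
    entity_id = "packed.cwl" if is_main else "packed.cwl" + proc_id
    if entity_id in entities:
        # Already converted, e.g. a tool shared by several steps.
        return entity_id
    if proc["class"] != "Workflow":
        # Tool wrapper: modeled as a SoftwareApplication.
        entities[entity_id] = {"@id": entity_id, "@type": "SoftwareApplication"}
        return entity_id
    types = MAIN_WORKFLOW_TYPES if is_main else SUBWORKFLOW_TYPES
    entity = {"@id": entity_id, "@type": list(types), "hasPart": [], "step": []}
    entities[entity_id] = entity
    for step in proc["steps"]:
        # Recursion: the step target may itself be a workflow.
        target_id = convert_process(step["run"], processes, entities)
        step_id = "packed.cwl" + step["id"]
        entities[step_id] = {
            "@id": step_id,
            "@type": "HowToStep",
            "workExample": {"@id": target_id},
        }
        entity["step"].append({"@id": step_id})
        if {"@id": target_id} not in entity["hasPart"]:
            entity["hasPart"].append({"@id": target_id})
    return entity_id


def convert_packed(packed):
    """Convert a packed CWL document (as a dict) into a dict of entities."""
    processes = {p["id"]: p for p in packed["$graph"]}
    entities = {}
    convert_process("#main", processes, entities, is_main=True)
    return entities


# Toy packed document (made up for this example): main workflow -> subworkflow -> tool.
packed = {"$graph": [
    {"class": "Workflow", "id": "#main", "steps": [
        {"id": "#main/sort", "run": "#sorttool.cwl"},
        {"id": "#main/sub", "run": "#subwf.cwl"},
    ]},
    {"class": "Workflow", "id": "#subwf.cwl", "steps": [
        {"id": "#subwf.cwl/rev", "run": "#revtool.cwl"},
    ]},
    {"class": "CommandLineTool", "id": "#sorttool.cwl"},
    {"class": "CommandLineTool", "id": "#revtool.cwl"},
]}
crate_entities = convert_packed(packed)
```

In this sketch the `#main/sub` step points to `packed.cwl#subwf.cwl`, which carries the three workflow types without File, while the tools it runs are plain SoftwareApplication entities.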

One possibly weird consequence is that some of the workflow components would be of type SoftwareApplication (the tool wrappers) while others would be of type SoftwareSourceCode (the subworkflows). I guess the reason both types exist in Schema.org is that the former should model an executable, while the latter should represent code that needs to be compiled. With interpreted languages such as CWL (or Python, etc.), however, the source code is also runnable, so the distinction is not so meaningful.

simleo commented 2 years ago

Done in #28