Duke-GCB / calrissian

CWL on Kubernetes
https://duke-gcb.github.io/calrissian/
MIT License

Allow requesting RO-Crate provenance (next cwltool release) #166

Open davidjsherman opened 5 months ago

davidjsherman commented 5 months ago

When we provide the results of Calrissian jobs as a deliverable, we include an RO-Crate that reports the provenance of the results produced by executing the CWL workflow. The provenance report is created by cwltool when the --provenancefolder option is specified.

Calrissian can pass the option through, but this currently fails: Calrissian calls cwltool.main directly with preparsed arguments only, and the current release of cwltool also requires the unparsed arguments when reporting provenance.
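To make the failure mode concrete, here is a minimal sketch using only argparse. It is a toy model, not cwltool's or Calrissian's actual code: `toy_main`, `build_parser`, and the `ValueError` are hypothetical stand-ins, and the only detail taken from this thread is that the entry point accepts both an unparsed list (`argsl`) and preparsed arguments (`args`).

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: it models only the option named in this thread.
    parser = argparse.ArgumentParser(prog="toy-runner")
    parser.add_argument("--provenancefolder", dest="provenance", default=None)
    parser.add_argument("workflow")
    return parser


def toy_main(argsl: Optional[List[str]] = None,
             args: Optional[argparse.Namespace] = None) -> int:
    """Toy entry point that accepts raw arguments, preparsed arguments, or both."""
    if args is None:
        if argsl is None:
            raise ValueError("need either argsl or args")
        args = build_parser().parse_args(argsl)
    if args.provenance and argsl is None:
        # The raw command line is wanted for the provenance record,
        # but a caller that only passed preparsed arguments never supplied it.
        raise ValueError("provenance requested but unparsed arguments are missing")
    return 0


# A caller that parses first and passes only the result hits the error
# as soon as provenance is requested.
preparsed = build_parser().parse_args(["--provenancefolder", "prov", "wf.cwl"])
try:
    toy_main(args=preparsed)
    failed = False
except ValueError:
    failed = True
```

In this toy model the same call succeeds when provenance is not requested, which is presumably why the problem only surfaces once the option is passed through.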

The requirement for unparsed arguments has been removed by common-workflow-language/cwltool#1964, and the change should be in the next cwltool release.

We could provide a PR with a unit test as soon as the release is available.

davidjsherman commented 5 months ago

Note that there isn't an easy workaround in Calrissian itself. Because of the way that parse_arguments handles defaults, the complete list of unparsed arguments isn't available to be passed as argsl to cwltool.main. We could lie and provide any nonempty list in argsl (cwltool only uses the list for logging), but that inconsistency would be revealed in the provenance logs.
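One general reason a parsed result cannot be turned back into the original command line is that defaults erase the difference between "not given" and "given with the default value". The sketch below shows this with plain argparse; the `--retries` option and the parser are hypothetical, and whether this is exactly how Calrissian's parse_arguments loses the information is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser with one defaulted option.
    parser = argparse.ArgumentParser(prog="toy-runner")
    parser.add_argument("--retries", type=int, default=0)
    parser.add_argument("workflow")
    return parser


parser = build_parser()

# Two different command lines...
explicit = parser.parse_args(["--retries", "0", "wf.cwl"])
implicit = parser.parse_args(["wf.cwl"])

# ...yield equal namespaces, so the original list cannot be recovered
# from the parsed result alone.
same_namespace = explicit == implicit
print(same_namespace)  # True
```

Any list rebuilt from the namespace would therefore be a guess, which is the kind of inconsistency a provenance log is meant to rule out.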

fabricebrito commented 5 months ago

Hi @davidjsherman, thanks for the heads-up. We've been following the cwltool issue and PR. Happy to merge when @mr-c releases cwltool and you're ready with the PR on Calrissian.

mr-c commented 5 months ago

Here's the new cwltool release; enjoy! https://pypi.org/project/cwltool/3.1.20240112164112/ & https://github.com/common-workflow-language/cwltool/releases/tag/3.1.20240112164112

davidjsherman commented 5 months ago

@fabricebrito would you prefer the Calrissian-specific unit test in a separate file, or folded into tests/test_job.py?

fabricebrito commented 3 months ago

@davidjsherman a separate file is perfect