Duke-GCB / calrissian

CWL on Kubernetes
https://duke-gcb.github.io/calrissian/
MIT License

Allow requesting RO-Crate provenance (next cwltool release) #166

Open davidjsherman opened 10 months ago

davidjsherman commented 10 months ago

When we provide the results of Calrissian jobs as a deliverable, we include an RO-Crate that reports the provenance of the results produced by executing the CWL workflow. The provenance report is created by cwltool when the --provenance option is specified with an output folder.

Calrissian can pass the option through, but this currently fails: Calrissian calls cwltool.main directly with preparsed arguments only, and the current release of cwltool also requires the unparsed arguments when reporting provenance.
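To make the failure mode concrete, here is a minimal self-contained sketch. It does not import cwltool: `tool_main` is a hypothetical stand-in that only models the behaviour described above (an entry point that accepts raw arguments `argsl` and/or a preparsed namespace `args`, and refuses to report provenance without the raw list). The error message and the helper `run_preparsed` are illustrative assumptions, not cwltool or Calrissian code.

```python
import argparse
from typing import List, Optional


def tool_main(argsl: Optional[List[str]] = None,
              args: Optional[argparse.Namespace] = None) -> int:
    """Hypothetical stand-in for a main() that takes raw and/or preparsed arguments."""
    if args is None:
        # No preparsed namespace supplied: parse the raw list ourselves.
        parser = argparse.ArgumentParser()
        parser.add_argument("--provenance", default=None)
        args = parser.parse_args(argsl or [])
    # Assumed model of the pre-fix behaviour: provenance reporting
    # also needs the raw (unparsed) argument list.
    if args.provenance and argsl is None:
        raise ValueError("provenance requested but no unparsed arguments given")
    return 0


def run_preparsed(provenance: Optional[str]) -> str:
    """Call the stand-in the way the issue describes: preparsed arguments only."""
    args = argparse.Namespace(provenance=provenance)
    try:
        tool_main(args=args)
        return "ok"
    except ValueError as exc:
        return f"failed: {exc}"
```

In this model, `run_preparsed(None)` succeeds, while `run_preparsed("prov/")` fails because only the preparsed namespace is supplied.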

The requirement for unparsed arguments has been removed by common-workflow-language/cwltool#1964, and the change should be in the next cwltool release.

We could provide a PR with a unit test as soon as the release is available.

davidjsherman commented 10 months ago

Note that there isn't an easy workaround in Calrissian itself. Because of the way that parse_arguments handles defaults, the complete list of unparsed arguments isn't available to be passed as argsl to cwltool.main. We could lie and provide any nonempty list in argsl (cwltool only uses the list for logging), but that inconsistency would be revealed in the provenance logs.
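As a rough illustration of why a parsed namespace cannot be turned back into the original argument list, consider this standard-library sketch. The options `--max-ram` and `--provenance` and their defaults are chosen only for the example; whether Calrissian's parse_arguments loses information in exactly this way is an assumption.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical options and defaults, for illustration only.
    parser = argparse.ArgumentParser()
    parser.add_argument("--max-ram", default="8G")
    parser.add_argument("--provenance", default=None)
    return parser


def parse(argv: List[str]) -> argparse.Namespace:
    return build_parser().parse_args(argv)


# Two different command lines...
explicit = parse(["--max-ram", "8G", "--provenance", "prov/"])
implicit = parse(["--provenance", "prov/"])

# ...yield equal namespaces once defaults are filled in, so the
# namespace alone cannot say which list the user actually typed.
same_namespace = explicit == implicit
```

Because both invocations produce the same namespace, any argument list rebuilt from it would be a guess, which is the inconsistency that would surface in the provenance logs.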

fabricebrito commented 10 months ago

Hi @davidjsherman, thanks for the heads-up. We've been following the cwltool issue and PR. Happy to merge when @mr-c releases cwltool and you're ready with the PR on calrissian.

mr-c commented 10 months ago

Here's the new cwltool release; enjoy! https://pypi.org/project/cwltool/3.1.20240112164112/ & https://github.com/common-workflow-language/cwltool/releases/tag/3.1.20240112164112

davidjsherman commented 10 months ago

@fabricebrito would you prefer the Calrissian-specific unit test in a separate file, or folded into tests/test_job.py?

fabricebrito commented 7 months ago

@davidjsherman a separate file is perfect

fabricebrito commented 2 months ago

@davidjsherman is this issue still relevant?

Linked to this issue, I've submitted a query here https://cwl.discourse.group/t/cwltool-provenance-worfklow-cwl-main-vs-workflow-cwl/948

davidjsherman commented 2 months ago

@fabricebrito Yes, but the PR has unfortunately been sitting in our backlog. I'll see whether I can nudge it along.