Open mr-c opened 6 years ago
This is a clever usage of the feature that helps with the last leg of a workflow execution. I think it will require direct modification of the OpenAPI description.
I'm pro-provenance in general, but would need some more details on what this might look like in the spec. Tagging as a v2.0 candidate for now.
To be more specific, the returned value would be the IRI/URI to a v0.6.0 or newer CWLProv ResearchObject https://w3id.org/cwl/prov/0.6.0
@mr-c to clarify for my sake -- your suggestion enables something like: 1) I'm doing a status check on a workflow 2) I see the status is a failure 3) Linked to the status through "provenance-uri" is the actual http link to the raw workflow definition associated to the failure -- which I can go to investigate? 4) ....
@ruchim Almost, for step 3 the URI points to a CWLProv document that would give detailed information, not the workflow definition (which one assumes the caller of the API already has a copy of)
ahhh! is the provenance-uri the same as stdout/stderr logs -- or something else? also, thanks so much for the quick response, really appreciate it.
The provenance-uri would point to a CWLProv format document which would contain structured data including raw logs, server information, etc. 🙂👍
I think from https://www.w3.org/TR/prov-aq/#resource-accessed-by-http in PROV we kind of allowed any kind of provenance document, although one containing PROV in one of the several formats would be preferable.
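For anyone following along, here is a minimal sketch of how a client might pick up the provenance link that the PROV-AQ section above describes, i.e. an HTTP `Link` header whose relation type is `http://www.w3.org/ns/prov#has_provenance`. The function name `provenance_links` and the example.org URLs are made up for illustration, and the regex-based parsing is a simplification that only handles straightforward header values.

```python
import re

# Link relation defined by PROV-AQ for "resource accessed by HTTP".
PROV_REL = "http://www.w3.org/ns/prov#has_provenance"

# One link-value: <target> followed by zero or more ;name=value parameters.
_LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+?=(?:"[^"]*"|[^;,]*))*)')
# One parameter: ;name="quoted value" or ;name=bare-value
_PARAM_RE = re.compile(r';\s*([\w*-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]*))')


def provenance_links(link_header):
    """Return the target URIs whose rel includes the PROV-AQ has_provenance relation.

    Hypothetical helper: a simplified parser, not a full RFC 8288 implementation.
    """
    found = []
    for target, params in _LINK_RE.findall(link_header):
        rels = []
        for name, quoted, bare in _PARAM_RE.findall(params):
            if name.lower() == "rel":
                # rel may carry several space-separated relation types
                rels.extend((quoted or bare).split())
        if PROV_REL in rels:
            found.append(target)
    return found


# Example header value (assumed URLs, not from any real WES server)
example = (
    '<https://wes.example.org/runs/123/prov>; '
    'rel="http://www.w3.org/ns/prov#has_provenance"; '
    'anchor="https://wes.example.org/runs/123", '
    '<https://wes.example.org/runs?page=2>; rel="next"'
)
links = provenance_links(example)
```

Since PROV-AQ allows any kind of provenance document behind that link, a client would still need to check the content type of whatever it fetches.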
I would not require all of CWLProv - that is kind of the inside of the workflow and could be exposed as well, if present, as a Research Object BagIt archive (as it would be multiple files) or as a directory of files exposed through the WES. There is no single "CWLProv document" as such: we have both primary.cwlprov.* in multiple serializations and metadata/manifest.json, which types and links to all the other files.
Perhaps WES would have its own "outer" provenance that just says when the workflow job started/stopped and ideally links to its outputs?
Cool, did some reading on the links to catch up -- and thanks to Jeff Gentry for explaining CWLProv a little more deeply. My own thoughts are that these are really good best practices from the perspective of leveraging features of the HTTP spec. If I put myself in the mindset of someone who runs workflows a lot, I'd absolutely need logs of my workflow run and a link to those logs (whether it looks like a provenance object or not), and I'd expect that link directly in the API response (not just in a header, which I may never even know to check as a comp bio person rather than a software engineer). So I see the logs as a necessity in the spec and the provenance_uri as a bonus/competitive feature for providing structured details for debugging/tracking.
So it sounds like this is something to add to the WES documentation as a recommendation rather than a spec change. Let me know if I misunderstood!
just poking @stain @mr-c @david4096 for any opinions to my comment above!
@ruchim Yep, it can go in the header and in the body of the response, agreed!
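To make "in the header and in the body" concrete, here is a rough sketch of what a run status response could carry. It assumes a body field named `provenance_uri` (the name used earlier in this thread, not part of the WES spec) alongside the usual `run_id` and `state`, and a PROV-AQ style `Link` header. The helper name `run_status_response` and the URLs are hypothetical.

```python
import json

# Link relation defined by PROV-AQ for "resource accessed by HTTP".
PROV_REL = "http://www.w3.org/ns/prov#has_provenance"


def run_status_response(run_id, state, provenance_uri):
    """Build (headers, body) for a run status reply that exposes provenance twice.

    Hypothetical helper: the provenance_uri body field is an assumption,
    not something the WES spec currently defines.
    """
    body = {
        "run_id": run_id,
        "state": state,
        # Body copy, so callers who never inspect headers still see the link.
        "provenance_uri": provenance_uri,
    }
    headers = {
        "Content-Type": "application/json",
        # Header copy, following the PROV-AQ HTTP mechanism.
        "Link": f'<{provenance_uri}>; rel="{PROV_REL}"',
    }
    return headers, json.dumps(body)


headers, body = run_status_response(
    "123",
    "EXECUTOR_ERROR",
    "https://wes.example.org/runs/123/prov",  # assumed URL for illustration
)
```

Putting the link in both places keeps the HTTP-level convention for generic PROV tooling while making it visible to people who only read the JSON.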
excellent, I'll mark this as a documentation change.
https://www.w3.org/TR/prov-aq/#resource-accessed-by-http
Idea is from @stain