Open uniqueg opened 2 years ago
If a workflow is a file then DRS would be usable for this TRS use case as it stands. DRS can handle any payload*
Versioning was/is a DRS concern, but you won't find much about it in the spec. That's intentional, as it was determined to be mostly a separate concern. There are ways it's dealt with, but detailing them is beyond a quick response like this.
*That's not to say there aren't still issues with how DRS indicates payloads of different types.
@ianfore Sure, we can discuss whether workflow_url should accept DRS URIs as well (should be fine for individual files), but I don't think they are a good alternative to TRS URIs here, and certainly not a reason not to support them. TRS and TRS URIs have been defined precisely for the purpose of accessing workflows and associated metadata, which is something that WES needs to do. They allow fetching all files associated with workflows, not just descriptors, as well as metadata, versioning etc. Sure, it's all possible to do that in DRS, too, but it's not an optimal fit, at least not right now, and it would likely require specific changes to DRS that may be undesirable or are, at the very least, far off on the horizon. TRS implementations have been around and in production for years.
@uniqueg I think this is currently left up to individual implementers to decide. I agree with you that TRS was specifically designed to solve these problems, so in an interconnected GA4GH world it makes sense that any WES should be able to accept a TRS URI as the workflow description. I think it also promotes sharing best-practice workflows where possible (i.e., validated Dockstore workflows) instead of always having to define them yourself. Building out the GA4GH ecosystem so that everything flows naturally into one another is a great idea.
As for DRS, I would say I do not see anything wrong with the idea conceptually, but maybe we can open another issue for that specifically?
Thanks @patmagee. Indeed, a WES can currently implement it, as we have done for WESkit. So is option (1) in the OP your preference? Because the rest of what you write leans more towards option (2), or even (3) as a perspective for a future major release?
Current situation
Similar to DRS URIs, TRS URIs have been proposed to be used as unique identifiers for resources on TRS services, which may include workflows (note the open PR for adding versioned TRS URIs to identify a specific tool/workflow version).
As TRS offers ways of fetching all files associated with a workflow (descriptors, test files, other files), passing a versioned TRS URI should be sufficient to enable a workflow engine to fetch a workflow from a TRS instance. To my knowledge, the current specs do not specifically forbid the use of trs:// scheme URLs/URIs, so the point of this issue is to discuss whether, in an effort to increase crosslinks between GA4GH Cloud API specs, we should specifically recommend or even mandate that WES implementations support TRS URIs.
Available options
I will start this discussion by adding some advantages/disadvantages for each scenario:
Of course, an option would be to recommend this in a future minor WES release, then mandate it in the next major release (which would be my own preference, and I'd be happy to provide a PR for recommending the use of TRS URIs once this issue has had some feedback or ideally consensus).
Implementations
For more context, WESkit (a WES implementation for Snakemake and Nextflow) is currently implementing this here. There is also a Python-based TRS client library that people may find useful if they want to implement TRS support in Python-based WES implementations (we may add a command-line version, too, if there's some demand).
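To make the idea more concrete, here is a minimal sketch of how a Python-based WES might turn a versioned TRS URI into a request URL for a workflow's files. It is not taken from WESkit or the TRS client library mentioned above. It assumes a URI layout of trs://<host>/<tool_id>/<version_id> (the versioned form is still under discussion), with percent-encoded IDs, HTTPS, and the default /ga4gh/trs/v2 base path. The names TrsUri, parse_trs_uri and files_url are hypothetical.

```python
from dataclasses import dataclass
from urllib.parse import quote, unquote, urlsplit


@dataclass(frozen=True)
class TrsUri:
    """Components of an (assumed) versioned TRS URI."""
    host: str
    tool_id: str
    version_id: str


def parse_trs_uri(uri: str) -> TrsUri:
    """Parse trs://<host>/<tool_id>/<version_id> (assumed layout)."""
    parts = urlsplit(uri)
    if parts.scheme != "trs" or not parts.netloc:
        raise ValueError(f"not a TRS URI: {uri!r}")
    # Assumption: exactly two path segments, each percent-encoded so that
    # IDs containing '/' or '#' survive as a single segment.
    segments = parts.path.strip("/").split("/")
    if len(segments) != 2 or not all(segments):
        raise ValueError(f"expected trs://<host>/<id>/<version>: {uri!r}")
    tool_id, version_id = (unquote(s) for s in segments)
    return TrsUri(parts.netloc, tool_id, version_id)


def files_url(trs: TrsUri, descriptor_type: str = "CWL") -> str:
    """Build the TRS v2 URL that lists all files of a workflow version.

    Assumes HTTPS and the default base path /ga4gh/trs/v2.
    """
    tool = quote(trs.tool_id, safe="")
    version = quote(trs.version_id, safe="")
    return (
        f"https://{trs.host}/ga4gh/trs/v2/tools/{tool}"
        f"/versions/{version}/{descriptor_type}/files"
    )


if __name__ == "__main__":
    # trs.example.org is a placeholder host, not a real TRS instance.
    parsed = parse_trs_uri("trs://trs.example.org/my-workflow/1.0.0")
    print(files_url(parsed))
```

A WES could apply something like this whenever workflow_url starts with trs://, then fetch the listed files before handing them to the workflow engine. How the descriptor type is chosen (here a plain argument) is left open.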