VertebrateResequencing / wr

High performance Workflow Runner
GNU General Public License v3.0

Task Execution Service - GA4GH APIs #117

Open keiranmraine opened 6 years ago

keiranmraine commented 6 years ago

WR is effectively already a TES. It already exposes a REST API, but it doesn’t conform to the GA4GH TES schema.

https://github.com/ga4gh/task-execution-schemas

Currently there are very few (if any) easily deployed auto-scaling TES implementations and this could be a very good example of one.

sb10 commented 6 years ago

wr doesn't currently deal with input/output file specs. It would be nice to have, but that's a big chunk of work. It might be necessary to implement this for CWL anyway.

Their REST API isn't so great. Why is their cancel "endpoint" on the tasks endpoint? Anyway, implementing their REST API would be relatively trivial. It would be in addition to wr's current REST API, since theirs isn't as feature-rich as wr's.
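For reference, the route layout in question can be sketched as follows. This is a minimal sketch, assuming the paths from the TES schema as I understand them (`/v1/tasks`, `/v1/tasks/{id}` and `/v1/tasks/{id}:cancel`); `parse_tes_route` and the action names are hypothetical and are not part of wr or of any TES implementation.

```python
import re

# Paths as I understand them from the TES schema; the action names returned
# below are made up for this sketch.
_TASKS = re.compile(r"^/v1/tasks/?$")
_TASK = re.compile(r"^/v1/tasks/(?P<id>[^/:]+)$")
_CANCEL = re.compile(r"^/v1/tasks/(?P<id>[^/:]+):cancel$")


def parse_tes_route(method, path):
    """Map an HTTP method and path to an (action, task_id) pair, or None."""
    method = method.upper()
    if _TASKS.match(path):
        if method == "GET":
            return ("list_tasks", None)
        if method == "POST":
            return ("create_task", None)
        return None
    m = _CANCEL.match(path)
    if m:
        # Cancel is a POST on the task's own path plus a ":cancel" suffix.
        return ("cancel_task", m.group("id")) if method == "POST" else None
    m = _TASK.match(path)
    if m and method == "GET":
        return ("get_task", m.group("id"))
    return None
```

Because of the `:cancel` suffix, cancel shares a path prefix with the task resource, so a router has to special-case the suffix rather than dispatch on path segments and method alone.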

Is there anyone out there who has stated an interest in actually using this API? Is there any software already written that calls this API?

sb10 commented 6 years ago

Hmmm, would it still conform to their spec if an authorization header was required?

keiranmraine commented 6 years ago

I think engaging with them and asking the questions is a good idea. I don't believe the spec is final. I know the purpose was to develop Workflow Execution Service (WES) implementations that use TES services (so TES was defined first). The idea is that if you have a TES which supports auto-scaling in your environment, you can pair it with any WES and not worry about the complexities of CWL step interdependencies, as that's the WES's responsibility.

One example would be that it could be used as a TES for rabix executor, which would allow WR to be used for CWL workflows without being fully CWL compliant, as the rabix component manages the job-to-job interdependencies and optimisation of the DAG. This is a really nice option as it gives a light-touch development environment that can be used with small data (paired with rabix composer, a drag 'n' drop CWL builder).
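The division of labour described above can be sketched roughly like this: the workflow engine owns the DAG and only hands the task backend steps whose dependencies are already done. This is a minimal sketch, not how rabix or wr actually work; `run_workflow` and `submit_task` are hypothetical names, and `submit_task` is assumed to block until the task has finished.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def run_workflow(steps, deps, submit_task):
    """Submit steps to a task backend in dependency order.

    steps: dict of step name -> task description (opaque to the engine)
    deps: dict of step name -> set of step names it depends on
    submit_task: callable standing in for a TES "create task" call; it is
        assumed here to block until the task has finished.
    Returns the order in which steps were submitted.
    """
    sorter = TopologicalSorter({name: deps.get(name, set()) for name in steps})
    sorter.prepare()  # raises graphlib.CycleError if the DAG has a cycle
    submitted = []
    while sorter.is_active():
        # Everything returned by get_ready() has no unmet dependencies, so a
        # real engine could hand the whole batch to the backend in parallel.
        for name in sorted(sorter.get_ready()):
            submit_task(steps[name])
            submitted.append(name)
            sorter.done(name)
    return submitted
```

The backend never sees `deps`; it only receives independent tasks, which is why an auto-scaling TES can be paired with any engine that does the ordering.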

They have examples using funnel but WR would be far simpler as it will autoscale:

http://docs.rabix.io/setting-up-rabix-executor-with-a-tes-server

sb10 commented 6 years ago

Jeff Gentry noted that "There’s a GA4GH wide authn spec being developed and it’ll be used for all APIs." Will we need to wait for that?

sb10 commented 6 years ago

David Glazer noted that "it's very much okay for your implementation to require HTTPS and auth headers. (It's also okay for other implementations to be less strict, since the right answer depends on the environment.) As Jeff says there's an emerging GA4GH spec on auth specifics, which recommends a consistent way to use OIDC and OAuth."
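As a rough illustration of requiring an auth header in front of the TES routes, something like the following would do. This is a minimal sketch, assuming a bearer token in the `Authorization` header; `bearer_token`, `authorize` and the `is_valid_token` callback are hypothetical names, and the callback is only a placeholder for whatever OIDC/OAuth validation the eventual GA4GH auth spec recommends.

```python
def bearer_token(headers):
    """Return the bearer token from an Authorization header, or None.

    headers: mapping of header name -> value. Header names are compared
    case-insensitively, as HTTP requires.
    """
    for name, value in headers.items():
        if name.lower() != "authorization":
            continue
        scheme, _, token = value.strip().partition(" ")
        if scheme.lower() == "bearer" and token.strip():
            return token.strip()
    return None


def authorize(headers, is_valid_token):
    """Return 401 if the request lacks a valid token, otherwise 200.

    is_valid_token is a placeholder for real token validation.
    """
    token = bearer_token(headers)
    if token is None or not is_valid_token(token):
        return 401
    return 200
```

A client that sends no header simply gets a 401 before any TES handler runs, so the request and response bodies themselves can still follow the TES schema.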

kellrott said: "my team has been working on another TES implementation called Funnel ( https://github.com/ohsu-comp-bio/funnel ). For testing TES compatibility in the context of workflows, check out the TES instructions for Bunny (the Seven Bridges CWL runner) https://github.com/rabix/bunny/wiki/Setting-up-Rabix-Executor-with-a-TES-server and for Cromwell https://cromwell.readthedocs.io/en/develop/backends/TES/ There is also https://github.com/common-workflow-language/cwl-tes (but I don't think it supports object stores yet). Between your work, Funnel and TESK ( https://github.com/EMBL-EBI-TSI/TESK ) we would have about 3 different implementations covering different infrastructure deployments. We should probably start working together to come up with a more formal conformance suite."

Alex Buchanan said: "We have a test suite in Funnel (a TES implementation) here: https://github.com/ohsu-comp-bio/funnel/tree/master/tests

While those are Funnel specific currently, it's feasible to abstract away the Funnel-specific details into a more generic TES client interface. We also have CLI, Go, and Python clients, if you want.

https://github.com/ohsu-comp-bio/py-tes

https://github.com/ohsu-comp-bio/tes"
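To make the "generic TES client interface" idea concrete, a conformance check could be written against a small interface and then run against any implementation. This is a minimal sketch under stated assumptions: `TESClient`, `check_create_then_cancel` and `InMemoryTES` are hypothetical and are not the py-tes or Funnel APIs; the state names are those of the TES schema as I know it (v1.0); and the check assumes cancellation is visible immediately, which a real server may not guarantee.

```python
from typing import Protocol


class TESClient(Protocol):
    """Hypothetical minimal client interface; not the py-tes API."""

    def create_task(self, task: dict) -> str: ...
    def get_task(self, task_id: str) -> dict: ...
    def cancel_task(self, task_id: str) -> None: ...


# Task states as named in the TES schema (v1.0, to my knowledge).
TES_STATES = {
    "UNKNOWN", "QUEUED", "INITIALIZING", "RUNNING", "PAUSED",
    "COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED",
}


def check_create_then_cancel(client: TESClient) -> list:
    """Run one conformance scenario; return a list of failure messages."""
    failures = []
    task_id = client.create_task(
        {"executors": [{"image": "alpine", "command": ["sleep", "60"]}]}
    )
    if not task_id:
        failures.append("create_task returned an empty id")
        return failures
    state = client.get_task(task_id).get("state")
    if state not in TES_STATES:
        failures.append(f"unexpected state after create: {state!r}")
    client.cancel_task(task_id)
    # Assumption: cancellation is synchronous; a real suite would poll.
    state = client.get_task(task_id).get("state")
    if state != "CANCELED":
        failures.append(f"expected CANCELED after cancel, got {state!r}")
    return failures


class InMemoryTES:
    """Toy backend used only to exercise the check above."""

    def __init__(self):
        self._tasks = {}

    def create_task(self, task):
        task_id = f"task-{len(self._tasks) + 1}"
        self._tasks[task_id] = {**task, "id": task_id, "state": "QUEUED"}
        return task_id

    def get_task(self, task_id):
        return self._tasks[task_id]

    def cancel_task(self, task_id):
        self._tasks[task_id]["state"] = "CANCELED"
```

Running `check_create_then_cancel(InMemoryTES())` returns an empty list; an adapter wrapping wr, Funnel or TESK behind the same three methods could be passed in instead.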