crim-ca / weaver

Weaver: Workflow Execution Management Service (EMS); Application, Deployment and Execution Service (ADES); OGC API - Processes; WPS; CWL Application Package
https://pavics-weaver.readthedocs.io
Apache License 2.0

Command line CWL runner #363

Closed fmigneault closed 2 years ago

fmigneault commented 2 years ago

Description

Weaver should provide a CLI CWL runner.

Its job would consist of the following steps:

  1. deploy the CWL file(s) as Application Package(s) into WPS/OGC-API process(es) (several if the main file is class: Workflow)
  2. execute the relevant process as job
  3. monitor the job execution until completion
  4. return the results (either the direct JSON response or formatted as CWL outputs)

These are essentially the same operations that are defined in the current Workflow tests. The required changes are to decouple the operations from the pytest structure and add an argparse/client class instead.
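A minimal sketch of how the four steps could be wired together once decoupled from pytest. Everything here is an assumption for illustration: `FakeClient`, `run_cwl`, `build_parser`, and the option names are hypothetical and are not Weaver's actual API. The status strings follow the job status names used by OGC API - Processes; a real client would make HTTP requests where the fake one returns canned values.

```python
import argparse
import time


class FakeClient:
    """In-memory stand-in for a real HTTP client (hypothetical, illustration only)."""

    def __init__(self, polls_before_done=2):
        self._polls_left = polls_before_done

    def deploy(self, cwl_path):
        # A real client would send the CWL as an Application Package to the service.
        return "process-for-" + cwl_path

    def execute(self, process_id, inputs):
        # A real client would submit an execution request and receive a job reference.
        return "job-1"

    def status(self, job_id):
        # Report "running" a few times, then "successful".
        if self._polls_left > 0:
            self._polls_left -= 1
            return "running"
        return "successful"

    def results(self, job_id):
        return {"output": "done"}


def run_cwl(client, cwl_path, inputs, interval=1.0, timeout=60.0):
    """Deploy, execute, monitor until completion, then return the results."""
    process_id = client.deploy(cwl_path)          # step 1: deploy
    job_id = client.execute(process_id, inputs)   # step 2: execute
    deadline = time.monotonic() + timeout
    status = client.status(job_id)                # step 3: monitor
    while status not in ("successful", "failed", "dismissed"):
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still '{status}' after {timeout}s")
        time.sleep(interval)
        status = client.status(job_id)
    if status != "successful":
        raise RuntimeError(f"job {job_id} ended with status '{status}'")
    return client.results(job_id)                 # step 4: results


def build_parser():
    """Command-line options for the sketch (names are assumptions)."""
    parser = argparse.ArgumentParser(description="Run a CWL file through a process service (sketch).")
    parser.add_argument("cwl", help="path to the CWL file")
    parser.add_argument("--interval", type=float, default=1.0, help="seconds between status polls")
    parser.add_argument("--timeout", type=float, default=60.0, help="maximum seconds to wait")
    return parser
```

A command-line entry point would then parse the arguments with `build_parser().parse_args()` and pass them to `run_cwl` together with a real client in place of `FakeClient`.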

Motivation

This would serve many purposes:

edit: moved to separate issue #381

dbyrns commented 2 years ago

Given the size of Weaver, what about having a small package that could be used directly in notebooks, by Weaver internally in tests, for calling ADES, or in birdy, which uses owslib? Given the maturity of the new WPS/OGC-API process, maybe a contribution to owslib could also be evaluated?

fmigneault commented 2 years ago

I don't think there is any advantage to doing it outside Weaver, since CWL is still disputed as the "official Application Package". It would be easier to track compatibility between the features supported by Weaver and its CLI.

Furthermore, most of the operations are already done here: https://github.com/crim-ca/weaver/blob/fix-workflow-job-step-output/weaver/processes/wps3_process.py#L61-L416

We really only need to connect the deploy, execute, monitor and stage_job_results operations to a CLI. It could be worth having a "basic" install though: when installing Weaver that way, the minimal packages would only be the requests-related ones.

Definitely, for the broader scope of calling any WPS, it should be external. That component should be able to communicate with Weaver using already deployed processes, and maybe even use Weaver-CLI as an extension. Deployment itself is not core to OGC-API processes.

dbyrns commented 2 years ago

I agree, this is maybe premature, but Weaver does a lot of things that could eventually be packaged independently to encourage contributions. The CWL engine is one thing, the OGC-API processes implementation is another, and the CLI could be a third. While Weaver could still use all of them, there is a chance that, independently, they could get external traction that is out of reach right now because of the service bundle.

fmigneault commented 2 years ago

There is already a draft OGC-API Processes implementation in owslib, in which I participated (though it is not yet merged...): https://github.com/geopython/OWSLib/pull/750/files

I would propose defining an abstract class with the previously mentioned deploy, execute, monitor, etc. operations, for which Weaver would provide an implementation. Given those, it should be relatively easy for birdy or others to instantiate the desired one. Technically, OGC-API Processes services are also supposed to implement a /conformance endpoint where features/extensions such as deploy are listed. This could help in selecting the appropriate implementation.
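A rough sketch of what that abstract class and the conformance-based selection might look like. The class names, `select_client`, and the `DEPLOY_CONFORMANCE` URI are all hypothetical placeholders, not taken from Weaver, owslib, or the OGC specifications. The sketch assumes only that the /conformance response is a JSON object with a `conformsTo` list of URIs, and it takes that object as an argument so no network access is needed.

```python
from abc import ABC, abstractmethod

# Placeholder URI standing in for whatever conformance class advertises deployment.
DEPLOY_CONFORMANCE = "http://example.com/conf/deploy"  # hypothetical


class ProcessClient(ABC):
    """Operations every implementation would have to provide."""

    def __init__(self, url):
        self.url = url

    @abstractmethod
    def deploy(self, package):
        """Register an Application Package as a process and return its ID."""

    @abstractmethod
    def execute(self, process_id, inputs):
        """Submit a job for the process and return its ID."""

    @abstractmethod
    def monitor(self, job_id):
        """Return the current status of the job."""


class BasicClient(ProcessClient):
    """Client for services that do not advertise deployment (stubbed bodies)."""

    def deploy(self, package):
        raise NotImplementedError("this service does not advertise deployment")

    def execute(self, process_id, inputs):
        return "job-1"  # a real client would send an execution request here

    def monitor(self, job_id):
        return "successful"  # a real client would query the job status here


class DeployingClient(BasicClient):
    """Client for services that list the deploy extension (stubbed body)."""

    def deploy(self, package):
        return "deployed-process"  # a real client would send a deploy request here


def select_client(url, conformance):
    """Pick an implementation from the parsed /conformance response body."""
    declared = set(conformance.get("conformsTo", []))
    cls = DeployingClient if DEPLOY_CONFORMANCE in declared else BasicClient
    return cls(url)
```

With this shape, a caller such as birdy would only depend on `ProcessClient`, and the choice between implementations stays in one function driven by what the service declares.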