Closed stevencox closed 6 years ago
This is swagger for the CWL version.
This is swagger for the new GA4GH version.
Diffing them creates lots of output, as one would expect. But much of the output is trivial:
and things like that.
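One way to cut down on the trivial output is to normalize both documents before diffing. The following is a minimal sketch and not part of either repository. It assumes both Swagger files are valid JSON, collapses whitespace inside string values so that re-wrapped descriptions compare equal, and serializes with sorted keys so that key order does not matter. The function names are hypothetical.

```python
import difflib
import json
import re


def normalize(node):
    """Recursively collapse runs of whitespace in every string value."""
    if isinstance(node, dict):
        return {key: normalize(value) for key, value in node.items()}
    if isinstance(node, list):
        return [normalize(item) for item in node]
    if isinstance(node, str):
        return re.sub(r"\s+", " ", node).strip()
    return node


def canonical_lines(swagger_text):
    """Parse a Swagger JSON document and return sorted-key, indented lines."""
    document = normalize(json.loads(swagger_text))
    return json.dumps(document, indent=2, sort_keys=True).splitlines()


def semantic_diff(old_text, new_text):
    """Return a unified diff that ignores key order and line wrapping."""
    return list(difflib.unified_diff(
        canonical_lines(old_text),
        canonical_lines(new_text),
        fromfile="old",
        tofile="new",
        lineterm="",
        n=0,  # no context lines, only the changes themselves
    ))
```

Collapsing whitespace also hides intentional paragraph breaks inside descriptions, so this is suited to spotting renames and added fields rather than reviewing wording.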
I'm pasting the diff here for reference, but DiffMerge for Mac is one helpful way to visualize the changes.
2,3c2
< "swagger": "2.0",
< "basePath": "/",
---
> "basePath": "/", "swagger": "2.0",
5c4
< "title": "workflow_execution.proto",
---
> "title": "workflow_execution_service.proto",
21c20,21
< "summary": "Get information about Workflow Execution Service. May include information related (but not limited to) the workflow descriptor formats, versions supported, the WES API versions supported, and information about general the service availability.",
---
> "summary": "Get information about Workflow Execution Service. May include information related (but\nnot limited to) the workflow descriptor formats, versions supported, the WES API versions supported, and information about general the service availability.",
> "x-swagger-router-controller": "ga4gh.wes.server",
27c27
< "$ref": "#/definitions/ga4gh_wes_service_info"
---
> "$ref": "#/definitions/ga4gh_wes_ServiceInfo"
38c38,39
< "summary": "List the workflows, this endpoint will list the workflows in order of oldest to newest. There is no guarantee of live updates as the user traverses the pages, the behavior should be decided (and documented) by each implementation.",
---
> "summary": "List the workflows, this endpoint will list the workflows in order of oldest to newest.\nThere is no guarantee of live updates as the user traverses the pages, the behavior should be\ndecided (and documented) by each implementation.\nTo monitor a given execution, use GetWorkflowStatus or GetWorkflowLog.",
> "x-swagger-router-controller": "ga4gh.wes.server",
44c45
< "$ref": "#/definitions/ga4gh_wes_workflow_list_response"
---
> "$ref": "#/definitions/ga4gh_wes_WorkflowListResponse"
51c52
< "description": "OPTIONAL\nNumber of workflows to return at once. Defaults to 256, and max is 2048.",
---
> "description": "OPTIONAL\nNumber of workflows to return in a page.",
59c60
< "description": "OPTIONAL\nToken to use to indicate where to start getting results. If unspecified, returns the first page of results.",
---
> "description": "OPTIONAL\nToken to use to indicate where to start getting results. If unspecified, returns the first\npage of results.",
65,66c66,67
< "name": "key_value_search",
< "description": "OPTIONAL\nFor each key, if the key's value is empty string then match workflows that are tagged with this key regardless of value.",
---
> "name": "tag_search",
> "description": "OPTIONAL\nFor each key, if the key's value is empty string then match workflows that are tagged with\nthis key regardless of value.",
77c78,79
< "summary": "Run a workflow, this endpoint will allow you to create a new workflow request and retrieve its tracking ID to monitor its progress. An important assumption in this endpoint is that the workflow_params JSON will include parameterizations along with input and output files. The latter two may be on S3, Google object storage, local filesystems, etc. This specification makes no distinction. However, it is assumed that the submitter is using URLs that this system both understands and can access. For Amazon S3, this could be accomplished by given the credentials associated with a WES service access to a particular bucket. The details are important for a production system and user on-boarding but outside the scope of this spec.",
---
> "summary": "Run a workflow, this endpoint will allow you to create a new workflow request and\nretrieve its tracking ID to monitor its progress. An important assumption in this\nendpoint is that the workflow_params JSON will include parameterizations along with\ninput and output files. The latter two may be on S3, Google object storage, local filesystems,\netc. This specification makes no distinction. However, it is assumed that the submitter\nis using URLs that this system both understands and can access. For Amazon S3, this could\nbe accomplished by given the credentials associated with a WES service access to a\nparticular bucket. The details are important for a production system and user on-boarding\nbut outside the scope of this spec.",
> "x-swagger-router-controller": "ga4gh.wes.server",
83c85
< "$ref": "#/definitions/ga4gh_wes_workflow_run_id"
---
> "$ref": "#/definitions/ga4gh_wes_WorkflowRunId"
93c95
< "$ref": "#/definitions/ga4gh_wes_workflow_request"
---
> "$ref": "#/definitions/ga4gh_wes_WorkflowRequest"
104c106,107
< "summary": "Get detailed info about a running workflow",
---
> "summary": "Get detailed info about a running workflow.",
> "x-swagger-router-controller": "ga4gh.wes.server",
110c113
< "$ref": "#/definitions/ga4gh_wes_workflow_log"
---
> "$ref": "#/definitions/ga4gh_wes_WorkflowLog"
127c130,131
< "summary": "Cancel a running workflow",
---
> "summary": "Cancel a running workflow.",
> "x-swagger-router-controller": "ga4gh.wes.server",
133c137
< "$ref": "#/definitions/ga4gh_wes_workflow_run_id"
---
> "$ref": "#/definitions/ga4gh_wes_WorkflowRunId"
152c156,157
< "summary": "Get quick status info about a running workflow",
---
> "summary": "Get quick status info about a running workflow.",
> "x-swagger-router-controller": "ga4gh.wes.server",
158c163
< "$ref": "#/definitions/ga4gh_wes_workflow_status"
---
> "$ref": "#/definitions/ga4gh_wes_WorkflowStatus"
177c182,196
< "ga4gh_wes_log": {
---
> "ga4gh_wes_DefaultWorkflowEngineParameter": {
> "type": "object",
> "properties": {
> "type": {
> "type": "string",
> "description": "Describes the type of the parameter, e.g. float."
> },
> "default_value": {
> "type": "string",
> "description": "The stringified version of the default parameter. e.g. \"2.45\"."
> }
> },
> "description": "A message that allows one to describe default parameters for a workflow\nengine."
> },
> "ga4gh_wes_Log": {
191c210
< "startTime": {
---
> "start_time": {
195c214
< "endTime": {
---
> "end_time": {
207c226
< "exitCode": {
---
> "exit_code": {
215c234
< "ga4gh_wes_service_info": {
---
> "ga4gh_wes_ServiceInfo": {
221c240
< "$ref": "#/definitions/ga4gh_wes_workflow_type_version"
---
> "$ref": "#/definitions/ga4gh_wes_WorkflowTypeVersion"
223c242
< "title": "A map with keys as the workflow format type name (currently only CWL and WDL are used although a service may support others) and value is a workflow_type_version object which simply contains an array of one or more version strings"
---
> "title": "A map with keys as the workflow format type name (currently only CWL and WDL are used\nalthough a service may support others) and value is a workflow_type_version object which\nsimply contains an array of one or more version strings"
237c256
< "description": "The filesystem protocols supported by this service, currently these may include common protocols such as 'http', 'https', 'sftp', 's3', 'gs', 'file', 'synapse', or others as supported by this service."
---
> "description": "The filesystem protocols supported by this service, currently these may include common\nprotocols such as 'http', 'https', 'sftp', 's3', 'gs', 'file', 'synapse', or others as\nsupported by this service."
239c258
< "engine_versions": {
---
> "workflow_engine_versions": {
245a265,271
> "default_workflow_engine_parameters": {
> "type": "array",
> "items": {
> "$ref": "#/definitions/ga4gh_wes_DefaultWorkflowEngineParameter"
> },
> "description": "Each workflow engine can present additional parameters that can be sent to the\nworkflow engine. This message will list the default values, and their types for each\nworkflow engine."
> },
252c278,282
< "description": "The system statistics, key is the statistic, value is the count of workflows in that state. See the State enum for the possible keys."
---
> "description": "The system statistics, key is the statistic, value is the count of workflows in that state.\nSee the State enum for the possible keys."
> },
> "auth_instructions_url": {
> "type": "string",
> "description": "A URL that will help a in generating the tokens necessary to run a workflow using this\nservice."
254c284
< "key_values": {
---
> "tags": {
259c289
< "title": "a key-value map of arbitrary, extended metadata outside the scope of the above but useful to report back"
---
> "title": "A key-value map of arbitrary, extended metadata outside the scope of the above but useful\nto report back"
262,266c292
< "description": "."
< },
< "ga4gh_wes_service_info_request": {
< "type": "object",
< "title": "Blank request message for service request"
---
> "description": "A message containing useful information about the running service, including supported versions and\ndefault settings."
268c294
< "ga4gh_wes_state": {
---
> "ga4gh_wes_State": {
271,279c297,305
< "Unknown",
< "Queued",
< "Running",
< "Paused",
< "Complete",
< "Error",
< "SystemError",
< "Canceled",
< "Initializing"
---
> "UNKNOWN",
> "QUEUED",
> "INITIALIZING",
> "RUNNING",
> "PAUSED",
> "COMPLETE",
> "EXECUTOR_ERROR",
> "SYSTEM_ERROR",
> "CANCELED"
281,282c307,309
< "default": "Unknown",
< "title": "Enum for states"
---
> "default": "UNKNOWN",
> "description": "- UNKNOWN: The state of the task is unknown.\n\nThis provides a safe default for messages where this field is missing,\nfor example, so that a missing field does not accidentally imply that\nthe state is QUEUED.\n - QUEUED: The task is queued.\n - INITIALIZING: The task has been assigned to a worker and is currently preparing to run.\nFor example, the worker may be turning on, downloading input files, etc.\n - RUNNING: The task is running. Input files are downloaded and the first Executor\nhas been started.\n - PAUSED: The task is paused.\n\nAn implementation may have the ability to pause a task, but this is not required.\n - COMPLETE: The task has completed running. Executors have exited without error\nand output files have been successfully uploaded.\n - EXECUTOR_ERROR: The task encountered an error in one of the Executor processes. Generally,\nthis means that an Executor exited with a non-zero exit code.\n - SYSTEM_ERROR: The task was stopped due to a system error, but not from an Executor,\nfor example an upload failed due to network issues, the worker's ran out\nof disk space, etc.\n - CANCELED: The task was canceled by the user.",
> "title": "Enumeration of states for a given workflow request"
284c311
< "ga4gh_wes_workflow_desc": {
---
> "ga4gh_wes_WorkflowDesc": {
292c319
< "$ref": "#/definitions/ga4gh_wes_state",
---
> "$ref": "#/definitions/ga4gh_wes_State",
298,317c325
< "ga4gh_wes_workflow_list_request": {
< "type": "object",
< "properties": {
< "page_size": {
< "type": "integer",
< "format": "int64",
< "description": "OPTIONAL\nNumber of workflows to return at once. Defaults to 256, and max is 2048."
< },
< "page_token": {
< "type": "string",
< "description": "OPTIONAL\nToken to use to indicate where to start getting results. If unspecified, returns the first page of results."
< },
< "key_value_search": {
< "type": "string",
< "title": "OPTIONAL\nFor each key, if the key's value is empty string then match workflows that are tagged with this key regardless of value"
< }
< },
< "title": "Request listing of jobs tracked by server"
< },
< "ga4gh_wes_workflow_list_response": {
---
> "ga4gh_wes_WorkflowListResponse": {
323,324c331,333
< "$ref": "#/definitions/ga4gh_wes_workflow_desc"
< }
---
> "$ref": "#/definitions/ga4gh_wes_WorkflowDesc"
> },
> "description": "A list of workflows that the service has executed or is executing."
327c336,337
< "type": "string"
---
> "type": "string",
> "description": "A token, which when provided in a workflow_list_request, allows one to retrieve the next page\nof results."
330c340
< "title": "Return envelope for workflow listing"
---
> "description": "The service will return a workflow_list_response when receiving a successful workflow_list_request."
332c342
< "ga4gh_wes_workflow_log": {
---
> "ga4gh_wes_WorkflowLog": {
340,341c350,351
< "$ref": "#/definitions/ga4gh_wes_workflow_request",
< "title": "the original request object"
---
> "$ref": "#/definitions/ga4gh_wes_WorkflowRequest",
> "description": "The original request message used to initiate this execution."
344c354
< "$ref": "#/definitions/ga4gh_wes_state",
---
> "$ref": "#/definitions/ga4gh_wes_State",
348c358
< "$ref": "#/definitions/ga4gh_wes_log",
---
> "$ref": "#/definitions/ga4gh_wes_Log",
354c364
< "$ref": "#/definitions/ga4gh_wes_log"
---
> "$ref": "#/definitions/ga4gh_wes_Log"
364c374
< "ga4gh_wes_workflow_request": {
---
> "ga4gh_wes_WorkflowRequest": {
369c379
< "title": "OPTIONAL\nthe workflow CWL or WDL document, must provide either this or workflow_url"
---
> "description": "OPTIONAL\nThe workflow CWL or WDL document, must provide either this or workflow_url. By combining\nthis message with a workflow_type_version offered in ServiceInfo, one can initialize\nCWL, WDL, or a base64 encoded gzip of the required workflow descriptors. When files must be\ncreated in this way, the `workflow_url` should be set to the path of the main\nworkflow descriptor."
373c383
< "title": "REQUIRED\nthe workflow parameterization document (typically a JSON file), includes all parameterizations for the workflow including input and output file locations"
---
> "description": "REQUIRED\nThe workflow parameterization document (typically a JSON file), includes all parameterizations for the workflow\nincluding input and output file locations."
377c387
< "title": "REQUIRED\nthe workflow descriptor type, must be \"CWL\" or \"WDL\" currently (or another alternative supported by this WES instance)"
---
> "title": "REQUIRED\nThe workflow descriptor type, must be \"CWL\" or \"WDL\" currently (or another alternative supported by this WES instance)"
381c391,398
< "title": "REQUIRED\nthe workflow descriptor type version, must be one supported by this WES instance"
---
> "title": "REQUIRED\nThe workflow descriptor type version, must be one supported by this WES instance"
> },
> "tags": {
> "type": "object",
> "additionalProperties": {
> "type": "string"
> },
> "title": "OPTIONAL\nA key-value map of arbitrary metadata outside the scope of the workflow_params but useful to track with this workflow request"
383c400
< "key_values": {
---
> "workflow_engine_parameters": {
388c405
< "title": "OPTIONAL\na key-value map of arbitrary metadata outside the scope of the workflow_params but useful to track with this workflow request"
---
> "description": "OPTIONAL\nAdditional parameters can be sent to the workflow engine using this field. Default values\nfor these parameters are provided at the ServiceInfo endpoint."
392c409
< "title": "OPTIONAL\nthe workflow CWL or WDL document, must provide either this or workflow_descriptor"
---
> "description": "OPTIONAL\nThe workflow CWL or WDL document, must provide either this or workflow_descriptor. When a base64 encoded gzip of\nworkflow descriptor files is offered, the `workflow_url` should be set to the relative path\nof the main workflow descriptor."
395c412
< "title": "workflow request object"
---
> "description": "To execute a workflow, send a workflow request including all the details needed to begin downloading\nand executing a given workflow."
397c414
< "ga4gh_wes_workflow_run_id": {
---
> "ga4gh_wes_WorkflowRunId": {
406c423
< "ga4gh_wes_workflow_status": {
---
> "ga4gh_wes_WorkflowStatus": {
414c431
< "$ref": "#/definitions/ga4gh_wes_state",
---
> "$ref": "#/definitions/ga4gh_wes_State",
419c436
< "ga4gh_wes_workflow_type_version": {
---
> "ga4gh_wes_WorkflowTypeVersion": {
427c444
< "title": "an array of one or more version strings"
---
> "description": "an array of one or more acceptable types for the Workflow Type. For\nexample, to send a base64 encoded WDL gzip, one could would offer\n\"base64_wdl1.0_gzip\". By setting this value, and the path of the main WDL\nto be executed in the workflow_url to \"main.wdl\" in the WorkflowRequest."
430c447
< "title": "available workflow types supported by this WES"
---
> "description": "Available workflow types supported by a given instance of the service."
@stevencox See PR #2 for implementation details to upgrade to version 0.2.0.
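For orientation, the non-trivial renames visible in the diff above can be collected into lookup tables. This is only an illustrative sketch and does not describe what PR #2 does. The mapping of the old `Error` state to `EXECUTOR_ERROR` is an assumption inferred from the two enum lists, and the helper names are hypothetical.

```python
# Field renames taken from the diff (old name -> new name).
FIELD_RENAMES = {
    "startTime": "start_time",
    "endTime": "end_time",
    "exitCode": "exit_code",
    "engine_versions": "workflow_engine_versions",
    "key_values": "tags",
    "key_value_search": "tag_search",
}

# State renames taken from the two enum lists in the diff.
STATE_RENAMES = {
    "Unknown": "UNKNOWN",
    "Queued": "QUEUED",
    "Initializing": "INITIALIZING",
    "Running": "RUNNING",
    "Paused": "PAUSED",
    "Complete": "COMPLETE",
    "Error": "EXECUTOR_ERROR",  # assumption: not stated in either spec
    "SystemError": "SYSTEM_ERROR",
    "Canceled": "CANCELED",
}


def upgrade_keys(node):
    """Recursively rename old field names in a decoded JSON payload."""
    if isinstance(node, dict):
        return {FIELD_RENAMES.get(key, key): upgrade_keys(value)
                for key, value in node.items()}
    if isinstance(node, list):
        return [upgrade_keys(item) for item in node]
    return node


def upgrade_state(old_state):
    """Map an old state name to the new enum, falling back to UNKNOWN."""
    return STATE_RENAMES.get(old_state, "UNKNOWN")
```

Because `upgrade_keys` renames keys at every depth, it would also rewrite a user-supplied tag that happens to be called `startTime`. A real migration would need to skip the contents of the tag maps.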
While digging into the code I found that some optional items in the JSON spec file are not implemented.
- auth_instructions_url in ga4gh_wes_ServiceInfo is empty.
- default_workflow_engine_parameters in ga4gh_wes_ServiceInfo is empty, as I don't have anything to include here.
- page_size and page_token when listing workflows are not used.
- --opt
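For the unused page_size and page_token, one possible approach is sketched below. It is not what the service does today. It assumes the workflow IDs are available as a list ordered oldest to newest, and it uses the list offset, encoded as a string, as the page token. The function name is hypothetical, and the default of 256 is borrowed from the old description because the new spec no longer states one.

```python
def list_page(workflow_ids, page_size=None, page_token=None, default_size=256):
    """Return (page, next_page_token) for an oldest-to-newest list of IDs.

    The token is simply the offset of the next item, as a string.
    An empty token means there are no further pages.
    """
    start = int(page_token) if page_token else 0
    size = page_size if page_size and page_size > 0 else default_size
    page = workflow_ids[start:start + size]
    end = start + len(page)
    next_token = str(end) if end < len(workflow_ids) else ""
    return page, next_token
```

An offset token is easy to implement but shifts if workflows are deleted between requests. The spec leaves live-update behavior to each implementation, so that trade-off would need to be documented.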
Testing:
Using heliumdatacommons/datacommons-base:latest, run with a mounted local volume that contains the workflow-service repo and add port 8080 to the port mappings.
pip install -e /<path>/dc-workflow-service/
wes-server --opt runner=cwltool &
wes-client.
Example commands:
curl http://localhost:8080/ga4gh/wes/v1/service-info
wes-client --proto=http --host=localhost:8080 tar.cwl tar-job.yml
wes-client --proto=http --host=localhost:8080 --list
wes-client --proto=http --host=localhost:8080 --get <workflow-id>
wes-client --proto=http --host=localhost:8080 --log <workflow-id>
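The service-info check can also be scripted. The sketch below assumes a server started as described above and listening on localhost:8080. It uses only the standard library and reports which of the two optional ServiceInfo fields noted earlier are empty. The helper names are hypothetical.

```python
import json
from urllib.request import urlopen

SERVICE_INFO_URL = "http://localhost:8080/ga4gh/wes/v1/service-info"

# Optional ServiceInfo fields reported as empty in this implementation.
OPTIONAL_FIELDS = (
    "auth_instructions_url",
    "default_workflow_engine_parameters",
)


def fetch_service_info(url=SERVICE_INFO_URL):
    """GET the service-info document and decode it as JSON."""
    with urlopen(url) as response:
        return json.load(response)


def empty_optional_fields(service_info):
    """Return the optional field names that are missing or empty."""
    return [name for name in OPTIONAL_FIELDS if not service_info.get(name)]
```

With the server running, `empty_optional_fields(fetch_service_info())` lists the fields that still need values.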
The workflow tested is tar.cwl and the input file is tar-job.yml. The input file was modified to use the full file path, not just the file name (e.g. /dc-cwltool/examples/test0, not just test0).
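The path change could be automated. This is a sketch under assumptions: the contents of tar-job.yml are not shown here, so it assumes the usual CWL job layout in which file inputs are mappings with `class: File` and a `path` key. It works on the already-parsed job object (a plain dict) rather than on the YAML text. The function name is hypothetical.

```python
import posixpath


def absolutize_paths(node, base_dir):
    """Return a copy of a parsed CWL job with relative File paths made absolute.

    posixpath is used because the paths refer to locations inside the
    container, which are POSIX paths regardless of the host platform.
    """
    if isinstance(node, dict):
        updated = {key: absolutize_paths(value, base_dir)
                   for key, value in node.items()}
        path = updated.get("path")
        is_file_like = updated.get("class") in ("File", "Directory")
        if is_file_like and isinstance(path, str) and not posixpath.isabs(path):
            updated["path"] = posixpath.join(base_dir, path)
        return updated
    if isinstance(node, list):
        return [absolutize_paths(item, base_dir) for item in node]
    return node
```

With `base_dir` set to `/dc-cwltool/examples`, a file entry whose path is `test0` becomes `/dc-cwltool/examples/test0`. Paths that are already absolute are left unchanged.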
Evaluate the latest GA4GH WES specification, ask questions, and provide feedback. Also adapt the functionality of the existing Helium WES implementation to the new WES specification.
Helium is committed to supporting three GA4GH APIs in our MVP: TRS, WES, and DOS. We're nearing agreement on a plan where we evaluate the new spec for about a month, then commit to making only backward-compatible changes to the spec. This is to ensure a baseline level of cross-Full Stack interoperability.