OrcaBus / service-icav2-wes-manager

Run analyses through ICAv2-backed WES interface
MIT License

Service ICAv2 WES Manager

Overview

Submit jobs / events via the WES API. We'll handle the rest of the 'icav2' drama for you!

Events Overview

Essentially, we handle ICAv2 requests on an internal event bus, but receive requests for the WES API from the external event bus.

We also send back 'important' state change events to the external event bus.

Status Enum

ICAv2 state change events comprise the following list of statuses:

* REQUESTED
* QUEUED
* INITIALIZING
* PREPARING_INPUTS
* IN_PROGRESS
* GENERATING_OUTPUTS
* AWAITING_INPUT
* ABORTING
* SUCCEEDED
* FAILED
* FAILED_FINAL
* ABORTED

This is a lot and floods our external event bus. We trim this down and map these to the equivalent states in AWS Batch, although we keep the 'ABORTED' status as is.

* SUBMITTED: On POST request from the WES API.
* PENDING: In the WES API queue (:construction: not yet implemented; will be added in the future when we add the queue system).
* RUNNABLE: The Step Function to run the analysis has been triggered.
* STARTING: The event from ICAv2 has been parsed and the process has been registered on ICAv2 (renamed from INITIALIZING).
* RUNNING (renamed from IN_PROGRESS)
* SUCCEEDED: The analysis has completed successfully.
* FAILED: The analysis has failed.
* ABORTED: The analysis has been aborted.
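The trimming described above can be pictured as a lookup table. The following is a minimal Python sketch, not the service's actual code: `map_icav2_status` is a hypothetical helper, and only the renames and pass-throughs documented above are included. Treating every other ICAv2 status as "not forwarded" is an assumption made here for illustration.

```python
from typing import Optional

# Renames and pass-throughs taken from the two status lists above.
# Assumption: any ICAv2 status not listed here is not forwarded externally.
ICAV2_TO_WES_STATUS = {
    "INITIALIZING": "STARTING",
    "IN_PROGRESS": "RUNNING",
    "SUCCEEDED": "SUCCEEDED",
    "FAILED": "FAILED",
    "ABORTED": "ABORTED",
}


def map_icav2_status(icav2_status: str) -> Optional[str]:
    """Return the external WES status, or None if the event is dropped."""
    return ICAV2_TO_WES_STATUS.get(icav2_status.upper())


if __name__ == "__main__":
    for status in ("INITIALIZING", "PREPARING_INPUTS", "IN_PROGRESS", "ABORTED"):
        print(status, "->", map_icav2_status(status))
```

SUBMITTED, PENDING, and RUNNABLE do not appear in the table because, per the list above, they are set by the WES API itself rather than derived from an ICAv2 event.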

Events Overview

WES State Change Requests

```json5
{
  "DetailType": "Icav2WesAnalysisStateChange",
  "source": "orcabus.icav2wesmanager",
  "account": "843407916570",
  "time": "2025-05-28T03:54:35Z",
  "region": "ap-southeast-2",
  "resources": [],
  "detail": {
    "id": "iwa.01JWAGE5PWS5JN48VWNPYSTJRN",
    "name": "bclconvert-interop-qc",
    "inputs": {
      "bclconvert_report_directory": {
        "class": "Directory",
        "location": "s3://pipeline-dev-cache-503977275616-ap-southeast-2/byob-icav2/development/primary/20231010_pi1-07_0329_A222N7LTD3/202504179cac7411/Reports/"
      },
      "interop_directory": {
        "class": "Directory",
        "location": "s3://pipeline-dev-cache-503977275616-ap-southeast-2/byob-icav2/development/primary/20231010_pi1-07_0329_A222N7LTD3/202504179cac7411/InterOp/"
      },
      "instrument_run_id": "20231010_pi1-07_0329_A222N7LTD3"
    },
    "engineParameters": {
      "pipelineId": "55a8bb47-d32b-48dd-9eac-373fd487ccec",
      "projectId": "ea19a3f5-ec7c-4940-a474-c31cd91dbad4",
      "outputUri": "s3://pipeline-dev-cache-503977275616-ap-southeast-2/byob-icav2/development/test_data/bclconvert-interop-qc-test/",
      "logsUri": "s3://pipeline-dev-cache-503977275616-ap-southeast-2/byob-icav2/development/test_data/logs/bclconvert-interop-qc-test/"
    },
    "tags": {
      "instrument_run_id": "20231010_pi1-07_0329_A222N7LTD3"
    },
    "status": "SUBMITTED",
    "submissionTime": "2025-05-28T03:54:35.612655",
    "stepsLaunchExecutionArn": "arn:aws:states:ap-southeast-2:843407916570:execution:icav2-wes-launchIcav2Analysis:3f176fc2-d8e0-4bd5-8d2f-f625d16f6bf6",
    "icav2AnalysisId": null,
    "startTime": "2025-05-28T03:54:35.662401+00:00",
    "endTime": null
  }
}
```

Once an analysis has launched on ICAv2, we will forward SQS events for the statuses in the ICAv2 WES analysis status enum list. We will also populate the analysis ID in the `icav2AnalysisId` field once the analysis has been launched on ICAv2.

WES API Overview

We support the following endpoints

GET

POST

PATCH

WES POST

The WES POST endpoint is used to submit a new analysis job.

The request body should contain the following keys:

```json5
{
  // The unique analysis name
  "name": "bclconvert-interop-qc",
  // The inputs to the analysis
  "inputs": {
    "bclconvert_report_directory": {
      "class": "Directory",
      "location": "s3://pipeline-dev-cache-503977275616-ap-southeast-2/byob-icav2/development/primary/20231010_pi1-07_0329_A222N7LTD3/202504179cac7411/Reports/"
    },
    "interop_directory": {
      "class": "Directory",
      "location": "s3://pipeline-dev-cache-503977275616-ap-southeast-2/byob-icav2/development/primary/20231010_pi1-07_0329_A222N7LTD3/202504179cac7411/InterOp/"
    },
    "instrument_run_id": "20231010_pi1-07_0329_A222N7LTD3"
  },
  // The engine parameters for the analysis
  "engineParameters": {
    // The ICAv2 pipeline id to run
    "pipelineId": "55a8bb47-d32b-48dd-9eac-373fd487ccec",
    // The ICAv2 project id to run the analysis in
    "projectId": "ea19a3f5-ec7c-4940-a474-c31cd91dbad4",
    // The output location to store the results of the analysis
    "outputUri": "s3://pipeline-dev-cache-503977275616-ap-southeast-2/byob-icav2/development/test_data/bclconvert-interop-qc-test/",
    // The location to store the logs of the analysis
    "logsUri": "s3://pipeline-dev-cache-503977275616-ap-southeast-2/byob-icav2/development/test_data/logs/bclconvert-interop-qc-test/"
  },
  // Any tags to add to the analysis job (helpful for finding the analysis job later)
  "tags": {
    "instrument_run_id": "20231010_pi1-07_0329_A222N7LTD3"
  },
  "status": "SUBMITTED"
}
```

To run this over the WES API, you can use the following curl command:

curl \
  --silent --show-error --location --fail \
  --request "POST" \
  --header "Accept: application/json" \
  --header "Authorization: Bearer ${ORCABUS_TOKEN}" \
  --header "Content-Type: application/json" \
  --data "$( \
    jq --null-input --raw-output \
      '
        {
          "name": "bclconvert-interop-qc--20231010_pi1-07_0329_A222N7LTD3",
          "inputs": {
            "bclconvert_report_directory": {
              "class": "Directory",
              "location": "s3://pipeline-dev-cache-503977275616-ap-southeast-2/byob-icav2/development/primary/20231010_pi1-07_0329_A222N7LTD3/202504179cac7411/Reports/"
            },
            "interop_directory": {
              "class": "Directory",
              "location": "s3://pipeline-dev-cache-503977275616-ap-southeast-2/byob-icav2/development/primary/20231010_pi1-07_0329_A222N7LTD3/202504179cac7411/InterOp/"
            },
            "instrument_run_id": "20231010_pi1-07_0329_A222N7LTD3"
          },
          "engineParameters": {
            "pipelineId": "55a8bb47-d32b-48dd-9eac-373fd487ccec",
            "projectId": "ea19a3f5-ec7c-4940-a474-c31cd91dbad4",
            "outputUri": "s3://pipeline-dev-cache-503977275616-ap-southeast-2/byob-icav2/development/test_data/bclconvert-interop-qc-test/",
            "logsUri": "s3://pipeline-dev-cache-503977275616-ap-southeast-2/byob-icav2/development/test_data/logs/bclconvert-interop-qc-test/"
          },
          "tags": {
            "instrument_run_id": "20231010_pi1-07_0329_A222N7LTD3"
          }
        }
      ' \
  )" \
  --url "https://icav2-wes.dev.umccr.org/api/v1/analysis/"

You can also use an event bus to submit the analysis job, which will be handled by the WES API.

{
  "EventBusName": "OrcaBusMain",
  "DetailType": "Icav2WesAnalysisRequest",
  "Source": "your source",
  "Detail": {
    // The same as the POST request body above as a json body
  }
}
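As a rough illustration of the event route, the sketch below builds one entry in the shape that the EventBridge `PutEvents` API accepts. The helper name `build_request_entry` and the source value `my.example.service` are hypothetical; the bus name and detail type are taken from the example above. Note that `PutEvents` expects `Detail` to be a JSON-encoded string rather than a nested object.

```python
import json


def build_request_entry(request_body: dict, source: str) -> dict:
    """Build one EventBridge PutEvents entry for an analysis request."""
    return {
        "EventBusName": "OrcaBusMain",
        "DetailType": "Icav2WesAnalysisRequest",
        "Source": source,  # hypothetical value supplied by the caller
        # PutEvents takes Detail as a JSON string, not a dict.
        "Detail": json.dumps(request_body),
    }


if __name__ == "__main__":
    # Same keys as the POST request body shown earlier.
    body = {
        "name": "bclconvert-interop-qc--20231010_pi1-07_0329_A222N7LTD3",
        "inputs": {"instrument_run_id": "20231010_pi1-07_0329_A222N7LTD3"},
        "engineParameters": {
            "pipelineId": "55a8bb47-d32b-48dd-9eac-373fd487ccec",
            "projectId": "ea19a3f5-ec7c-4940-a474-c31cd91dbad4",
        },
        "tags": {"instrument_run_id": "20231010_pi1-07_0329_A222N7LTD3"},
    }
    entry = build_request_entry(body, source="my.example.service")
    print(json.dumps(entry, indent=2))
    # With boto3 installed and AWS credentials configured, the entry could be sent with:
    #   boto3.client("events").put_events(Entries=[entry])
```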

WES GET

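GET responses contain the same information as the POST request body, plus additional keys such as `id`, `submissionTime`, `stepsLaunchExecutionArn`, `icav2AnalysisId`, `startTime`, and `endTime` (see the example responses below).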

To keep compatibility with both CWL and Nextflow, we do not use output JSONs as available in CWL. Instead, we expect all data and metadata to be available in the analysis job output location.

You can retrieve the analysis job by name or id.

You can also retrieve all analysis jobs by using the GET /api/v1/analyses/ endpoint.

By Name

```bash
curl \
  --silent --show-error --location --fail \
  --request "GET" \
  --header "Accept: application/json" \
  --header "Authorization: Bearer ${ORCABUS_TOKEN}" \
  --url "https://icav2-wes.dev.umccr.org/api/v1/analysis?name=bclconvert-interop-qc--20231010_pi1-07_0329_A222N7LTD3"
```

This returns the following response in paginated format:

```json
{
  "links": {
    "previous": null,
    "next": null
  },
  "pagination": {
    "page": 1,
    "rowsPerPage": 100,
    "count": 1
  },
  "results": [
    {
      "id": "iwa.01JWAGE5PWS5JN48VWNPYSTJRN",
      "name": "bclconvert-interop-qc--20231010_pi1-07_0329_A222N7LTD3",
      "inputs": {
        "bclconvert_report_directory": {
          "class": "Directory",
          "location": "s3://pipeline-dev-cache-503977275616-ap-southeast-2/byob-icav2/development/primary/20231010_pi1-07_0329_A222N7LTD3/202504179cac7411/Reports/"
        },
        "instrument_run_id": "20231010_pi1-07_0329_A222N7LTD3",
        "interop_directory": {
          "class": "Directory",
          "location": "s3://pipeline-dev-cache-503977275616-ap-southeast-2/byob-icav2/development/primary/20231010_pi1-07_0329_A222N7LTD3/202504179cac7411/InterOp/"
        }
      },
      "engineParameters": {
        "pipelineId": "55a8bb47-d32b-48dd-9eac-373fd487ccec",
        "projectId": "ea19a3f5-ec7c-4940-a474-c31cd91dbad4",
        "outputUri": "s3://pipeline-dev-cache-503977275616-ap-southeast-2/byob-icav2/development/test_data/bclconvert-interop-qc-test/",
        "logsUri": "s3://pipeline-dev-cache-503977275616-ap-southeast-2/byob-icav2/development/test_data/logs/bclconvert-interop-qc-test/"
      },
      "tags": {
        "instrument_run_id": "20231010_pi1-07_0329_A222N7LTD3"
      },
      "status": "SUCCEEDED",
      "submissionTime": "2025-05-28T03:54:35.612655",
      "stepsLaunchExecutionArn": "arn:aws:states:ap-southeast-2:843407916570:execution:icav2-wes-launchIcav2Analysis:3f176fc2-d8e0-4bd5-8d2f-f625d16f6bf6",
      "icav2AnalysisId": "b7157552-74a1-4ff4-a6b3-b37a85a485cf",
      "startTime": "2025-05-28T03:54:35.662401Z",
      "endTime": "2025-05-28T04:32:26.456422Z"
    }
  ]
}
```
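A caller usually wants the single analysis inside the paginated wrapper and a quick check of whether it has finished. The sketch below shows one way to do that in Python. The helper names are hypothetical, and treating SUCCEEDED, FAILED, and ABORTED as the terminal statuses is an assumption inferred from the status list earlier in this README.

```python
from typing import Optional

# Assumption: an analysis in one of these statuses will not change again.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "ABORTED"}


def first_result(page: dict) -> Optional[dict]:
    """Return the first analysis in a paginated response, or None if empty."""
    results = page.get("results", [])
    return results[0] if results else None


def is_finished(analysis: dict) -> bool:
    """True when the analysis status is one of the assumed terminal statuses."""
    return analysis.get("status") in TERMINAL_STATUSES


if __name__ == "__main__":
    # Trimmed-down copy of the paginated response shape shown above.
    page = {
        "links": {"previous": None, "next": None},
        "pagination": {"page": 1, "rowsPerPage": 100, "count": 1},
        "results": [
            {"id": "iwa.01JWAGE5PWS5JN48VWNPYSTJRN", "status": "SUCCEEDED"}
        ],
    }
    analysis = first_result(page)
    if analysis is not None:
        print(analysis["id"], "finished:", is_finished(analysis))
```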

By Id

Alternatively, you can retrieve the analysis job by id by appending the id to the endpoint.

```shell
curl \
  --silent --show-error --location --fail \
  --request "GET" \
  --header "Accept: application/json" \
  --header "Authorization: Bearer ${ORCABUS_TOKEN}" \
  --url "https://icav2-wes.dev.umccr.org/api/v1/analysis/iwa.01JWAGE5PWS5JN48VWNPYSTJRN"
```

This returns the same analysis record as above, but without the pagination wrapper:

```json
{
  "id": "iwa.01JWAGE5PWS5JN48VWNPYSTJRN",
  "name": "bclconvert-interop-qc--20231010_pi1-07_0329_A222N7LTD3",
  "inputs": {
    "bclconvert_report_directory": {
      "class": "Directory",
      "location": "s3://pipeline-dev-cache-503977275616-ap-southeast-2/byob-icav2/development/primary/20231010_pi1-07_0329_A222N7LTD3/202504179cac7411/Reports/"
    },
    "instrument_run_id": "20231010_pi1-07_0329_A222N7LTD3",
    "interop_directory": {
      "class": "Directory",
      "location": "s3://pipeline-dev-cache-503977275616-ap-southeast-2/byob-icav2/development/primary/20231010_pi1-07_0329_A222N7LTD3/202504179cac7411/InterOp/"
    }
  },
  "engineParameters": {
    "pipelineId": "55a8bb47-d32b-48dd-9eac-373fd487ccec",
    "projectId": "ea19a3f5-ec7c-4940-a474-c31cd91dbad4",
    "outputUri": "s3://pipeline-dev-cache-503977275616-ap-southeast-2/byob-icav2/development/test_data/bclconvert-interop-qc-test/",
    "logsUri": "s3://pipeline-dev-cache-503977275616-ap-southeast-2/byob-icav2/development/test_data/logs/bclconvert-interop-qc-test/"
  },
  "tags": {
    "instrument_run_id": "20231010_pi1-07_0329_A222N7LTD3"
  },
  "status": "SUCCEEDED",
  "submissionTime": "2025-05-28T03:54:35.612655",
  "stepsLaunchExecutionArn": "arn:aws:states:ap-southeast-2:843407916570:execution:icav2-wes-launchIcav2Analysis:3f176fc2-d8e0-4bd5-8d2f-f625d16f6bf6",
  "icav2AnalysisId": "b7157552-74a1-4ff4-a6b3-b37a85a485cf",
  "startTime": "2025-05-28T03:54:35.662401Z",
  "endTime": "2025-05-28T04:32:26.456422Z"
}
```
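For callers that prefer Python to curl, the by-id request can be sketched with the standard library alone. `build_get_request` is a hypothetical helper; the base URL and headers mirror the curl command above, and the token is read from the same `ORCABUS_TOKEN` environment variable.

```python
import os
import urllib.request

# Base URL copied from the curl examples above.
BASE_URL = "https://icav2-wes.dev.umccr.org/api/v1/analysis"


def build_get_request(analysis_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for a single analysis by id."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{analysis_id}",
        method="GET",
        headers={
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


if __name__ == "__main__":
    request = build_get_request(
        "iwa.01JWAGE5PWS5JN48VWNPYSTJRN",
        os.environ.get("ORCABUS_TOKEN", ""),
    )
    print(request.get_method(), request.full_url)
    # To actually send it (requires network access and a valid token):
    #   import json
    #   with urllib.request.urlopen(request) as response:
    #       analysis = json.load(response)
```

Building the request separately from sending it keeps the URL and header logic easy to check without network access.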

Step Functions Overview

Submit job to ICA

This will become more complex as we add queueing and priority support to the WES API, along with JSON Schema validation steps and pre-bundling support.

Retrieve ICAv2 Analysis State Change Event

This step function is triggered by an ICAv2 analysis state change event.

Abort ICAv2 Analysis

Handles an abort request from the WES API.

Although this step function only wraps a single Lambda, placing it in a Step Function lets us configure retries with a 60-second interval and a maximum of 5 attempts.
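In Amazon States Language terms, that retry behaviour corresponds to a `Retry` block on the task state. The Python sketch below builds such a block as a plain dictionary. Only `IntervalSeconds` and `MaxAttempts` come from the text above; the `ErrorEquals` and `BackoffRate` values, the task definition, and the helper name `with_retry` are assumptions for illustration and may differ from the deployed state machine.

```python
import json

# IntervalSeconds and MaxAttempts come from the text above.
# ErrorEquals and BackoffRate are assumptions for illustration.
ABORT_RETRY_POLICY = {
    "ErrorEquals": ["States.ALL"],
    "IntervalSeconds": 60,
    "MaxAttempts": 5,
    "BackoffRate": 1.0,
}


def with_retry(task_state: dict, retry_policy: dict) -> dict:
    """Return a copy of an ASL Task state with the retry policy attached."""
    return {**task_state, "Retry": [retry_policy]}


if __name__ == "__main__":
    # Hypothetical task state that invokes the abort Lambda.
    task = {
        "Type": "Task",
        "Resource": "arn:aws:states:::lambda:invoke",
        "End": True,
    }
    print(json.dumps(with_retry(task, ABORT_RETRY_POLICY), indent=2))
```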

:construction: EVERYTHING BELOW HERE :construction:

Project Structure

The project is organized into the following key directories:

Setup

Requirements

node --version
v22.9.0

# Update Corepack (if necessary, as per pnpm documentation)
npm install --global corepack@latest

# Enable Corepack to use pnpm
corepack enable pnpm

Install Dependencies

To install all required dependencies, run:

make install

CDK Commands

You can access CDK commands using the pnpm wrapper script.

This template provides two types of CDK entry points: cdk-stateless and cdk-stateful.

The type of stack to deploy is determined by the context set in the ./bin/deploy.ts file. This ensures the correct stack is executed based on the provided context.

For example:

# Deploy a stateless stack
pnpm cdk-stateless deploy OrcaBusStatelessICAv2WesStack/Icav2WesManagerStatelessDeploymentPipeline/OrcaBusBeta/Icav2WesManagerStatelessDeployStack

# Deploy a stateful stack
pnpm cdk-stateful deploy OrcabusStatefulICAv2WesStack/Icav2WesManagerStatefulDeployPipeline/OrcaBusBeta/Icav2WesManagerStatefulDeployStack

Stacks

This CDK project manages multiple stacks. The root stack (the only one that does not include DeploymentPipeline in its stack ID) is deployed in the toolchain account and sets up a CodePipeline for cross-environment deployments to beta, gamma, and prod.

To list all available stacks, run:

pnpm cdk-stateless ls

Example output:

OrcaBusStatelessICAv2WesStack
OrcaBusStatelessICAv2WesStack/DeploymentPipeline/OrcaBusBeta/DeployStack (OrcaBusBeta-DeployStack)
OrcaBusStatelessICAv2WesStack/DeploymentPipeline/OrcaBusGamma/DeployStack (OrcaBusGamma-DeployStack)
OrcaBusStatelessICAv2WesStack/DeploymentPipeline/OrcaBusProd/DeployStack (OrcaBusProd-DeployStack)

Linting and Formatting

Run Checks

To run linting and formatting checks on the root project, use:

make check

Fix Issues

To automatically fix issues with ESLint and Prettier, run:

make fix

Road map :construction:

Scheduling support

Support the following enums

Data-to-compute support

We may also look at storage credentials options in a later release of wrapica.

JSON Schema validation support

Pipeline endpoint support