UltimaGenomics repository for workflows compatible with AWS HealthOmics
Ultima Genomics offers pipelines as Ready2Run workflows on AWS HealthOmics. Ready2Run workflows let you run these pipelines on AWS HealthOmics by simply bringing your data. For more flexibility, such as using larger files or changing the reference genome, you can convert Ready2Run workflows to private workflows by following the steps in this repository. After conversion, the cost of running the workflow is based on the compute and run storage used by the private workflow run.
Ultima Genomics also shares pipelines that have been modified to run as private workflows on AWS HealthOmics in this repository. You can follow the directions in this repository to create and run a private workflow on AWS HealthOmics.
Each workflow folder contains the following:
The instructions below cover localizing resources, deploying the workflow, and creating a run.
For further questions about these workflows, please contact healtomics.support@ultimagen.com.
i. Prerequisites:
Pull and push the required public containers to your private ECR by following the steps:
a. Pull from Docker Hub or the Broad GCR and push to your private ECR:
docker pull <hub_username>/<image_name>:<tag> # the Docker image as it appears in globals.wdl
docker tag <hub_username>/<image_name>:<tag> <your_aws_account_id>.dkr.ecr.<region>.amazonaws.com/<repository_name>:<tag>
aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <your_aws_account_id>.dkr.ecr.<region>.amazonaws.com
docker push <your_aws_account_id>.dkr.ecr.<region>.amazonaws.com/<repository_name>:<tag> # if the repository doesn't exist, create it first
b. Grant AWS HealthOmics permission to access your private ECR by following the instructions here.
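Steps a and b above can be wrapped in a few shell functions. This is a minimal sketch, not part of this repository: the function names (`ecr_image_uri`, `mirror_image`, `omics_ecr_policy`, `grant_omics_access`) are hypothetical, and the repository policy reflects our understanding of the AWS documentation, so verify it against the linked instructions before use.

```bash
# Hypothetical helpers; names are illustrative and not part of this repository.

# Build the private ECR image URI. Args: <account_id> <region> <repository_name> <tag>
ecr_image_uri() {
  printf '%s.dkr.ecr.%s.amazonaws.com/%s:%s\n' "$1" "$2" "$3" "$4"
}

# Pull a public image, create the ECR repository if it is missing, then tag and push.
# Args: <source_image> <account_id> <region> <repository_name> <tag>
mirror_image() (
  set -e
  src="$1"; account="$2"; region="$3"; repo="$4"; tag="$5"
  registry="${account}.dkr.ecr.${region}.amazonaws.com"
  target="$(ecr_image_uri "$account" "$region" "$repo" "$tag")"

  # describe-repositories fails when the repository does not exist; create it in that case.
  aws ecr describe-repositories --repository-names "$repo" --region "$region" >/dev/null 2>&1 ||
    aws ecr create-repository --repository-name "$repo" --region "$region" >/dev/null

  aws ecr get-login-password --region "$region" |
    docker login --username AWS --password-stdin "$registry"
  docker pull "$src"
  docker tag "$src" "$target"
  docker push "$target"
)

# Print a repository policy that lets the HealthOmics service pull images.
# Assumption: this action list matches the linked AWS instructions.
omics_ecr_policy() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "OmicsWorkflowAccess",
      "Effect": "Allow",
      "Principal": { "Service": "omics.amazonaws.com" },
      "Action": [
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "ecr:BatchCheckLayerAvailability"
      ]
    }
  ]
}
EOF
}

# Attach the policy to one repository. Args: <repository_name> <region>
grant_omics_access() {
  aws ecr set-repository-policy --repository-name "$1" --region "$2" \
    --policy-text "$(omics_ecr_policy)"
}
```

A call would look like `mirror_image <hub_username>/<image_name>:<tag> <your_aws_account_id> <region> <repository_name> <tag>`, followed by `grant_omics_access <repository_name> <region>`, repeated for each image listed in globals.wdl.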
Import your input files into an S3 bucket.
Create an OmicsService role to access your resources by following the instructions here.
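The last two prerequisites can also be done from the CLI. The sketch below is an illustration under assumptions: the function names are hypothetical, and it only creates the role with a trust policy for the HealthOmics service. The permissions the role needs (S3, ECR, logs) still have to be attached by following the linked instructions.

```bash
# Hypothetical helpers; names are illustrative and not part of this repository.

# Upload a local input directory to S3. Args: <local_dir> <s3_uri>
upload_inputs() {
  aws s3 sync "$1" "$2"
}

# Print a trust policy that lets the HealthOmics service assume the role.
omics_trust_policy() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "omics.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Create the service role and print its ARN. Args: <role_name>
# Permission policies are NOT attached here; see the linked instructions.
create_omics_role() {
  aws iam create-role --role-name "$1" \
    --assume-role-policy-document "$(omics_trust_policy)" \
    --query 'Role.Arn' --output text
}
```

The ARN printed by `create_omics_role` is the value later passed as `--role-arn` when starting a run.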
ii. Download the workflow folder as a zipped file. This should include the main WDL file in the top-level folder, tasks folders and
iii. Download the parameter template for your desired use case from the input_templates folder.
iv. Modify and save the workflow scripts and parameter templates to meet your needs:
Once the workflow resources have been deployed locally (see the instructions for each workflow), you can create a private workflow on AWS HealthOmics.
i. From the CLI:
# If there is more than one WDL file, the main one is the one named after the directory.
$ aws omics create-workflow \
--name <workflow_name> \
--main <main_wdl_file> \
--definition-zip fileb://<path_to_local_zip> \
--parameter-template file://<path_to_parameters_definition_json> \
--accelerators GPU
ii. From the console:
a. Click on **Private Workflows** from the left pane.
b. Click on **Create Workflow** on the Workflows list.
c. Follow the instructions on the console to create your workflow.
- Set "Main workflow definition file path" to the <workflow_name>.wdl file
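For the CLI route, it can help to script the packaging and creation so that the workflow ID is captured for the later run step. The following is a sketch under assumptions: the function names are hypothetical, `zip` is assumed to be installed, and the accelerator argument is optional since not every workflow may need `--accelerators GPU`.

```bash
# Hypothetical helpers; names are illustrative and not part of this repository.

# Zip a workflow folder so the main WDL file sits at the top level of the archive.
# Args: <workflow_dir> <output_zip>
package_workflow() (
  set -e
  # Resolve the output path before changing directory.
  out="$(cd "$(dirname "$2")" && pwd)/$(basename "$2")"
  cd "$1"
  zip -qr "$out" .
)

# Create the private workflow and print its ID.
# Args: <workflow_name> <main_wdl_file> <zip_path> <parameter_template_json> [accelerators]
create_private_workflow() (
  name="$1"; main="$2"; zip_path="$3"; template="$4"; accel="${5:-}"
  set -- omics create-workflow \
    --name "$name" --main "$main" \
    --definition-zip "fileb://${zip_path}" \
    --parameter-template "file://${template}"
  # Only request accelerators when the caller asks for them (for example GPU).
  if [ -n "$accel" ]; then set -- "$@" --accelerators "$accel"; fi
  aws "$@" --query 'id' --output text
)

# Block until the workflow becomes ACTIVE. Args: <workflow_id>
wait_for_workflow() {
  aws omics wait workflow-active --id "$1"
}
```

Typical use is `workflow_id="$(create_private_workflow <workflow_name> <main_wdl_file> <path_to_local_zip> <path_to_parameters_definition_json> GPU)"` followed by `wait_for_workflow "$workflow_id"`.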
i. From the CLI:
$ aws omics start-run \
--workflow-id <workflow_id> \
--role-arn <service_role_arn> \
--output-uri <s3_uri_for_output_folder> \
--parameters file://<path_to_local_parameters_file> \
--name <run_name> \
--retention-mode REMOVE
ii. From the console (the current HealthOmics console does not work well with WDL scoped parameters, so the CLI is preferred):
a. Click **Private Workflows** from the left pane.
b. Click the **Workflow ID** from the Workflows list.
c. Click **Create Run** and enter the run information.
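When starting runs from the CLI, a small wrapper can capture the run ID and poll until the run finishes. This is a minimal sketch with hypothetical function names. It polls `aws omics get-run` on an interval (60 seconds by default, an arbitrary choice) and treats `COMPLETED` as success and `FAILED`, `CANCELLED`, or `DELETED` as failure.

```bash
# Hypothetical helpers; names are illustrative and not part of this repository.

# Start a run and print its ID.
# Args: <workflow_id> <service_role_arn> <s3_output_uri> <parameters_json> <run_name>
start_run() {
  aws omics start-run \
    --workflow-id "$1" \
    --role-arn "$2" \
    --output-uri "$3" \
    --parameters "file://$4" \
    --name "$5" \
    --retention-mode REMOVE \
    --query 'id' --output text
}

# Print the current status of a run. Args: <run_id>
run_status() {
  aws omics get-run --id "$1" --query 'status' --output text
}

# Poll until the run reaches a terminal state, then print that state.
# Exit code: 0 for COMPLETED, 1 for FAILED/CANCELLED/DELETED, 2 if the status call fails.
# Args: <run_id> [poll_interval_seconds]
wait_for_run() (
  run_id="$1"; interval="${2:-60}"
  while :; do
    status="$(run_status "$run_id")" || exit 2
    case "$status" in
      COMPLETED) echo "$status"; exit 0 ;;
      FAILED|CANCELLED|DELETED) echo "$status"; exit 1 ;;
    esac
    sleep "$interval"
  done
)
```

For example, `run_id="$(start_run <workflow_id> <service_role_arn> <s3_uri_for_output_folder> <path_to_local_parameters_file> <run_name>)"` followed by `wait_for_run "$run_id"`.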
If a private workflow run fails, you can use this script to extract information and logs from the AWS HealthOmics run to make debugging easier. Please attach the tar file generated by the script to any support request.
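For a quick first look before running that script, the AWS CLI can already show why a run failed and which tasks were affected. The sketch below is not the script referenced above and does not produce a tar file. The function names are hypothetical, and the queried fields are the ones we understand `get-run` and `list-run-tasks` to return.

```bash
# Hypothetical helpers; names are illustrative and not part of this repository.

# Show the status, status message and failure reason of a run. Args: <run_id>
run_failure_summary() {
  aws omics get-run --id "$1" \
    --query '{status: status, message: statusMessage, reason: failureReason}' \
    --output json
}

# List the ID, name and status message of every failed task in a run. Args: <run_id>
failed_tasks() {
  aws omics list-run-tasks --id "$1" \
    --query "items[?status=='FAILED'].[taskId, name, statusMessage]" \
    --output text
}
```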