nextflow-io / nf-hack17

Nextflow hackathon 2017 projects

Project 11: AWS Batch integration #11

Open pditommaso opened 7 years ago

pditommaso commented 7 years ago

AWS Batch integration

Nextflow has experimental support for AWS Batch. The goal of this project is to stabilise the current implementation, add missing features and make it able to process real-world pipelines.

Data:

(to be provided)

Computing resources:

(to be provided)

Project Lead:

Francesco Strozzi (@fstrozzi)

fstrozzi commented 6 years ago
tdudgeon commented 6 years ago

The key benefit of using AWS is that it provides a simple mechanism to run workflows in an environment that autoscales. At the extreme you can have zero permanently running instances (and almost zero cost), starting instances on demand when there are jobs to execute, while still having highly elastic compute capacity that can handle large jobs. You also get good control over limiting costs (including the use of spot instances).

The obvious downside is that you are tied to using AWS. The autoscaling capabilities can in principle be replicated in other systems, but doing so is relatively hard work.

We went over the overall process for executing Nextflow jobs on AWS Batch. The code is currently on the aws-batch branch. In doing so we identified a significant impediment: the AWS command line tools need to be present in every Docker container that is used. Managing alternative versions of lots of Docker images is pretty much a non-starter for most organisations (especially as these tools have complex dependencies such as Python), so we looked for an alternative solution to this.

The AWS CLI is needed for two reasons (roughly sketched after this list):

  1. to copy the bash script for each NF process from S3 into the container so that it can be executed (aws s3 ...)
  2. to copy any necessary input files from S3 into the container, and result files back to S3
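
Put concretely, each Batch job effectively has to run something along these lines (a rough, hedged sketch; the bucket name, work-dir hash and file names are illustrative, not taken from the branch):

```bash
WORKDIR=s3://my-bucket/work/ab/12cd34ef   # illustrative S3 work dir

# (1) stage the per-task wrapper script from the S3 work dir
aws s3 cp "$WORKDIR/.command.sh" .command.sh

# (2) stage inputs, run the task, then copy results back to S3
aws s3 cp "$WORKDIR/input.fa" input.fa
bash .command.sh
aws s3 cp result.txt "$WORKDIR/result.txt"
```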

The approach identified involves installing the AWS command line tools on the AWS image that is used as the Docker host machine, and mounting the directory that contains the necessary items into each of the Docker containers as a volume. This is a bit hacky, and how portable it is needs checking, but initial testing shows that it works.
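
As a rough illustration of the idea (the install path and image name are hypothetical, and a real setup would wire the mount through the Batch job definition rather than a manual docker run):

```bash
# A self-contained AWS CLI is installed once on the Docker host (the ECS
# instance used by Batch) and bind-mounted read-only into each task
# container, so the task image itself needs neither the CLI nor Python.
docker run \
  --volume /home/ec2-user/aws-cli:/home/ec2-user/aws-cli:ro \
  my-task-image \
  /home/ec2-user/aws-cli/bin/aws --version
```

The important detail is that the mounted directory must bundle all of the CLI's dependencies, since the task image cannot be assumed to provide them.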

If using this approach you will need to specify the aws_cli parameter as part of the batch executor definition to point to the location where the CLI is installed. If this is not specified then it is assumed that aws is on the PATH of the container.
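
For example, something along these lines in nextflow.config (a hedged sketch: the exact scope and option name on the aws-batch branch may differ, and the path is illustrative):

```groovy
process.executor = 'awsbatch'

executor {
    // path of the mounted CLI inside the container (illustrative)
    aws_cli = '/home/ec2-user/aws-cli/bin/aws'
}
```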

Current Status

Code and docs are on the aws-batch branch and have had some preliminary testing to show the approach works. Further testing is required.

Alternative approaches

Other approaches can be considered if there are suggestions.

One suggestion was to use a dedicated Docker image that contains the AWS CLI as a 'sidecar' image to do the copying, but it wasn't clear how this could work.

Another is to avoid the use of S3 and provide the necessary files through mounted volumes (presumably this only works if NF is being executed from within AWS). In principle this seems possible, but it needs to be tested in practice.
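
One way this might look, assuming NF runs on an EC2 instance with a shared filesystem (e.g. EFS) mounted at the same path on all Batch hosts (untested; the mount point is illustrative):

```bash
# point the Nextflow work dir at the shared mount instead of an s3:// path
nextflow run main.nf -w /mnt/efs/nextflow-work
```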