awsbatch profile does not run, cannot find aws cli

kgutwin commented 4 years ago

With a fresh install of Nextflow v20.10.0, I am trying to run the Sarek test pipeline using the awsbatch executor. My command line is:

nextflow run nfcore/sarek -profile test,awsbatch --awsqueue my-test-queue --awsregion us-east-1 -w s3://my-test-bucket/workdir --outdir s3://my-test-bucket/outdir

Nextflow is able to submit jobs to the Batch queue, but they all fail with the message

bash: /home/ec2-user/miniconda/bin/aws: No such file or directory
bash: line 1: /home/ec2-user/miniconda/bin/aws: No such file or directory

This is happening because Nextflow is specifying the following command to the container when it starts:

["bash","-o","pipefail","-c","trap \"{ ret=$?; /home/ec2-user/miniconda/bin/aws --region us-east-1 s3 cp --only-show-errors .command.log s3://my-test-bucket/workdir/fa/9bafeb75c35a243f6923cbdd26ead8/.command.log||true; exit $ret; }\" EXIT; /home/ec2-user/miniconda/bin/aws --region us-east-1 s3 cp --only-show-errors s3://my-test-bucket/workdir/fa/9bafeb75c35a243f6923cbdd26ead8/.command.run - | bash 2>&1 | tee .command.log"]

When I launch the nfcore/sarek:2.6.1 Docker container manually, I can see that the /home directory is empty, and the AWS CLI does not seem to be installed anywhere.
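For anyone wanting to reproduce that check without opening an interactive shell, here is a minimal sketch. It assumes Docker is available locally; `image_has_aws` is a hypothetical helper name, and the `DOCKER` variable is only there so a different container runtime can be swapped in.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether an image has an `aws` executable on its PATH.
# DOCKER may be overridden with another runtime; it defaults to `docker`.
image_has_aws() {
    local image="$1"
    if "${DOCKER:-docker}" run --rm "$image" bash -c 'command -v aws' >/dev/null 2>&1; then
        echo "aws found in $image"
    else
        echo "aws NOT found in $image"
        return 1
    fi
}

# Example (not run automatically):
#   image_has_aws nfcore/sarek:2.6.1
```

The function returns non-zero when the probe fails, so it can be used as a guard before submitting anything to Batch.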

Should the AWS CLI be added to the list of packages installed by Conda? Or am I expected to build a custom container image including this tool? The Sarek documentation on AWS Batch implies a custom AMI, which doesn't seem to make sense in this case.

Thanks for your help!

FriederikeHanssen commented 4 years ago

Hi @kgutwin !

Sorry to hear you are having trouble running Sarek on AWS. It has been a while since I ran Sarek on AWS like this (we now use the CloudFormation scripts found here and submit from tower.nf).

Just to clarify: Are you logged into your EC2 instance? Which AMI are you using right now?

The last time we did it, we used this command:

nextflow run nf-core/sarek -profile docker -r 2.6.1 \
--outdir 's3://our-bucket/results_dir' \
-w 's3://our-bucket/workdir' \
--tracedir 's3://our-bucket/trace_' \
--input 's3://our-bucket/input.tsv' \
--genome 'GRCh38' \
--tools 'Strelka,ASCAT,snpEff' \
-c awsbatch.config \
--awsregion 'us-east-1' \
--igenomes_base 's3://our-bucket/references' \
--awscli '/home/ec2-user/miniconda/bin/aws' \
-resume

In our case, awsbatch.config contains the queue information and so on:

process {
    // Default queue for all processes (queue names must be quoted strings)
    queue = 'normal'
    withName:MapReads {
        queue = 'highmem'
    }
    withName:BamQC {
        queue = 'long'
    }
}

process.executor = 'awsbatch'
aws.region = params.awsregion
executor.awscli = params.awscli

FYI, if you want to use iGenomes later: the bucket is in an eu-west region, and it looks like you want to run your analysis in us-east. You will need to copy the references to a bucket in your own region because, as far as I know, Nextflow currently can't work with both.

From what I can see, you may need to specify where your AWS CLI lives. The AWS CLI is not installed with Sarek; it needs to be set up beforehand, when you create your AMI. You can check whether it is there by running:

aws --version

To install it:

wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -p /home/ec2-user/miniconda
/home/ec2-user/miniconda/bin/conda install -c conda-forge awscli
/home/ec2-user/miniconda/bin/aws --version

I am not 100% sure anymore whether you also need to run this afterwards:

aws configure

and set up your credentials or not.
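As a rough pre-flight check for the install steps above, something like the following could be run on the instance before building the AMI. `check_awscli` is a hypothetical helper; the default path is just the one passed to `--awscli` earlier in this comment.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: confirm the AWS CLI exists at the path
# that will be handed to Nextflow via --awscli.
check_awscli() {
    local aws_bin="${1:-/home/ec2-user/miniconda/bin/aws}"
    if [ ! -x "$aws_bin" ]; then
        echo "missing: $aws_bin"
        return 1
    fi
    echo "found: $aws_bin"
    # Print the CLI version; this fails if the binary cannot actually run.
    "$aws_bin" --version
}

# Example (not run automatically):
#   check_awscli /home/ec2-user/miniconda/bin/aws
```

On the credentials question, running `aws sts get-caller-identity` on the instance shows which identity the CLI resolves. As far as I know, Batch instances normally pick up credentials from their instance role, in which case `aws configure` should not be needed, but treat that as an assumption to verify in your own setup.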

Does this solve it?

kgutwin commented 4 years ago

It looks like I missed the step of creating the custom AMI, which I discovered in the main Nextflow docs. I'll pursue that - thanks!

FriederikeHanssen commented 4 years ago

Sounds good. 👍 Shout if it doesn't work afterwards

XLuyu commented 3 years ago

I also ran into this problem. As you know, AWS Batch starts instances (from an AMI with the AWS CLI), while the instances start Docker containers (image: nfcore/sarek:2.6.1). The problem is that this image doesn't include awscli to sync files.

A simple workaround is to build a new image based on nfcore/sarek:2.6.1 with awscli installed. However, it would be better to have it installed in the official image (don't forget nfcore/sarekvep).
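A minimal sketch of that workaround, assuming `conda` is on the PATH inside the base image (as it appears to be in conda-based nf-core images). The helper name `write_sarek_awscli_dockerfile` and the image tag in the example are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a Dockerfile that layers the AWS CLI on top of
# the Sarek image. Arguments: target directory, base image (both optional).
write_sarek_awscli_dockerfile() {
    local dir="${1:-.}"
    local base="${2:-nfcore/sarek:2.6.1}"
    mkdir -p "$dir"
    cat > "$dir/Dockerfile" <<EOF
FROM $base
# Assumption: conda is available on PATH in the base image.
RUN conda install -y -c conda-forge awscli && conda clean -a -y
EOF
}

# Example (not run automatically); the tag below is a placeholder:
#   write_sarek_awscli_dockerfile build
#   docker build -t my-registry/sarek-awscli:2.6.1 build
```

The pipeline would then have to be pointed at the new image, and the `--awscli` path should presumably no longer refer to a location that only exists on the host AMI.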

FriederikeHanssen commented 3 years ago

Hi @XLuyu !

This is not specific to Sarek; it would be relevant to all nf-core pipelines. So it may be worth discussing with @nf-core/core.

In the meantime, you could also try Tower Forge. It will set up the resources for you, so you don't have to worry about the AWS CLI anymore. In addition, you can easily supervise all your runs. See here: https://help.tower.nf/docs/compute-envs/aws-batch/

Sorry you are having trouble with AWS Batch. Hopefully this helps a bit 🙂

nf-core / sarek

awsbatch profile does not run, cannot find aws cli #301