SciLifeLab / Sarek

Detect germline or somatic variants from normal or tumour/normal whole-genome or targeted sequencing
https://nf-co.re/sarek
MIT License

containerOverrides for AWS BATCH nextflow #804

Closed Dharmendra-G-1 closed 4 years ago

Dharmendra-G-1 commented 5 years ago

Hello, I am trying to run Sarek on files stored in AWS S3, using AWS SageMaker. Because I work with a non-human reference, I am starting by building the reference.

containerOverrides={
        'command': [
            "s3://{0}/{1}".format(workflowBucket, workflowFolderPrefix),
            "./build.nf",
            "--refDir", "s3://reference/Sus_scrofa.Sscrofa11.1.dna.toplevel_with_PL10R.fa",
            "--outDir", "s3://reference/genome_indexed/"
        ]
    }

The job launches with the above command, but I get the following error. Can anyone tell me what I am missing so that only build.nf is launched, and not annotate.nf, germlineVC.nf, main.nf, runMultiQC.nf and somaticVC.nf?

Waiting for head job to start...
Head job is running...
s3://nextflowdata/scripts ./build.nf --refDir s3://reference/Sus_scrofa.Sscrofa11.1.dna.toplevel_with_PL10R.fa --outDir s3://reference/genome_indexed/
Transitioning to Nextflow
nextflow run ./annotate.nf
./build.nf
./germlineVC.nf
./main.nf
./runMultiQC.nf
./somaticVC.nf ./build.nf --refDir s3://reference/Sus_scrofa.Sscrofa11.1.dna.toplevel_with_PL10R.fa --outDir s3://reference/genome_indexed/
N E X T F L O W  ~  version 19.04.0
Launching `./annotate.nf` [desperate_golick] - revision: ef015b173c
ERROR ~ Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 5844A0D0CBDA9A83; S3 Extended Request ID: O0glINPc98XxNCms76f7CYzbJaudF2x50Wi/Ixam9dV+qoKcKZh+2PaOJ4Jt5WcPK7D4CyYsNv0=)
 -- Check '.nextflow.log' file for details
Head job FAILED

Another trial, with a directory that contains the reference FASTA files:

Waiting for head job to start...
Head job is running...
s3://nextflowdata/scripts ./build.nf --refDir s3://reference_for_sarek/Sus11.1v95_plus_PL9 --outDir s3://reference_for_sarek/Sus11.1v95_plus_PL9/genome_indexed
Transitioning to Nextflow
nextflow run ./annotate.nf
./build.nf
./germlineVC.nf
./main.nf
./runMultiQC.nf
./somaticVC.nf ./build.nf --refDir s3://reference_for_sarek/Sus11.1v95_plus_PL9 --outDir s3://reference_for_sarek/Sus11.1v95_plus_PL9/genome_indexed
N E X T F L O W  ~  version 19.04.0
Launching `./annotate.nf` [kickass_swartz] - revision: ef015b173c
ERROR ~ Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: C735A6E6BFF21292; S3 Extended Request ID: POCd1B30d3NVGJ8KwZeXqJKsAKzYs1/3CEzaJntuR2xsLuk1ZEWpIonFJYUpG2T3ZfkFQRhjccw=)
 -- Check '.nextflow.log' file for details
Head job FAILED

Please let us know the best way to pass containerOverrides to Sarek (Nextflow) on AWS Batch.

Thanks,

With Regards, Dharm
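In both logs, the `nextflow run` line lists every `.nf` file in the project folder ahead of the arguments that were passed in, and Nextflow launches the first one (`./annotate.nf`). The following is a minimal sketch of how such a command line could come about. It assumes, hypothetically, that the wrapper inside the container collects all top-level `.nf` files and places them before the user's arguments. The actual entrypoint script is not shown in this thread, so `assemble_nextflow_command` is an illustrative name, not part of Sarek or Nextflow.

```python
def assemble_nextflow_command(found_scripts, user_args):
    """Hypothetical reconstruction of the behaviour visible in the log:
    every discovered .nf file is placed before the user's arguments."""
    return ["nextflow", "run", *sorted(found_scripts), *user_args]


# Scripts listed in the log output (order on disk does not matter here).
found_scripts = [
    "./main.nf",
    "./build.nf",
    "./annotate.nf",
    "./germlineVC.nf",
    "./runMultiQC.nf",
    "./somaticVC.nf",
]

# Arguments taken from the second trial in the issue.
user_args = [
    "./build.nf",
    "--refDir", "s3://reference_for_sarek/Sus11.1v95_plus_PL9",
    "--outDir", "s3://reference_for_sarek/Sus11.1v95_plus_PL9/genome_indexed",
]

command = assemble_nextflow_command(found_scripts, user_args)

# `nextflow run` takes the first positional argument as the pipeline to launch.
launched = command[2]
print(" ".join(command))
print("launched:", launched)
```

Under that assumption, naming `./build.nf` in the overrides cannot change which script runs, because it arrives after the scripts the wrapper already added. The 403 `AccessDenied` response comes from S3 itself and may be a separate permissions question.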

maxulysse commented 5 years ago

I'm afraid I have never used containerOverrides or SageMaker when playing with AWS.

I'm invoking the AWS experts, our own A-Team @alneberg @apeltzer @KochTobi

Have you looked at @apeltzer's blog post about running pipelines on AWS? https://apeltzer.github.io/post/01-aws-nfcore/

On a separate note, we're currently porting Sarek to nf-core, and putting all the scripts into one, so it'll be easier to use as well.

Dharmendra-G-1 commented 5 years ago

It is great to hear that you are porting Sarek to nf-core. I did look at the blog post by @apeltzer, but nothing is mentioned there about AWS Batch containerOverrides; it is in troubleshooting mode. I believe half of the problems in the blog post can be solved by using AWS SageMaker to provide the parameters. If you or @apeltzer have any guidelines on passing parameters via containerOverrides on AWS Batch, please let us know. Thanks.
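For reference, a minimal sketch of how the `containerOverrides` fragment from the first comment might fit into a full job submission with boto3. The job name, queue and job definition are placeholders, not values from this thread. `build_container_overrides` and `submit_head_job` are hypothetical helper names. How the container interprets the command list depends entirely on its entrypoint script, which is not shown here.

```python
def build_container_overrides(project_uri, script, params):
    """Build the containerOverrides dict: project location, script name,
    then each pipeline parameter as a `--name value` pair."""
    command = [project_uri, script]
    for name, value in params.items():
        command += ["--{0}".format(name), value]
    return {"command": command}


def submit_head_job(job_name, job_queue, job_definition, overrides):
    """Submit the head job to AWS Batch and return its job ID.
    boto3 is imported here so the helper above works without it."""
    import boto3

    batch = boto3.client("batch")
    response = batch.submit_job(
        jobName=job_name,
        jobQueue=job_queue,
        jobDefinition=job_definition,
        containerOverrides=overrides,
    )
    return response["jobId"]


# Values taken from the first log in the issue.
overrides = build_container_overrides(
    "s3://nextflowdata/scripts",
    "./build.nf",
    {
        "refDir": "s3://reference/Sus_scrofa.Sscrofa11.1.dna.toplevel_with_PL10R.fa",
        "outDir": "s3://reference/genome_indexed/",
    },
)
print(overrides["command"])

# Placeholder names; replace with your own queue and job definition:
# job_id = submit_head_job("sarek-build", "my-queue", "my-nextflow-jobdef", overrides)
```

Keeping the command construction separate from the submission makes it possible to print and inspect the exact list before any job is started.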

maxulysse commented 5 years ago

Quick question: have you tried making a new profile with configuration files specific to your genome? It seems to me it might be easier to change that at this level than with containerOverrides.

maxulysse commented 4 years ago

Closing due to moving to nf-core/sarek. If you still have the issue @Dharmendra-G-1 can you please open a new one on the nf-core repo? All the best, Maxime