nf-core / fetchngs

Pipeline to fetch metadata and raw FastQ files from public databases
https://nf-co.re/fetchngs
MIT License

A DataflowVariable can only be assigned once #141

Closed artur-matysik closed 1 year ago

artur-matysik commented 1 year ago

Description of the bug

Hi,

I ran into a weird problem when trying to download a dataset. The pipeline was running fine, but at some point it got kicked out with the error:

A DataflowVariable can only be assigned once. Use bind() to allow for equal values to be passed into already-bound variables.

The same thing happens after -resume. What might be the problem here? I used --force_sratools_download true as the data is only available via SRA.

Command used and terminal output

# Command
nextflow run ~/pipelines/fetchngs/main.nf \
--input s3://<bucket>/PRJDB4176_Yachida_NatMed_2019/SRR_Acc_List.txt \
--outdir s3://<bucket>/PRJDB4176_Yachida_NatMed_2019/ \
--force_sratools_download true \
-profile docker \
-work-dir s3://<bucket>/work/ \
-c awsbatch.config

# Config
aws.region          = 'ap-southeast-1'
aws.batch.cliPath   = '/home/ec2-user/miniconda/bin/aws'
process.executor    = 'awsbatch'
process.queue       = '<queue_name>'

params {
    nf_core_pipeline        = 'taxprofiler'

    // Max
    max_cpus   = 12
    max_memory = '47.GB'
    max_time   = '10.h'
}

process {
  withName: SRA_IDS_TO_RUNINFO {
    errorStrategy = 'ignore'
  }
}

# Output
[bf/d23b8b] process > NFCORE_FETCHNGS:SRA:SRA_IDS_TO_RUNINFO (DRR162776)                                                           [100%] 80 of 80, cached: 80 ✔
[d5/48f4ea] process > NFCORE_FETCHNGS:SRA:SRA_RUNINFO_TO_FTP (80)                                                                  [100%] 80 of 80, cached: 80 ✔
[-        ] process > NFCORE_FETCHNGS:SRA:SRA_FASTQ_FTP                                                                            -
[9b/3a9b61] process > NFCORE_FETCHNGS:SRA:FASTQ_DOWNLOAD_PREFETCH_FASTERQDUMP_SRATOOLS:CUSTOM_SRATOOLSNCBISETTINGS (ncbi-settings) [100%] 1 of 1, cached: 1 ✔
[dd/fb9afb] process > NFCORE_FETCHNGS:SRA:FASTQ_DOWNLOAD_PREFETCH_FASTERQDUMP_SRATOOLS:SRATOOLS_PREFETCH (DRR162776)               [100%] 80 of 80, cached: 80 ✔
[de/281c46] process > NFCORE_FETCHNGS:SRA:FASTQ_DOWNLOAD_PREFETCH_FASTERQDUMP_SRATOOLS:SRATOOLS_FASTERQDUMP (DRX153395_DRR162776)  [100%] 58 of 58, cached: 27, failed: 1
[ff/675871] process > NFCORE_FETCHNGS:SRA:SRA_TO_SAMPLESHEET (DRX120436_DRR127692)                                                 [100%] 57 of 57, cached: 27
[-        ] process > NFCORE_FETCHNGS:SRA:SRA_MERGE_SAMPLESHEET                                                                    -
[-        ] process > NFCORE_FETCHNGS:SRA:MULTIQC_MAPPINGS_CONFIG                                                                  -
[-        ] process > NFCORE_FETCHNGS:SRA:CUSTOM_DUMPSOFTWAREVERSIONS                                                              -
Execution cancelled -- Finishing pending tasks before exit
WARN: Unable to get file attributes file: s3://bm-ks-nextflow-workdir/Art/2023-04-06_PRJDB4176/work/c0/3cfaf4f3e6178473eafb2345965a26/versions.yml -- Cause: com.amazonaws.AbortedException:
WARN: Unable to get file attributes file: s3://bm-ks-nextflow-workdir/Art/2023-04-06_PRJDB4176/work/9b/3a9b61d58df2eb3a2ae563006c13f3/versions.yml -- Cause: com.amazonaws.AbortedException:
WARN: Unable to get file attributes file: s3://bm-ks-nextflow-workdir/Art/2023-04-06_PRJDB4176/work/ab/1ad01b84ab8707e25d3d1630fb4a28/versions.yml -- Cause: com.amazonaws.AbortedException:
WARN: Unable to get file attributes file: s3://bm-ks-nextflow-workdir/Art/2023-04-06_PRJDB4176/work/c0/3cfaf4f3e6178473eafb2345965a26/versions.yml -- Cause: com.amazonaws.AbortedException:
WARN: Got an interrupted exception while taking agent result | java.lang.InterruptedException
Error executing process > 'NFCORE_FETCHNGS:SRA:FASTQ_DOWNLOAD_PREFETCH_FASTERQDUMP_SRATOOLS:SRATOOLS_FASTERQDUMP (DRX120481_DRR127737)'

Caused by:
  Task failed to start - DockerTimeoutError: Could not transition to created; timed out after waiting 4m0s

Command executed:

  export NCBI_SETTINGS="$PWD/user-settings.mkfg"

  fasterq-dump \
       \
      --threads 6 \
      --outfile DRX120481_DRR127737 \
      DRR127737

  pigz \
       \
      --no-name \
      --processes 6 \
      *.fastq

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_FETCHNGS:SRA:FASTQ_DOWNLOAD_PREFETCH_FASTERQDUMP_SRATOOLS:SRATOOLS_FASTERQDUMP":
      sratools: $(fasterq-dump --version 2>&1 | grep -Eo '[0-9.]+')
      pigz: $( pigz --version 2>&1 | sed 's/pigz //g' )
  END_VERSIONS

Command exit status:
  -

Command output:
  (empty)

Work dir:
  s3://bm-ks-nextflow-workdir/Art/2023-04-06_PRJDB4176/work/d8/07dca0e134f49630ced4795526a976

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

Unexpected error [AbortedException]

A DataflowVariable can only be assigned once. Use bind() to allow for equal values to be passed into already-bound variables.

[AWS BATCH] Waiting jobs reaper to complete (7 jobs to be terminated)

Relevant files

nextflow.log

System information

artur-matysik commented 1 year ago

I ran the problematic samples alone and they went through with no problem, so it looks like it's not really a sample- or pipeline-specific problem but rather an AWS Batch (I/O?) issue.
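
As a side note, one untested way to tolerate this kind of transient failure would be a retry errorStrategy on the download process in the custom config (a sketch only, matching the simple process name from the trace above):

process {
  withName: SRATOOLS_FASTERQDUMP {
    // retry transient AWS Batch / Docker start-up failures instead of aborting
    errorStrategy = 'retry'
    maxRetries    = 3
  }
}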

For anyone dealing with a similar issue: I increased the mounted storage via a launch template (to 1000 GB, gp3), and after resuming, the pipeline completed.
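
Roughly along these lines (a sketch only; the template name and device name are placeholders, not the exact template used here):

# Launch template with a 1000 GB gp3 root volume for the Batch compute instances.
# Template name and device name are placeholders.
aws ec2 create-launch-template \
    --launch-template-name nf-batch-bigdisk \
    --launch-template-data '{
        "BlockDeviceMappings": [
            { "DeviceName": "/dev/xvda", "Ebs": { "VolumeSize": 1000, "VolumeType": "gp3" } }
        ]
    }'

The AWS Batch compute environment then needs to reference this launch template so new instances come up with the larger volume.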

Closing the issue for now.