nf-core / ampliseq

Amplicon sequencing analysis workflow using DADA2 and QIIME2
https://nf-co.re/ampliseq
MIT License

docker cutadapt file not found #585

Closed: scheckley closed 1 year ago

scheckley commented 1 year ago

Running the pipeline with the Docker option on a folder of fastq files produces a file-not-found error during the cutadapt process. The pipeline completes without any errors when using the Singularity option. The Python stack trace is included below:

Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_AMPLISEQ:AMPLISEQ:CUTADAPT_WORKFLOW:CUTADAPT_BASIC (ANON00017)'

Caused by:
  Process `NFCORE_AMPLISEQ:AMPLISEQ:CUTADAPT_WORKFLOW:CUTADAPT_BASIC (ANON00017)` terminated with an error exit status (1)

Command executed:

  cutadapt \
      --cores 6 \
      --minimum-length 1 -O 3 -e 0.1 -g TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG -G GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC --discard-untrimmed \
      -o ANON00017.trimmed_1.trim.fastq.gz -p ANON00017.trimmed_2.trim.fastq.gz \
      ANON00017_1.fastq.gz ANON00017_2.fastq.gz \
      > ANON00017.trimmed.cutadapt.log
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_AMPLISEQ:AMPLISEQ:CUTADAPT_WORKFLOW:CUTADAPT_BASIC":
      cutadapt: $(cutadapt --version)
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  Traceback (most recent call last):
    File "/usr/local/bin/cutadapt", line 10, in <module>
      sys.exit(main_cli())
    File "/usr/local/lib/python3.9/site-packages/cutadapt/__main__.py", line 848, in main_cli
      main(sys.argv[1:])
    File "/usr/local/lib/python3.9/site-packages/cutadapt/__main__.py", line 913, in main
      stats = r.run()
    File "/usr/local/lib/python3.9/site-packages/cutadapt/pipeline.py", line 825, in run
      raise e
  FileNotFoundError: [Errno 2] No such file or directory: 'ANON00017_1.fastq.gz'

Work dir:
  /home/test_dev/work/0a/ac598ed01bf496c321fea82ab65589

Not sure whether this is a Docker configuration error or a problem with the symbolic links created in rename_raw_data_files.nf not being visible from inside the container. The fastq files are located in a directory mounted as /data/Fastq/ on an Azure virtual machine, and Nextflow is pointed at that directory via the --input option.

Thanks.

d4straub commented 1 year ago

Also discussed on Slack; probably an issue with the Azure virtual machine?

scheckley commented 1 year ago

It turned out to be an issue with Docker: symbolic links don't work for mounted storage, similar to this issue on Stack Overflow. It is not related to Nextflow or ampliseq.
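
For anyone hitting the same error, here is a minimal sketch of one way to check for it. It assumes the inputs in the task work directory are staged as symbolic links, as the mention of rename_raw_data_files.nf above suggests. The function name find_broken_symlinks is hypothetical and not part of Nextflow, ampliseq, or cutadapt. It lists links whose targets cannot be resolved from wherever the script runs.

```python
import os
from pathlib import Path


def find_broken_symlinks(directory):
    """Return (link name, link target) pairs for dangling symlinks in `directory`.

    Hypothetical helper for illustration only. It looks at direct children
    of `directory` and does not recurse into subdirectories.
    """
    broken = []
    for entry in sorted(Path(directory).iterdir()):
        # is_symlink() inspects the link itself, while exists() follows the
        # link, so a link with an unreachable target is a symlink that does
        # not "exist".
        if entry.is_symlink() and not entry.exists():
            broken.append((entry.name, os.readlink(entry)))
    return broken
```

Calling find_broken_symlinks on the work dir reported in the error (here /home/test_dev/work/0a/ac598ed01bf496c321fea82ab65589) once on the host and once from inside the container would show whether the links resolve in one environment but not the other. An empty list on the host and a non-empty list in the container would point at the mount rather than at the pipeline.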