Process `NFCORE_NANOSEQ:NANOSEQ:QCFASTQ_NANOPLOT_FASTQC:NANOPLOT (23046_LEC_R1)` terminated with an error exit status (2)

gzentner commented 1 year ago

Description of the bug

I am attempting to run Nanoseq on some direct RNA-seq data with the following command, and get the ensuing error regarding Nanoplot. The rest of the pipeline seems to work fine if Nanoplot is skipped, but I'd really like to have that read length information. Thanks!

Command used and terminal output

`nextflow run nf-core/nanoseq --input samples.csv --protocol directRNA --skip_demultiplexing -profile docker --skip_fusion_analysis --max_memory 48GB --skip_modification_analysis --outdir output --skip_quantification`

Command executed:

  NanoPlot \
       \
      -t 2 \

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_NANOSEQ:NANOSEQ:QCFASTQ_NANOPLOT_FASTQC:NANOPLOT":
      nanoplot: $(echo $(NanoPlot --version 2>&1) | sed 's/^.*NanoPlot //; s/ .*$//')
  END_VERSIONS

Command exit status:
  2

Command output:
  (empty)

Command error:
  usage: NanoPlot [-h] [-v] [-t THREADS] [--verbose] [--store] [--raw] [--huge]
                  [-o OUTDIR] [--no_static] [-p PREFIX] [--tsv_stats]
                  [--info_in_report] [--maxlength N] [--minlength N]
                  [--drop_outliers] [--downsample N] [--loglength]
                  [--percentqual] [--alength] [--minqual N] [--runtime_until N]
                  [--readtype {1D,2D,1D2}] [--barcoded] [--no_supplementary]
                  [-c COLOR] [-cm COLORMAP]
                  [-f [{png,jpg,jpeg,webp,svg,pdf,eps,json} ...]]
                  [--plots [{kde,hex,dot} ...]] [--legacy [{kde,dot,hex} ...]]
                  [--listcolors] [--listcolormaps] [--no-N50] [--N50]
                  [--title TITLE] [--font_scale FONT_SCALE] [--dpi DPI]
                  [--hide_stats]
                  (--fastq file [file ...] | --fasta file [file ...] | --fastq_rich file [file ...] | --fastq_minimal file [file ...] | --summary file [file ...] | --bam file [file ...] | --ubam file [file ...] | --cram file [file ...] | --pickle pickle | --feather file [file ...])
  NanoPlot: error: one of the arguments --fastq --fasta --fastq_rich --fastq_minimal --summary --bam --ubam --cram --pickle --feather is required

Work dir:
  /workdir/nanopore/work/19/fb7954da22118117bb6aafde8dd7df

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

 -- Check '.nextflow.log' file for details

Relevant files

No response

System information

Nextflow version: 23.04.2
Hardware: AWS EC2
Executor: local
Container engine: Docker
Nanoseq version: 3.1.0

paulmenzel commented 1 year ago

A user is seeing the same issue. How were you able to see the exact commands executed? I had to go into the work directory listed at the end of the error output and look at .command.sh:

$ more .command.sh
#!/bin/bash -euo pipefail
NanoPlot \
     \
    -t 2 \

cat <<-END_VERSIONS > versions.yml
"NFCORE_NANOSEQ:NANOSEQ:QCFASTQ_NANOPLOT_FASTQC:NANOPLOT":
    nanoplot: $(echo $(NanoPlot --version 2>&1) | sed 's/^.*NanoPlot //; s/ .*$//')
END_VERSIONS

The call at the top, NanoPlot -t 2, looks incomplete.

PS: For the logging, setting the environment variable NXF_DEBUG gives me a similar output to yours:

$ NXF_DEBUG=3 nextflow run nf-core/nanoseq …
[…]
Execution cancelled -- Finishing pending tasks before exit
ERROR ~ Error executing process > 'NFCORE_NANOSEQ:NANOSEQ:QCFASTQ_NANOPLOT_FASTQC:NANOPLOT (ONTDNAPIG001_R1)'

Caused by:
  Process `NFCORE_NANOSEQ:NANOSEQ:QCFASTQ_NANOPLOT_FASTQC:NANOPLOT (ONTDNAPIG001_R1)` terminated with an error exit status (2)

Command executed:

  NanoPlot \
       \
      -t 2 \

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_NANOSEQ:NANOSEQ:QCFASTQ_NANOPLOT_FASTQC:NANOPLOT":
      nanoplot: $(echo $(NanoPlot --version 2>&1) | sed 's/^.*NanoPlot //; s/ .*$//')
  END_VERSIONS

Command exit status:
  2

Command output:
  (empty)

Command error:
  +++ for i in {1..7}
  +++ '[' 18043364 -lt 17895908 ']'
  +++ cpu_peak[i]=18043364
  +++ for i in {1..7}
  +++ '[' 145476 -lt 101036 ']'
  +++ cpu_peak[i]=145476
  +++ for i in {1..7}
  +++ '[' 6446 -lt 4502 ']'
  +++ cpu_peak[i]=6446
  +++ for i in {1..7}
  +++ '[' 274 -lt 272 ']'
  +++ cpu_peak[i]=274
  +++ '[' 3 = 1 ']'
  +++ nxf_stat_ret=(${sum[*]})
  +++ '[' 3 -lt 10 ']'
  +++ timeout=1
  +++ read -t 1 -r DONE
  usage: NanoPlot [-h] [-v] [-t THREADS] [--verbose] [--store] [--raw] [--huge]
                  [-o OUTDIR] [--no_static] [-p PREFIX] [--tsv_stats]
                  [--info_in_report] [--maxlength N] [--minlength N]
                  [--drop_outliers] [--downsample N] [--loglength]
                  [--percentqual] [--alength] [--minqual N] [--runtime_until N]
                  [--readtype {1D,2D,1D2}] [--barcoded] [--no_supplementary]
                  [-c COLOR] [-cm COLORMAP]
                  [-f [{png,jpg,jpeg,webp,svg,pdf,eps,json} ...]]
                  [--plots [{kde,hex,dot} ...]] [--legacy [{kde,dot,hex} ...]]
                  [--listcolors] [--listcolormaps] [--no-N50] [--N50]
                  [--title TITLE] [--font_scale FONT_SCALE] [--dpi DPI]
executor >  local (5)

paulmenzel commented 1 year ago

The problem seems to be an empty input_file in the nanoplot module, when the last branch(?) is taken:

https://github.com/nf-core/nanoseq/blob/6e563e54362cddb8e48d15c156251708c22d0e8d/modules/nf-core/nanoplot/main.nf#L23-L31

No idea how to easily debug this.

paulmenzel commented 1 year ago

@gzentner, maybe you can update the issue title to module/nanoplot: Empty input_files leads to incorrect NanoPlot call.

sklages commented 1 year ago

@paulmenzel - just a wild guess: our user's input file is named xxx001.fq.gz. nf-core/nanoplot/main.nf looks for the input file extensions ".fastq.gz" or ".txt" to deduce the input type and assign the correct parameter, either --fastq or --summary. If neither extension is found, input_file is set to the empty string (''), which causes NanoPlot to fail ...

So we should either fix the nf file or the input file naming ...
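
For the "input file naming" route, here is a minimal sketch of a non-destructive workaround. It assumes the inputs sit in one directory and that the samplesheet can be pointed at the new names. The function name `link_fq_as_fastq` is hypothetical and not part of nanoseq, and this has not been tested against the pipeline itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of nanoseq): for every *.fq.gz file in a
# directory, create a sibling symlink ending in .fastq.gz so that the
# module's extension check recognises the file.
link_fq_as_fastq() {
    local dir="$1" f target
    for f in "$dir"/*.fq.gz; do
        # If the glob matched nothing, "$f" is the literal pattern; skip it.
        [ -e "$f" ] || continue
        target="${f%.fq.gz}.fastq.gz"
        # Do not overwrite an existing file or link with the target name.
        [ -e "$target" ] || ln -s "$(basename "$f")" "$target"
    done
}

# Example usage (the directory name is an assumption):
#   link_fq_as_fastq ./fastq
```

The samplesheet would then need to reference the `.fastq.gz` names. If symlinks turn out to be a problem inside the container, renaming the files is the simpler variant.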

ashotmarg commented 5 months ago

Hi guys, I was also getting this error when giving an fq.gz file instead of the expected fastq.gz file. I've slightly modified the nanoplot module file and created a pull request for this. Changing the "def input_file" section to the below seems to have fixed it.

...
def input_file = ("$ontfile".endsWith(".fastq.gz") || "$ontfile".endsWith(".fq.gz")) ? "--fastq ${ontfile}" :
    ("$ontfile".endsWith(".txt")) ? "--summary ${ontfile}" : ''
...
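
To see which file names the patched condition covers, the same extension check can be imitated in plain shell. This is only a sketch of the logic in the snippet above, not the module's actual code, and the function name `nanoplot_input_arg` is made up for illustration.

```shell
#!/usr/bin/env bash
# Shell imitation of the patched extension check (hypothetical helper).
# Prints the argument that would be passed to NanoPlot, or nothing if the
# extension is not recognised.
nanoplot_input_arg() {
    local ontfile="$1"
    case "$ontfile" in
        *.fastq.gz|*.fq.gz) printf '%s' "--fastq $ontfile" ;;
        *.txt)              printf '%s' "--summary $ontfile" ;;
        *)                  : ;;  # falls through to the empty string
    esac
}

# Example usage:
#   nanoplot_input_arg xxx001.fq.gz
#   nanoplot_input_arg sequencing_summary.txt
```

Note that any other extension, for example an uncompressed `.fastq`, still falls through to the empty string and would give the same incomplete NanoPlot call.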
