jhuapl-bio / taxtriage

TaxTriage is a Nextflow workflow designed to agnostically identify and classify microbial organisms within short- or long-read metagenomic NGS data. This flexible tool was developed with various use-cases of mNGS in mind.
MIT License

bin/merge_assemblies_conf.py assumes file content, no QA check #50

Closed chrisgulvik closed 7 months ago

chrisgulvik commented 7 months ago

Description of the bug

I ran a test (command below) in which the output file "BC05_flu.confidences.tsv" contained the header line but no data rows. The Python script then failed with this error:

Command error:
  Traceback (most recent call last):
    File "/my-local/.nextflow/assets/jhuapl-bio/taxtriage/bin/merge_assemblies_conf.py", line 108, in <module>
      main()
    File "/my-local/.nextflow/assets/jhuapl-bio/taxtriage/bin/merge_assemblies_conf.py", line 102, in main
      writer = csv.DictWriter(f, fieldnames=aggregated[0].keys(), delimiter='\t')
  IndexError: list index out of range

A check in this script before it parses the file, or a QA step earlier in the workflow, could stop the workflow and report the problem clearly.
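For illustration, the in-script check could look something like the sketch below. It is not the code in merge_assemblies_conf.py: the function name write_aggregated and the wording of the error message are hypothetical, and it assumes only what the traceback shows, namely that the script builds a list of dicts called aggregated and passes aggregated[0].keys() to csv.DictWriter.

```python
import csv
import sys


def write_aggregated(rows, out_path):
    """Write a list of dicts as TSV, stopping with a clear message if the list is empty.

    Hypothetical helper: `rows` stands in for the `aggregated` list in the traceback.
    """
    if not rows:
        # rows[0] would raise IndexError here, so exit with a readable message instead.
        sys.exit(f"ERROR: no data rows to write to {out_path} (input had a header only?)")
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(
            f,
            fieldnames=list(rows[0].keys()),
            delimiter="\t",
            lineterminator="\n",
        )
        writer.writeheader()
        writer.writerows(rows)
```

Exiting with a non-zero status still fails the Nextflow process, but the log would then name the empty input rather than show an IndexError.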

Command used and terminal output

nextflow run https://github.com/jhuapl-bio/taxtriage \
   --outdir tmp_viral \
   -resume \
   --input examples/Samplesheet.csv \
   --taxtab "default" \
   -r main \
   -latest \
   -profile local,singularity \
   --db /my-local/reference/kraken-databases/minusB

Relevant files

[21/6fb402] process > NFCORE_TAXTRIAGE:TAXTRIAGE:ALIGNMEN... [100%] 3 of 3
[- ] process > NFCORE_TAXTRIAGE:TAXTRIAGE:ALIGNMEN... -
[- ] process > NFCORE_TAXTRIAGE:TAXTRIAGE:ALIGNMEN... -
[5e/c55e0d] process > NFCORE_TAXTRIAGE:TAXTRIAGE:ALIGNMEN... [100%] 3 of 3 ✔
[a8/909432] process > NFCORE_TAXTRIAGE:TAXTRIAGE:ALIGNMEN... [100%] 4 of 4 ✔
[82/97500c] process > NFCORE_TAXTRIAGE:TAXTRIAGE:CONFIDEN... [100%] 4 of 4 ✔
[03/1f58a6] process > NFCORE_TAXTRIAGE:TAXTRIAGE:CONFIDEN... [100%] 4 of 4, failed: 1 ✘
[4a/13073e] process > NFCORE_TAXTRIAGE:TAXTRIAGE:CONVERT... [100%] 3 of 3 ✔
[- ] process > NFCORE_TAXTRIAGE:TAXTRIAGE:MERGE_CO... [ 0%] 0 of 1
[01/01eeeb] process > NFCORE_TAXTRIAGE:TAXTRIAGE:CUSTOM_D... [100%] 1 of 1 ✔
[- ] process > NFCORE_TAXTRIAGE:TAXTRIAGE:MULTIQC -
Execution cancelled -- Finishing pending tasks before exit
-[nf-core/taxtriage] Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_TAXTRIAGE:TAXTRIAGE:CONFIDENCE_MERGE (BC05_flu)'

Caused by: Process NFCORE_TAXTRIAGE:TAXTRIAGE:CONFIDENCE_MERGE (BC05_flu) terminated with an error exit status (1)

Command executed:

  merge_assemblies_conf.py \
      -i BC05_flu.confidences.tsv \
      -o BC05_flu.fullconfidences.tsv

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_TAXTRIAGE:TAXTRIAGE:CONFIDENCE_MERGE":
      python3: $(python3 --version | sed 's/Python //g')
  END_VERSIONS

Command exit status: 1

Command output: (empty)

Command error:
  Traceback (most recent call last):
    File "/my-local/.nextflow/assets/jhuapl-bio/taxtriage/bin/merge_assemblies_conf.py", line 108, in <module>
      main()
    File "/my-local/.nextflow/assets/jhuapl-bio/taxtriage/bin/merge_assemblies_conf.py", line 102, in main
      writer = csv.DictWriter(f, fieldnames=aggregated[0].keys(), delimiter='\t')
  IndexError: list index out of range

System information

  N E X T F L O W
  version 23.10.0 build 5889
  created 15-10-2023 15:07 UTC (11:07 EDT)
  cite doi:10.1038/nbt.3820
  http://nextflow.io

local (no job scheduler)

singularity version 3.8.7-1.el7

Operating System: CentOS Stream 8
CPE OS Name: cpe:/o:centos:centos:8
Kernel: Linux 4.18.0-536.el8.x86_64
Architecture: x86-64

TaxTriage version: main branch of https://github.com/jhuapl-bio/taxtriage; also tested the v1.2.2 tag with git checkout tags/v1.2.2

Merritt-Brian commented 7 months ago

Good catch, will implement a catch for empty files
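A check for a header-only file might be sketched as follows. This is an illustration, not the fix that was committed: the function name has_data_rows is hypothetical, and it assumes the confidences file is tab-delimited with one header line.

```python
import csv


def has_data_rows(tsv_path):
    """Return True if the TSV has at least one non-blank row after its header line."""
    with open(tsv_path, newline="") as f:
        reader = csv.reader(f, delimiter="\t")
        next(reader, None)  # skip the header; the default avoids StopIteration on a 0-byte file
        # Rows that are empty or contain only whitespace cells do not count as data.
        return any(any(cell.strip() for cell in row) for row in reader)
```

A caller could run this on the input file before aggregating and then skip the sample or exit with a clear message when it returns False.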

Merritt-Brian commented 7 months ago

Completed with a recent push to main: commit 39b04fc5a6d655cba9a3cf91485a36ab75d0ce9e