nf-core / mag

Assembly and binning of metagenomes
https://nf-co.re/mag
MIT License

pipeline halts when quast finds no min length contigs #40

Closed ivelsko closed 1 year ago

ivelsko commented 4 years ago

Hi, I'm running mag and have run into a problem: the pipeline throws an error and stops when quast reaches a contig file that has no contigs meeting the minimum length requirement. Is it possible to create a workaround where quast skips these files and the pipeline moves on?

My command:

nextflow run nf-core/mag \
  --reads 'input/*.R{1,2}.fastq.gz' \
  -profile shh \
  --kraken2_db 'ftp://ftp.ccb.jhu.edu/pub/data/kraken2_dbs/minikraken2_v2_8GB_201904_UPDATE.tgz' \
  --outdir 'output' \
  -name 'cmc_assembly' \
  -w 'output/work'

And the error:

ERROR ~ Error executing process > 'quast (MEGAHIT-LIB050.A0105.SG1.1)'

Caused by: Process quast (MEGAHIT-LIB050.A0105.SG1.1) terminated with an error exit status (4)

Command executed:

metaquast.py --threads "1" --rna-finding --max-ref-number 0 -l "MEGAHIT-LIB050.A0105.SG1.1" "LIB050.A0105.SG1.1.contigs.fa" -o "LIB050.A0105.SG1.1_QC"

Command exit status: 4

Command output: /opt/conda/envs/nf-core-mag-1.0.0/lib/python3.6/site-packages/quast-5.0.2-py3.6.egg-info/scripts/metaquast.py --threads 1 --rna-finding --max-ref-number 0 -l MEGAHIT-LIB050.A0105.SG1.1 LIB050.A0105.SG1.1.contigs.fa -o LIB050.A0105.SG1.1_QC

Version: 5.0.2

System information:
  OS: Linux-4.4.0-38-generic-x86_64-with-debian-9.9 (linux_64)
  Python version: 3.6.7
  CPUs number: 64

Started: 2020-02-19 13:07:10

Logging to LIB050.A0105.SG1.1_QC/metaquast.log

Contigs: Pre-processing... WARNING: Skipping MEGAHIT-LIB050.A0105.SG1.1 because it doesn't contain contigs >= 0 bp.

ERROR! None of the assembly files contains correct contigs. Please, provide different files or decrease --min-contig threshold.

Command wrapper: /opt/conda/envs/nf-core-mag-1.0.0/lib/python3.6/site-packages/quast-5.0.2-py3.6.egg-info/scripts/metaquast.py --threads 1 --rna-finding --max-ref-number 0 -l MEGAHIT-LIB050.A0105.SG1.1 LIB050.A0105.SG1.1.contigs.fa -o LIB050.A0105.SG1.1_QC

Version: 5.0.2

System information:
  OS: Linux-4.4.0-38-generic-x86_64-with-debian-9.9 (linux_64)
  Python version: 3.6.7
  CPUs number: 64

Started: 2020-02-19 13:07:10

Logging to LIB050.A0105.SG1.1_QC/metaquast.log

Contigs: Pre-processing... WARNING: Skipping MEGAHIT-LIB050.A0105.SG1.1 because it doesn't contain contigs >= 0 bp.

ERROR! None of the assembly files contains correct contigs. Please, provide different files or decrease --min-contig threshold.

Work dir: /projects1/microbiome_calculus/Cameroon_plaque/04-analysis/assembly/output/work/3f/ff67506920c98fbed4f39ec6eda583

Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run -- Check '.nextflow.log' file for details

d4straub commented 4 years ago

It's a little strange that the file contains no contigs at all. In the long run, it might be best to investigate how that happens and then correct the underlying issue.
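To find out which assemblies came out empty, a small shell helper along these lines might help. It is only a sketch: the function names `count_contigs` and `list_empty_assemblies` are made up for illustration, and the glob in the usage comment is an assumption about where the `*.contigs.fa` files sit inside the work directory.

```shell
# Count FASTA records (header lines starting with '>') in one file.
count_contigs() {
  # grep -c prints 0 but exits non-zero when nothing matches;
  # '|| true' keeps the function usable in scripts run with 'set -e'.
  grep -c '^>' "$1" || true
}

# Print the path of every given FASTA file that contains no records.
list_empty_assemblies() {
  for fasta in "$@"; do
    if [ "$(count_contigs "$fasta")" -eq 0 ]; then
      echo "$fasta"
    fi
  done
}

# Example call (hypothetical glob, adjust to your own layout):
# list_empty_assemblies output/work/*/*/*.contigs.fa
```

Any sample listed by this check produced no contigs at all, so the read depth or quality of that library would be the first thing to look at.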

As an immediate fix, you can try appending -c quast.config to your command, where quast.config contains:

process {
  withName: quast {
    errorStrategy = { task.exitStatus in [143,137] ? 'retry' : 'ignore' }
  }
}

This should make the pipeline ignore the issue and move on. Edit: don't forget to add -resume as well, otherwise you restart from scratch!
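Put together, the rerun might look roughly like this. It is a sketch based on the command from the original post, assuming quast.config has been saved in the launch directory. The -name option is left out here because Nextflow may refuse a run name that has already been used, and the work directory is kept the same so that -resume can find the cached tasks.

```shell
# Rerun with the custom config and reuse cached results from the earlier run.
nextflow run nf-core/mag \
  --reads 'input/*.R{1,2}.fastq.gz' \
  -profile shh \
  --kraken2_db 'ftp://ftp.ccb.jhu.edu/pub/data/kraken2_dbs/minikraken2_v2_8GB_201904_UPDATE.tgz' \
  --outdir 'output' \
  -w 'output/work' \
  -c quast.config \
  -resume
```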

d4straub commented 3 years ago

Revisiting this issue: the assembly was not good enough and produced no contigs at all. Most likely the underlying data is insufficient. Nevertheless, it would be advantageous if the pipeline would then skip quast, raise a warning, and skip all downstream steps for that sample/assembly.

d4straub commented 1 year ago

I am not sure whether this is still an issue. Because the pipeline is meant for assembly, I think it's fine for the pipeline to fail when no contigs can be assembled. If this continues to be a concern, please re-open.