nf-core / mag

Assembly and binning of metagenomes
https://nf-co.re/mag
MIT License

`CAT_SUMMARY` fails due to input filename collision #474

Closed tillenglert closed 11 months ago

tillenglert commented 1 year ago

Description of the bug

Hi there,

I'm running nf-core/mag on a metagenomics dataset and wanted to include a taxonomic classification via CAT. The main CAT module runs, but the pipeline then fails with the following error:

nf-core/mag execution completed unsuccessfully!

The exit status of the task that caused the workflow execution to fail was: null.

The full error message was:

Error executing process > 'NFCORE_MAG:MAG:CAT_SUMMARY'

Caused by:
  Process `NFCORE_MAG:MAG:CAT_SUMMARY` input file name collision -- There are multiple input files for each of the following file names: SPAdes-DASTool-group-0.ORF2LCA.names.txt.gz, SPAdes-DASTool-group-0.bin2classification.names.txt.gz, MEGAHIT-DASTool-group-0.bin2classification.names.txt.gz, MEGAHIT-DASTool-group-0.ORF2LCA.names.txt.gz

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

As the names suggest, I'm using SPAdes and MEGAHIT for assembly, MaxBin2 and MetaBAT2 for binning, and DAS Tool for bin refinement (which seems to be the problem here), together with --postbinning_input "both".

I tracked down the problem to the naming convention within the CAT module:

https://github.com/nf-core/mag/blob/66cf53aff834d2a254b78b94fc54cd656b8b7b57/modules/local/cat.nf#L31

This convention does not account for how the unbinned DAS Tool file is named.
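
To illustrate the failure mode: Nextflow stages every input file into the task's work directory under its base name, so two inputs with the same base name coming from different upstream tasks collide. Below is a minimal Python sketch of that check, not the pipeline's actual code. The `find_collisions` helper and the `work/...` paths are hypothetical; only the file names are taken from the error message above.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def find_collisions(paths):
    """Return {basename: [paths...]} for basenames shared by more than one path."""
    by_name = defaultdict(list)
    for p in paths:
        # Only the final path component matters when files are staged
        # side by side in a single directory.
        by_name[PurePosixPath(p).name].append(p)
    return {name: ps for name, ps in by_name.items() if len(ps) > 1}


# Hypothetical work-dir paths (assumption); the basenames come from the error message.
staged = [
    "work/aa/111/SPAdes-DASTool-group-0.ORF2LCA.names.txt.gz",
    "work/bb/222/SPAdes-DASTool-group-0.ORF2LCA.names.txt.gz",
    "work/cc/333/MEGAHIT-DASTool-group-0.bin2classification.names.txt.gz",
]

collisions = find_collisions(staged)
for name, sources in collisions.items():
    print(f"{name}: {len(sources)} inputs")
```

Any output name that does not encode everything distinguishing two upstream tasks (here, presumably binned versus unbinned DAS Tool output) will trip this check once the files are collected into one process.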

If you need any more information/files I can of course provide them.

Command used and terminal output

nextflow run nf-core/mag -r 2.3.0 -profile cfc --input "../samplesheet.csv" --host_genome "mm10" --save_hostremoved_reads --cat_db ../cat_db/CAT_prepare_20210107.tar.gz --coassemble_group --refine_bins_dastool --postbinning_input "both" --busco_auto_lineage_prok --save_busco_reference --busco_download_path "../busco-data.ezlab.org/v5/data" --skip_concoct --skip_prokka --outdir "results_with_cat" -c "../QMCOK_mag.config" --email till.englert@qbic.uni-tuebingen.de -resume

Relevant files

No response

System information

Nextflow version: 23.04.1 build 5866
Hardware: HPC Cluster
Executor: slurm
Container engine: Singularity
nf-core/mag: 2.3.0

jfy133 commented 1 year ago

A fix for this was incoming in https://github.com/nf-core/mag/pull/433, but the PR was closed (@maxibor?)

tillenglert commented 1 year ago

Thanks @jfy133, I posted what I tracked down/found out to the PR. 👍

jfy133 commented 1 year ago

To summarise:

jfy133 commented 11 months ago

This should in principle be fixed by #489, which is now on the dev branch as a workaround until we start replacing modules with official nf-core ones!

Will wait a week or so to see if we can get in a few more bug fixes then will release this.