Closed FranciscoDA closed 7 months ago
Hi,
It seems to me that this is a problem with the assembly/binning.
All bacterial user genomes have been filtered out.
This seems to say that all of your genomes/bins were removed during QC filtering, and consequently no results were written.
That seems to be backed up by the frequently failing BUSCO QC: GTDB-Tk requires relatively good bins, and such bins are selected based on BUSCO metrics. I'd guess that none of your bins qualify for GTDB-Tk (see also https://nf-co.re/mag/2.3.0/output#gtdb-tk, which says nf-core/mag uses GTDB-Tk to classify binned genomes which satisfy certain quality criteria, i.e. completeness and contamination assessed with the BUSCO analysis).
That should be obvious from the file GenomeBinning/QC/busco_summary.tsv
in the results folder.
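To check this quickly, something along these lines could list the bins that would survive a completeness/contamination filter. It is only a rough sketch: the column names in the example, the example rows, and the default thresholds (50 and 10) are assumptions for illustration, not necessarily what busco_summary.tsv or the pipeline actually use, so adjust them to match your file and parameters.

```python
import csv
import io


def bins_passing_qc(tsv_text, name_col, completeness_col, contamination_col,
                    min_completeness=50.0, max_contamination=10.0):
    """Return the names of bins whose metrics satisfy both thresholds.

    The threshold defaults are illustrative assumptions, not pipeline defaults.
    """
    passing = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        try:
            completeness = float(row[completeness_col])
            contamination = float(row[contamination_col])
        except (TypeError, ValueError):
            # Empty or non-numeric values (e.g. BUSCO found no genes) count as failing.
            continue
        if completeness >= min_completeness and contamination <= max_contamination:
            passing.append(row[name_col])
    return passing


# Made-up example data with hypothetical column names.
example = (
    "bin\tcompleteness\tcontamination\n"
    "bin.1.fa\t92.5\t1.2\n"
    "bin.2.fa\t12.0\t0.0\n"
    "bin.3.fa\t\t\n"
)
print(bins_passing_qc(example, "bin", "completeness", "contamination"))
```

For a real run, read the file contents with `open(path).read()` and pass the actual header names from your summary file. An empty list would be consistent with the "all genomes filtered out" message.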
I believe there is a filter for this now: only assemblies of sufficient quality will reach classify_wf, and we print a warning if none pass the completeness filters.
Description of the bug
Hello,
I've encountered a problem in the process defined at
NFCORE_MAG:MAG:GTDBTK:GTDBTK_CLASSIFY
while processing some single-end ONT whole-genome shotgun long reads from a gut microbiome. I suspect there was an error in the GTDB-Tk classification where the expected output files were not written (no .classify.tree file in the process work dir). However, the log file from GTDB-Tk did not show any errors, so I'm not sure how to move forward with this issue.
It's also worth noting that the BUSCO analysis failed on 225 out of the 330 clusters because no genes could be found.
When running the workflow with the
--skip_binqc
argument, it omits those processes and completes successfully. Any pointers on how to debug this issue would be greatly appreciated.
Thanks.
EDIT: The md5sum from the downloaded GTDB-Tk database matches the hash listed at the uq.edu.au site:
Command used and terminal output
Relevant files
BUSCO process log files from each of the 330 nextflow work dirs: BUSCO.zip
GTDBTK_CLASSIFY process log files from the nextflow work dir: GTDBTK_CLASSIFY.command.log gtdbtk.log gtdbtk.warnings.log
Custom config file used for this run: many-cpu.config.txt
System information