Ecogenomics / GTDBTk

GTDB-Tk: a toolkit for assigning objective taxonomic classifications to bacterial and archaeal genomes.
https://ecogenomics.github.io/GTDBTk/
GNU General Public License v3.0

Bacterial GTDB-Tk classification file does not exist error when using gtdb_to_ncbi_majority_vote.py #595

Open schmittel opened 3 months ago

schmittel commented 3 months ago

Hi,

I'm running into the issue described in the title. Here is the output I get from gtdb_to_ncbi_majority_vote.py, including the parameters I used:

gtdb_to_ncbi_majority_vote.py v0.2.0: Translate GTDB to NCBI classification via majority vote.
  by Donovan Parks (donovan.parks@gmail.com)

[2024-06-22 07:40:06] INFO: GTDB to NCBI majority vote v0.2.0
[2024-06-22 07:40:06] INFO: gtdb_to_ncbi_majority_vote.py --gtdbtk_output_dir /gtdbtk/metagenomes_markers/classify --output_file /ncbi_conversion/gtdb_to_ncbi.txt --bac120_metadata_file /ncbi_conversion/bac120_metadata.tsv
[2024-06-22 07:40:06] INFO: Parsing GTDB-Tk classifications:
[2024-06-22 07:40:06] WARNING: Bacterial GTDB-Tk classification file does not exist.
[2024-06-22 07:40:06] WARNING: Assuming there are no bacterial genomes to reclassify.
[2024-06-22 07:40:06] INFO:  - identified 0 archaeal classifications
[2024-06-22 07:40:06] INFO:  - identified 0 bacterial classifications
[2024-06-22 07:40:06] INFO: Identifying GTDB-Tk classification trees:
[2024-06-22 07:40:06] INFO:  - identified 0 bacterial tree(s)
[2024-06-22 07:40:06] INFO: Parsing NCBI taxonomy from GTDB metadata files:
[2024-06-22 07:40:06] INFO: Processing bacterial metadata file.
[2024-06-22 07:40:22] INFO:  - read NCBI taxonomy for 584,382 genomes
[2024-06-22 07:40:22] INFO:  - identified 107,235 GTDB species clusters
[2024-06-22 07:40:22] INFO:  - identified genomes in 4,896 GTDB families
[2024-06-22 07:40:22] INFO: Determining NCBI majority vote classifications for GTDB species clusters.
[2024-06-22 07:40:24] INFO:  - identified 107,235 GTDB species clusters with an NCBI classification
[2024-06-22 07:40:24] INFO: Determining NCBI majority vote classification for each genome:
[2024-06-22 07:40:24] INFO: Results written to: /ebio/abt3_scratch/jmarsh/tract_score3/ncbi_conversion/gtdb_to_ncbi.txt
[2024-06-22 07:40:25] INFO: Done.

Here are the contents of my classify folder:

/gtdbtk/metagenomes_markers/classify/gtdbtk.ar53.classify.tree
/gtdbtk/metagenomes_markers/classify/gtdbtk.ar53.summary.tsv
/gtdbtk/metagenomes_markers/classify/gtdbtk.bac120.classify.tree.1.tree
/gtdbtk/metagenomes_markers/classify/gtdbtk.bac120.classify.tree.2.tree
/gtdbtk/metagenomes_markers/classify/gtdbtk.bac120.classify.tree.3.tree
/gtdbtk/metagenomes_markers/classify/gtdbtk.bac120.classify.tree.4.tree
/gtdbtk/metagenomes_markers/classify/gtdbtk.bac120.classify.tree.5.tree
/gtdbtk/metagenomes_markers/classify/gtdbtk.bac120.classify.tree.6.tree
/gtdbtk/metagenomes_markers/classify/gtdbtk.bac120.classify.tree.7.tree
/gtdbtk/metagenomes_markers/classify/gtdbtk.bac120.classify.tree.8.tree
/gtdbtk/metagenomes_markers/classify/gtdbtk.bac120.summary.tsv
/gtdbtk/metagenomes_markers/classify/gtdbtk.bac120.tree.mapping.tsv
/gtdbtk/metagenomes_markers/classify/gtdbtk.backbone.bac120.classify.tree

I can't figure out why this isn't working. I get the same error with both version 0.2.0 and version 0.2.1 of gtdb_to_ncbi_majority_vote.py. Many thanks for your help.
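
One thing that may be worth checking is where the summary files sit relative to the directory passed as --gtdbtk_output_dir. The snippet below is only a minimal diagnostic sketch: it assumes the default "gtdbtk" output prefix, and it assumes the script might look for the summary files either directly in the given directory or in a classify/ subdirectory of it. Neither assumption is confirmed in this thread, and find_summary_files is a hypothetical helper, not part of GTDB-Tk.

```python
from pathlib import Path


def find_summary_files(output_dir, prefix="gtdbtk"):
    """Report where GTDB-Tk summary files sit relative to output_dir.

    Hypothetical helper for debugging only. The two candidate locations
    are assumptions, not the documented behaviour of
    gtdb_to_ncbi_majority_vote.py.
    """
    root = Path(output_dir)
    report = {}
    for domain in ("bac120", "ar53"):
        name = f"{prefix}.{domain}.summary.tsv"
        # Candidate 1: the file is directly inside output_dir.
        # Candidate 2: the file is inside output_dir/classify/.
        candidates = [root / name, root / "classify" / name]
        report[domain] = [str(p) for p in candidates if p.is_file()]
    return report
```

Calling find_summary_files on both the classify folder and its parent shows which of the two directories contains the summary files at the location being tested. If the files are only found via the classify/ candidate when the parent directory is given, passing the parent directory to --gtdbtk_output_dir may be worth a try.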

donovan-h-parks commented 3 months ago

Hi,

Sorry for the slow reply. Were you able to resolve this issue?

Cheers, Donovan