donovan-h-parks / PhyloRank

Assign taxonomic ranks based on evolutionary divergence.
GNU General Public License v3.0

IndexError while running 'decorate' on phylorank #5

Closed nohayoussef closed 6 years ago

nohayoussef commented 6 years ago

Hello,

While running phylorank to first decorate a tree and then calculate RED, using the commands below:

phylorank decorate /scratch/nohayous/ForAnvio/tree_all.nwk /scratch/nohayous/ForAnvio/taxonomy_PR.txt /scratch/nohayous/ForAnvio/tree_all_dec.nwk --skip_rd_refine
phylorank outliers /scratch/nohayous/ForAnvio/tree_all_dec.nwk /scratch/nohayous/ForAnvio/taxonomy_PR.txt /scratch/nohayous/ForAnvio/Phylorank

I got the following error:

/opt/anaconda/4.0.0/lib/python2.7/site-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
  warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')
[2018-07-10 15:39:21] INFO: PhyloRank v0.0.37
[2018-07-10 15:39:21] INFO: phylorank decorate /scratch/nohayous/ForAnvio/tree_all.nwk /scratch/nohayous/ForAnvio/taxonomy_PR.txt /scratch/nohayous/ForAnvio/tree_all_dec.nwk --skip_rd_refine
[2018-07-10 15:39:21] INFO: Reading tree.
[2018-07-10 15:39:21] INFO: Removing any previous internal node labels.
[2018-07-10 15:39:21] INFO: Reading taxonomy.
[2018-07-10 15:39:21] INFO: Calculating F-measure statistic for each taxa.

Unexpected error: <type 'exceptions.IndexError'>
Traceback (most recent call last):
  File "/opt/anaconda/4.0.0/bin/phylorank", line 350, in <module>
    parser.parse_options(args)
  File "/opt/anaconda/4.0.0/lib/python2.7/site-packages/phylorank/main.py", line 453, in parse_options
    self.decorate(options)
  File "/opt/anaconda/4.0.0/lib/python2.7/site-packages/phylorank/main.py", line 226, in decorate
    options.output_tree)
  File "/opt/anaconda/4.0.0/lib/python2.7/site-packages/phylorank/decorate.py", line 537, in run
    fmeasure_for_taxa = self._fmeasure(tree, taxonomy)
  File "/opt/anaconda/4.0.0/lib/python2.7/site-packages/phylorank/decorate.py", line 70, in _fmeasure
    extent_taxa_with_label[i] = Taxonomy().extant_taxa_for_rank(rank, taxonomy)
  File "/opt/anaconda/4.0.0/lib/python2.7/site-packages/biolib/taxonomy.py", line 661, in extant_taxa_for_rank
    if taxa[rank_index] != Taxonomy.rank_prefixes[rank_index]:
IndexError: list index out of range
[2018-07-10 15:39:22] INFO: PhyloRank v0.0.37
[2018-07-10 15:39:22] INFO: phylorank outliers /scratch/nohayous/ForAnvio/tree_all_dec.nwk /scratch/nohayous/ForAnvio/taxonomy_PR.txt /scratch/nohayous/ForAnvio/Phylorank
[2018-07-10 15:39:22] ERROR: Input file does not exists: /scratch/nohayous/ForAnvio/tree_all_dec.nwk

Controlled exit resulting from an unrecoverable error or warning.

I am not sure what's going on. Could you possibly help with that?

Thanks Noha

donovan-h-parks commented 6 years ago

Hello Noha. It looks like the taxonomy strings in your Newick tree do not form a 7-rank taxonomy. If you are looking to classify genomes according to the methodology used by the GTDB, I recommend using our companion tool GTDB-Tk instead of PhyloRank.
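For anyone hitting the same IndexError: the failing check in the traceback indexes each taxonomy string by rank (`taxa[rank_index] != Taxonomy.rank_prefixes[rank_index]`), so a string with fewer ranks than expected runs off the end of the list. Below is a rough sketch of a standalone pre-check, not part of PhyloRank or biolib. It assumes the taxonomy file has one genome per line, with a genome ID, a tab, and a semicolon-delimited taxonomy string, and that the seven ranks use the prefixes d__, p__, c__, o__, f__, g__, s__. The function names are made up for illustration.

```python
# Standalone sanity check for a 7-rank taxonomy file (illustrative only;
# this is not code from PhyloRank or biolib).
# Assumed prefixes for the seven ranks, domain through species.
RANK_PREFIXES = ("d__", "p__", "c__", "o__", "f__", "g__", "s__")


def check_taxonomy_line(line):
    """Return a list of problems found in one taxonomy line (empty if none)."""
    # Assumed layout: <genome_id><TAB><rank1;rank2;...;rank7>
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 2:
        return ["expected 2 tab-separated fields, found %d" % len(fields)]

    taxa = [t.strip() for t in fields[1].split(";")]
    if len(taxa) != len(RANK_PREFIXES):
        return ["expected %d ranks, found %d" % (len(RANK_PREFIXES), len(taxa))]

    # Every rank must carry its expected prefix, in order.
    problems = []
    for taxon, prefix in zip(taxa, RANK_PREFIXES):
        if not taxon.startswith(prefix):
            problems.append("taxon %r does not start with %r" % (taxon, prefix))
    return problems


def check_taxonomy_file(path):
    """Return {line_number: [problems]} for every malformed line in the file."""
    bad = {}
    with open(path) as handle:
        for number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # ignore blank lines
            problems = check_taxonomy_line(line)
            if problems:
                bad[number] = problems
    return bad
```

Calling `check_taxonomy_file` on the taxonomy file before running `phylorank decorate` should list any lines that do not have exactly seven prefixed ranks. An empty result only means the file passes these particular checks.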

nohayoussef commented 6 years ago

Thanks Donovan. Will do. Noha
