donovan-h-parks / PhyloRank

Assign taxonomic ranks based on evolutionary divergence.
GNU General Public License v3.0

Met an IndexError #10

Closed fujch7 closed 4 years ago

fujch7 commented 4 years ago

When I calculate RED, I get an IndexError: list index out of range

Traceback (most recent call last):
  File "/usr/bin/phylorank", line 360, in <module>
    parser.parse_options(args)
  File "/root/miniconda3/lib/python3.7/site-packages/phylorank/main.py", line 484, in parse_options
    self.outliers(options)
  File "/root/miniconda3/lib/python3.7/site-packages/phylorank/main.py", line 91, in outliers
    options.verbose_table)
  File "/root/miniconda3/lib/python3.7/site-packages/phylorank/outliers.py", line 885, in run
    taxonomy = Taxonomy().read(taxonomy_file)
  File "/root/miniconda3/lib/python3.7/site-packages/biolib/taxonomy.py", line 810, in read
    tax_str = line_split[1].rstrip()
IndexError: list index out of range

Jiacheng

donovan-h-parks commented 4 years ago

This looks like an issue with your taxonomy file. It has to contain exactly 7 ranks (d__, p__, ..., s__) for every genome.
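For anyone hitting the same traceback, a quick pre-check of the taxonomy file can help locate the offending line. The sketch below is not part of PhyloRank or biolib. It assumes a GTDB-style layout: a genome ID, a tab, then a semicolon-separated taxonomy string with the seven prefixes d__, p__, c__, o__, f__, g__, s__. The tab delimiter, the function name `check_taxonomy_lines`, and the example genome IDs are assumptions for illustration. The failing statement in the traceback, `line_split[1]`, is consistent with a line that has no second field.

```python
from typing import Iterable, List, Tuple

# Assumed GTDB-style rank prefixes, in order from domain to species.
RANK_PREFIXES = ("d__", "p__", "c__", "o__", "f__", "g__", "s__")


def check_taxonomy_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line number, problem) pairs for lines that look malformed."""
    problems = []
    for line_no, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line.strip():
            problems.append((line_no, "blank line"))
            continue

        # Assumption: genome ID and taxonomy string are separated by a tab.
        fields = line.split("\t")
        if len(fields) < 2:
            problems.append((line_no, "no tab-separated taxonomy column"))
            continue

        ranks = [rank.strip() for rank in fields[1].split(";")]
        if len(ranks) != len(RANK_PREFIXES):
            problems.append((line_no, f"expected 7 ranks, found {len(ranks)}"))
            continue

        for rank, prefix in zip(ranks, RANK_PREFIXES):
            if not rank.startswith(prefix):
                problems.append(
                    (line_no, f"expected prefix {prefix!r}, found {rank!r}")
                )
                break
    return problems


# Hypothetical example lines; G1..G3 are made-up genome IDs.
example = [
    "G1\td__Bacteria;p__Acidobacteria;c__;o__;f__;g__;s__\n",
    "G2 d__Bacteria;p__Planctomycetes;c__;o__;f__;g__;s__\n",  # space, not tab
    "G3\td__Bacteria;p__Planctomycetes\n",                     # only 2 ranks
]
problems = check_taxonomy_lines(example)
for line_no, message in problems:
    print(f"line {line_no}: {message}")
```

To check a real file, pass an open file handle instead of the list, for example `check_taxonomy_lines(open(path))`.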

fujch7 commented 4 years ago

[2020-04-09 02:46:11] INFO: PhyloRank v0.1.0
[2020-04-09 02:46:11] INFO: phylorank decorate /mnt/hgfs/Share/0408_38.tree.nwk /mnt/hgfs/Share/taxonamy_file.tsv /mnt/hgfs/Share/
[2020-04-09 02:46:11] INFO: Reading tree.
[2020-04-09 02:46:11] INFO: Removing any previous internal node labels.
[2020-04-09 02:46:11] INFO: Reading taxonomy.
[2020-04-09 02:46:11] INFO: Calculating F-measure statistic for each taxa.
[2020-04-09 02:46:11] INFO: Calculating taxa within each lineage.
[2020-04-09 02:46:11] INFO: Processing 1 taxa at Domain rank.
[2020-04-09 02:46:11] INFO: Processing 2 taxa at Phylum rank.
[2020-04-09 02:46:11] INFO: Processing 4 taxa at Class rank.
[2020-04-09 02:46:11] INFO: Processing 4 taxa at Order rank.
[2020-04-09 02:46:11] INFO: Processing 6 taxa at Family rank.
[2020-04-09 02:46:11] INFO: Processing 10 taxa at Genus rank.
[2020-04-09 02:46:11] INFO: Processing 4 taxa at Species rank.
[2020-04-09 02:46:11] INFO: Placing labels with unambiguous position in tree.
[2020-04-09 02:46:11] INFO: Establishing median relative divergence for taxonomic ranks.
[2020-04-09 02:46:11] INFO: Identified 1 phyla.
[2020-04-09 02:46:11] INFO: Using 1 phyla as rootings for inferring distributions.
[2020-04-09 02:46:11] ERROR: Rescaling requires at least 2 valid phyla.

Controlled exit resulting from an unrecoverable error or warning.

Thank you for your patience. I have run into another problem. My genomes belong to the phylum Acidobacteria, and the strain Gimesia_maris_DSM8797, used as an outgroup, belongs to the phylum Planctomycetes. When I run phylorank decorate, I still get the error 'Rescaling requires at least 2 valid phyla'.

donovan-h-parks commented 4 years ago

Hi. As discussed in our other thread, PhyloRank is currently designed to work with trees that span an entire domain. The expected RED for a taxonomic rank is taken as the median RED value across different plausible rootings of the tree (namely, above well-established phyla). See the GTDB manuscript for more details. If you do not wish to follow this methodology, you can get PhyloRank to calculate RED values with a fixed root using the --fixed_root option. You can then compare RED values across taxa under this fixed rooting, which is sensible so long as you trust the rooting.
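To make the fixed-root case concrete, here is a minimal sketch of computing RED on a tree with a single, trusted rooting. It is not PhyloRank's implementation. It assumes the definition of RED as I understand it from the GTDB manuscript: the root has RED 0, every leaf has RED 1, and an internal node n has RED = p + (d / u) * (1 - p), where p is the RED of its parent, d is the branch length from n to its parent, and u is the mean distance from the parent to the leaves below n. The `Node` class, the function names, and the toy tree are hypothetical, and node names are assumed to be unique.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    name: str
    length: float = 0.0  # branch length to the parent (unused for the root)
    children: List["Node"] = field(default_factory=list)


def leaf_distances(node: Node) -> List[float]:
    """Distances from `node` to each leaf below it (0.0 if it is a leaf)."""
    if not node.children:
        return [0.0]
    distances = []
    for child in node.children:
        distances.extend(child.length + d for d in leaf_distances(child))
    return distances


def relative_divergence(root: Node) -> Dict[str, float]:
    """RED per node name for a fixed root, under the assumed definition."""
    red = {root.name: 0.0}
    stack = [root]
    while stack:
        parent = stack.pop()
        p = red[parent.name]
        for child in parent.children:
            if not child.children:
                red[child.name] = 1.0  # leaves are fixed at 1
            else:
                below = leaf_distances(child)
                # u: mean distance from the parent to leaves below the child.
                u = child.length + sum(below) / len(below)
                d = child.length
                red[child.name] = p + (d / u) * (1.0 - p) if u > 0 else p
            stack.append(child)
    return red


# Toy tree: ((a1:1,a2:1)A:1,b:2)root
tree = Node("root", children=[
    Node("A", 1.0, [Node("a1", 1.0), Node("a2", 1.0)]),
    Node("b", 2.0),
])
red = relative_divergence(tree)
print(red["A"])  # 0 + (1 / 2) * (1 - 0) = 0.5
```

Under the median approach described above, the same calculation would be repeated for each plausible rooting and the median taken per taxon, which is why a tree with only one usable phylum cannot be rescaled that way.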