gjospin / PhyloSift

Phylogenetic and taxonomic analysis for genomes and metagenomes

Problems understanding summary results #484

Open itiago opened 7 years ago

itiago commented 7 years ago

Hi, I cannot understand the results in my summary files. The phylogenetic classification doesn't seem to follow any hierarchy; the phylogenetic groups appear scattered. Is there an output I can use that gives the percentage of each phylogenetic group? Or at least a way to list the phylogenetic groups by rank (phylum, class, order, family, genus, species)? Please see below the output of taxa_90pct_HPD that I got. Thank you for any help. Best

Taxon_ID	Taxon_Rank	Taxon_Name	Probability_Mass
2323	no rank	UNCLASSIFIED BACTERIA	67.7550808846286
186801	class	CLOSTRIDIA	48.1818926615539
28890	phylum	EURYARCHAEOTA	43.8176742723694
180252	genus	MARDIVIRUS	42.239998171038
2	superkingdom	BACTERIA	38.6765763967502
74152	phylum	ELUSIMICROBIA	36.923412759582
2157	superkingdom	ARCHAEA	29.8294179324969
28221	class	DELTAPROTEOBACTERIA	23.512760549098
10294	genus	SIMPLEXVIRUS	20.224253266672
10319	genus	VARICELLOVIRUS	17.8272728992329
976	phylum	BACTEROIDETES	17.3965667145503
200795	phylum	CHLOROFLEXI	16.5205306714165
2759	superkingdom	EUKARYOTA	15.8109796175687
1224	phylum	PROTEOBACTERIA	14.9408172660658
80840	order	BURKHOLDERIALES	14.6763287712078
104388	species	ANATID HERPESVIRUS 1	14.2991791629061
203682	phylum	PLANCTOMYCETES	14.1617497761839
221216	phylum	PARCUBACTERIA	14.1429672734848
1046984	no rank	UNCLASSIFIED PARCUBACTERIA	14.1429672734848
1046989	species	PARCUBACTERIA BACTERIUM SCGC AAA011 J21	14.1429672734848
67820	no rank	MICROGENOMATES	14.1348618842177
1046980	no rank	UNCLASSIFIED MICROGENOMATES	14.1348618842177
80864	family	COMAMONADACEAE	13.5543004201798
481315	species	LEPORID HERPESVIRUS 4	10.719149311938
119065	no rank	UNCLASSIFIED BURKHOLDERIALES	10.0476656257464
539000	family	CLOSTRIDIALES FAMILY XVII. INCERTAE SEDIS	9.9027432647522
118964	order	DEINOCOCCALES	9.35162699120047
186802	order	CLOSTRIDIALES	9.09298136527913
1047003	no rank	CANDIDATUS MICROGENOMATUS AURICOLA SCGC AAA011 E14	8.468109215189
1073573	species	SAR324 CLUSTER BACTERIUM JCVI SC AAA005	8.33911229617494
995019	family	SUTTERELLACEAE	8.2543571913185
35246	species	CERCOPITHECINE HERPESVIRUS 9	8.00920190079276
1047061	species	MARINIMICROBIA BACTERIUM JGI 0000059 L03	7.94884437638614
301297	class	DEHALOCOCCOIDIA	7.5724624985584
180255	genus	ILTOVIRUS	7.28077566279544
10386	species	GALLID HERPESVIRUS 1	7.28077566279544
117743	class	FLAVOBACTERIIA	7.25986974790918
189775	class	THERMOMICROBIA	7.1792062248773
213468	family	SYNTROPHACEAE	7.03972811623339
1047055	species	MARINIMICROBIA BACTERIUM JGI 0000039 D08	6.9827342507313
699262	species	MICROGENOMATES BACTERIUM SCGC AAA011 L6	6.8592697503376
255727	suborder	CORIOBACTERINEAE	6.82408690691444
84999	order	CORIOBACTERIALES	6.82408690691444
84107	family	CORIOBACTERIACEAE	6.82408690691444
84998	subclass	CORIOBACTERIDAE	6.82408690691444
68335	genus	COPROTHERMOBACTER	6.78007108867223
28216	class	BETAPROTEOBACTERIA	6.67683809664979
427925	genus	DETHIOBACTER	6.20600555070151
555088	no rank	DETHIOBACTER ALKALIPHILUS AHT 1	6.20600555070151
427926	species	DETHIOBACTER ALKALIPHILUS	6.20600555070151
10240	family	POXVIRIDAE	5.85215325763436
10338	no rank	HUMAN HERPESVIRUS 3 STRAIN DUMAS	5.74938294322898
10335	species	HUMAN HERPESVIRUS 3	5.74938294322898
1117	phylum	CYANOBACTERIA	5.67965709984567
2037	order	ACTINOMYCETALES	5.67836848598464
117942	family	DESULFURELLACEAE	5.5551246031969
213113	order	DESULFURELLALES	5.5551246031969
68336	superphylum	BACTEROIDETES/CHLOROBI GROUP	5.44635723868172
77133	species	UNCULTURED BACTERIUM	5.273749733226
189384	genus	CANDIDATUS TREMBLAYA	4.8009030658576
1239	phylum	FIRMICUTES	4.52940696115807
62680	phylum	MARINIMICROBIA	4.4903785383545
186814	family	THERMOANAEROBACTERACEAE	4.48966024436195
1046981	no rank	UNCLASSIFIED CANDIDATE DIVISION NKB19	4.4676703441163
142187	no rank	CANDIDATE DIVISION NKB19	4.4676703441163
1046982	species	HYDROGENEDENTES BACTERIUM JGI 0000039 J10	4.4676703441163
28067	genus	RUBRIVIVAX	4.3801689605918
160619	species	KRYPTOPERIDINIUM FOLIACEUM	4.3215639104224
160618	genus	KRYPTOPERIDINIUM	4.3215639104224
28262	species	THERMODESULFOVIBRIO YELLOWSTONII	4.23788520910848
28261	genus	THERMODESULFOVIBRIO	4.23788520910848
289376	no rank	THERMODESULFOVIBRIO YELLOWSTONII DSM 11347	4.23788520910848
224471	no rank	BURKHOLDERIALES GENERA INCERTAE SEDIS	4.0959782666846
179	genus	LEPTOSPIRILLUM	4.071959150315
400756	species	DURINSKIA BALTICA	4.01277392951055
400754	genus	DURINSKIA	4.01277392951055
1269028	no rank	ACANTHAMOEBA POLYPHAGA MOUMOUVIRUS	3.72460182397907
112	order	PLANCTOMYCETALES	3.56787827617837
126	family	PLANCTOMYCETACEAE	3.56787827617837
94695	order	METHANOSARCINALES	3.48363248555097
2258	order	THERMOCOCCALES	3.33756684761244
2259	family	THERMOCOCCACEAE	3.33756684761244
183968	class	THERMOCOCCI	3.33756684761244
1385	order	BACILLALES	3.25644370474083
212035	species	ACANTHAMOEBA POLYPHAGA MIMIVIRUS	3.25394398860425
315393	genus	MIMIVIRUS	3.25394398860425
265317	phylum	PORIBACTERIA	3.19198513074598
700750	species	CANDIDATUS PORIBACTERIA SP. WGA A3	3.19198513074598
10239	superkingdom	VIRUSES	3.1289035093059
68295	order	THERMOANAEROBACTERALES	3.1158362749958
35237	no rank	DSDNA VIRUSES NO RNA STAGE	3.0905622273651
420506	species	PSEUDOMONAS SP. TAP 9	3.084096749673
131567	no rank	CELLULAR ORGANISMS	3.06928714788578
1247379	no rank	MOUMOUVIRUS GOULETTE	3.02345894914011
1236	class	GAMMAPROTEOBACTERIA	2.993300262158
795665	species	HYDROGENOPHAGA SP. PBC	2.9760634553058
47420	genus	HYDROGENOPHAGA	2.9760634553058
29547	class	EPSILONPROTEOBACTERIA	2.94077337959297
412449	species	LEPTOSPIRILLUM FERRODIAZOTROPHUM	2.8852383909734
49546	family	FLAVOBACTERIACEAE	2.857035792411
40117	phylum	NITROSPIRAE	2.73045211348307
189778	order	NITROSPIRALES	2.73045211348307
189779	family	NITROSPIRACEAE	2.73045211348307
203693	class	NITROSPIRA	2.73045211348307
651137	phylum	THAUMARCHAEOTA	2.720357514511
2191	order	METHANOMICROBIALES	2.71472202727902
2158	order	METHANOBACTERIALES	2.6832911993001
183925	class	METHANOBACTERIA	2.6832911993001
57723	phylum	ACIDOBACTERIA	2.6033722627672
28211	class	ALPHAPROTEOBACTERIA	2.56217814319635
200643	class	BACTEROIDIA	2.55949863836525
171549	order	BACTEROIDALES	2.55949863836525
69541	order	DESULFUROMONADALES	2.53151307004127
543371	family	THERMOANAEROBACTERALES FAMILY III. INCERTAE SEDIS	2.52594094975175
29	order	MYXOCOCCALES	2.49250615331
985780	no rank	UNCLASSIFIED MIMIVIRIDAE	2.43786244480941
67818	phylum	ATRIBACTERIA	2.37674806154577
1047049	no rank	UNCLASSIFIED ATRIBACTERIA	2.37674806154577
227387	family	THERMODESULFOBIACEAE	2.20485189984919
589342	order	PYRENOMONADALES	2.16901587900483
203683	class	PLANCTOMYCETIA	2.1045001835608
1047007	species	OMNITROPHICA BACTERIUM SCGC AAA257 O07	2.04652792031222
1047005	no rank	UNCLASSIFIED OMNITROPHICA	2.04652792031222
67812	phylum	OMNITROPHICA	2.04652792031222
1131291	species	GALLIONELLA SP. SCGC AAA018 N21	2.04507257424681
538999	no rank	CLOSTRIDIALES INCERTAE SEDIS	2.01388705743812
985782	species	MOUMOUVIRUS	2.00930380869041
1150	order	OSCILLATORIALES	2
72294	family	CAMPYLOBACTERACEAE	2
662	genus	VIBRIO	1.9999999999997
38145	species	UNCULTURED GRAM POSITIVE BACTERIUM	1.999018646003
658858	no rank	GIARDIA LAMBLIA P15	1.979614208809
145522	species	NANNOCHLOROPSIS OCEANICA	1.970341288609
10379	genus	RHADINOVIRUS	1.9416853757266
10345	species	SUID HERPESVIRUS 1	1.9272726383799
641853	class	ELUSIMICROBIA	1.92598437604264
423604	genus	ELUSIMICROBIUM	1.92598437604264
641854	order	ELUSIMICROBIALES	1.92598437604264
423605	species	ELUSIMICROBIUM MINUTUM	1.92598437604264
641876	family	ELUSIMICROBIACEAE	1.92598437604264
445932	no rank	ELUSIMICROBIUM MINUTUM PEI191	1.92598437604264
10257	genus	PARAPOXVIRUS	1.9221561426076
795748	order	IGNAVIBACTERIALES	1.9155713339131
1134404	phylum	IGNAVIBACTERIAE	1.9155713339131
795747	class	IGNAVIBACTERIA	1.9155713339131
51290	superphylum	CHLAMYDIAE/VERRUCOMICROBIA GROUP	1.8769598423322
28033	genus	SULFOBACILLUS	1.8687664451935
135622	order	ALTEROMONADALES	1.8615611327795
224027	family	HYDROGENOTHERMACEAE	1.834863113216
637379	no rank	THALASSIOSIRA OCEANICA CCMP1005	1.8132586555482
159749	species	THALASSIOSIRA OCEANICA	1.8132586555482
1010676	species	CANDIDATUS TREMBLAYA PHENACOLA	1.8103729462412
1266371	no rank	CANDIDATUS TREMBLAYA PHENACOLA PAVE	1.8103729462412
213462	order	SYNTROPHOBACTERALES	1.80928924602559
184922	no rank	GIARDIA LAMBLIA ATCC 50803	1.80877747551745
183967	class	THERMOPLASMATA	1.786798194799
2301	order	THERMOPLASMATALES	1.786798194799
471821	no rank	UNCULTURED TERMITE GROUP 1 BACTERIUM PHYLOTYPE RS D17	1.71368389281153
99260	no rank	ENVIRONMENTAL SAMPLES	1.71368389281153
167965	species	UNCULTURED TERMITE GROUP 1 BACTERIUM	1.71368389281153
2272	family	DESULFUROCOCCACEAE	1.6840380780517
84995	subclass	RUBROBACTERIDAE	1.68010734495564
91061	class	BACILLI	1.63280573752817
224462	suborder	NANNOCYSTINEAE	1.6272915300453
987059	no rank	RUBRIVIVAX BENZOATILYTICUS JA2 = ATCC BAA 35	1.53767500001317
316997	species	RUBRIVIVAX BENZOATILYTICUS	1.53767500001317
1297	phylum	DEINOCOCCUS THERMUS	1.53484525470542
188787	class	DEINOCOCCI	1.53484525470542
6029	phylum	MICROSPORIDIA	1.52388461413431
2237	genus	HALOARCULA	1.4549382304459
2787	species	PORPHYRA PURPUREA	1.44652204579915
2784	genus	PORPHYRA	1.44652204579915
1094566	genus	PYROPIA	1.39698839390905
2788	species	PYROPIA YEZOENSIS	1.39698839390905
942	family	ANAPLASMATACEAE	1.39296213040913
170	family	LEPTOSPIRACEAE	1.392315410263
456828	phylum	CLOACIMONETES	1.38754317024111
1047070	species	CLOACIMONETES BACTERIUM JGI 0000039 G13	1.38754317024111
1047068	no rank	UNCLASSIFIED CLOACIMONETES	1.38754317024111
191767	genus	PLESIOCYSTIS	1.36607931114235
191768	species	PLESIOCYSTIS PACIFICA	1.36607931114235
391625	no rank	PLESIOCYSTIS PACIFICA SIR 1	1.36607931114235
224463	family	NANNOCYSTACEAE	1.36607931114235
1760	class	ACTINOBACTERIA	1.35673083698003
201174	phylum	ACTINOBACTERIA	1.35673083698003
872	genus	DESULFOVIBRIO	1.336824002792
10242	genus	ORTHOPOXVIRUS	1.3215620149395
247490	species	PLANCTOMYCETE KSU 1	1.3021940557083
69476	no rank	UNCLASSIFIED PLANCTOMYCETACEAE	1.3021940557083
330214	species	CANDIDATUS NITROSPIRA DEFLUVII	1.27389721904703
1234	genus	NITROSPIRA	1.27389721904703
755731	species	CLOSTRIDIUM SP. BNL1100	1.2601456810057
2222	genus	METHANOSAETA	1.24147524584447
143067	family	METHANOSAETACEAE	1.24147524584447
10353	species	SAIMIRIINE HERPESVIRUS 1	1.21878041873767
354090	species	UR2 SARCOMA VIRUS	1.214774232221
11884	species	Y73 SARCOMA VIRUS	1.214774232221
598745	no rank	GIARDIA INTESTINALIS ATCC 50581	1.2132565524329
67799	class	THERMODESULFOBACTERIA	1.21092396017029
188710	order	THERMODESULFOBACTERIALES	1.21092396017029
200940	phylum	THERMODESULFOBACTERIA	1.21092396017029
188711	family	THERMODESULFOBACTERIACEAE	1.21092396017029
1118	order	CHROOCOCCALES	1.1658239059307
40544	genus	SUTTERELLA	1.1616687908745
119060	family	BURKHOLDERIACEAE	1.1114885419057
70448	species	OSTREOCOCCUS TAURI	1.106822045131
28889	phylum	CRENARCHAEOTA	1.0995984079726
2806	class	FLORIDEOPHYCEAE	1.09653928955472
913317	species	ATRIBACTERIA BACTERIUM SCGC AAA252 M02	1.09285041937267
1265734	species	RHODOPIRELLULA MAIORICA	1.05493140367965
1265738	no rank	RHODOPIRELLULA MAIORICA SM1	1.05493140367965
33807	no rank	UNCLASSIFIED ALPHAPROTEOBACTERIA MISCELLANEOUS	1.049145183805
589343	family	GEMINIGERACEAE	1.04871424088512
55529	species	GUILLARDIA THETA	1.04871424088512
55528	genus	GUILLARDIA	1.04871424088512
1051663	class	NANOHALOARCHAEA	1.0406524871422
32069	order	AQUIFICALES	1.02453410725023
200783	phylum	AQUIFICAE	1.02453410725023
187857	class	AQUIFICAE	1.02453410725023
1047051	species	ATRIBACTERIA BACTERIUM SCGC AAA255 G05	1.0137548519169
913331	species	SAR324 CLUSTER BACTERIUM SCGC AAA240 J09	1.000010255579
186823	family	ALICYCLOBACILLACEAE	1.0000000000004
89374	family	SAPROSPIRACEAE	1
766	order	RICKETTSIALES	1
1485	genus	CLOSTRIDIUM	1
1169103	species	MOCKFORDIA XANTHOCAECILIAE	1
332054	species	TRICHOPLUSIA NI SINGLE NUCLEOPOLYHEDROVIRUS	1
662758	genus	CANDIDATUS PARVARCHAEUM	1
379	genus	RHIZOBIUM	1
1279	genus	STAPHYLOCOCCUS	1
5794	phylum	APICOMPLEXA	1
913329	species	SAR324 CLUSTER BACTERIUM SCGC AAA001 C10	0.999989744421
379546	genus	ACIDULIPROFUNDUM	0.9997194925934
909929	order	SELENOMONADALES	0.988311817725275
909932	class	NEGATIVICUTES	0.988311817725275
1048260	no rank	LEPTOSPIRILLUM FERRIPHILUM ML 04	0.984775747824265
178606	species	LEPTOSPIRILLUM FERRIPHILUM	0.984775747824265
189385	species	CANDIDATUS TREMBLAYA PRINCEPS	0.98455926915
891398	no rank	CANDIDATUS TREMBLAYA PRINCEPS PCIT	0.98455926915
580370	class	ZETAPROTEOBACTERIA	0.965937643663
89373	family	CYTOPHAGACEAE	0.965303468128
1293498	class	NITROSPINIA	0.94616966965988
407032	family	NITROSPINACEAE	0.94616966965988
35800	genus	NITROSPINA	0.94616966965988
1293499	order	NITROSPINALES	0.94616966965988
1293497	phylum	NITROSPINAE	0.94616966965988
939844	species	MARINIMICROBIA BACTERIUM SCGC AAA003 L8	0.945562869442
1293577	species	CANDIDATE DIVISION SR1 BACTERIUM MGEHA	0.93495366769275
221235	no rank	CANDIDATE DIVISION SR1	0.93495366769275
53433	order	HALANAEROBIALES	0.9311695915812
2263	genus	THERMOCOCCUS	0.930215266329778
180	species	LEPTOSPIRILLUM FERROOXIDANS	0.912449441838557
1162668	no rank	LEPTOSPIRILLUM FERROOXIDANS C2 3	0.912449441838557
655606	species group	LEPTOSPIRILLUM SP. GROUP I	0.912449441838557
378210	genus	METHYLOVERSATILIS	0.90375368264
1095747	no rank	FUSOBACTERIUM NECROPHORUM SUBSP. FUNDULIFORME ATCC 51357	0.899323322414
213481	order	BDELLOVIBRIONALES	0.895564693138
191412	family	CHLOROBIACEAE	0.82390431245765
191411	order	CHLOROBIALES	0.82390431245765
1090	phylum	CHLOROBI	0.82390431245765
191410	class	CHLOROBIA	0.82390431245765
79206	genus	DESULFOSPOROSINUS	0.8161058434098
1332188	species	CANDIDATUS SACCHARIMONAS AALBORGENSIS	0.812767914254
95818	phylum	CANDIDATUS SACCHARIBACTERIA	0.812767914254
1331051	genus	CANDIDATUS SACCHARIMONAS	0.812767914254
51368	no rank	UNCLASSIFIED DSDNA VIRUSES	0.78695671726095
508458	phylum	SYNERGISTETES	0.766508356538
1176514	no rank	BURKHOLDERIA GLUMAE AU6208	0.756447093936
85003	subclass	ACTINOBACTERIDAE	0.750094927835643
662756	genus	CANDIDATUS MICRARCHAEUM	0.7389033472169
425595	no rank	CANDIDATUS MICRARCHAEUM ACIDIPHILUM ARMAN 2	0.7389033472169
662757	species	CANDIDATUS MICRARCHAEUM ACIDIPHILUM	0.7389033472169
1297582	no rank	CANDIDATUS PORTIERA ALEYRODIDARUM TV	0.7384721953327
65047	genus	MITSUARIA	0.7366995882105
557855	species	MITSUARIA SP. H24L5A	0.7366995882105
693272	species	CAFETERIA ROENBERGENSIS VIRUS BV PW1	0.71340237203445
191393	order	DEFERRIBACTERALES	0.706131530240114
200930	phylum	DEFERRIBACTERES	0.706131530240114
68337	class	DEFERRIBACTERES	0.706131530240114
191394	family	DEFERRIBACTERACEAE	0.706131530240114
1144315	species	VARIOVORAX SP. CF313	0.697761069402
44249	genus	PAENIBACILLUS	0.6884149870332
1239881	no rank	CANDIDATUS PORTIERA ALEYRODIDARUM BT QVLC	0.686899682077
506	family	ALCALIGENACEAE	0.667517561103
208447	genus	OCEANITHERMUS	0.666666666666667
187137	species	OCEANITHERMUS PROFUNDUS	0.666666666666667
670487	no rank	OCEANITHERMUS PROFUNDUS DSM 14977	0.666666666666667
203486	class	DICTYOGLOMIA	0.660056056043753
203487	order	DICTYOGLOMALES	0.660056056043753
13	genus	DICTYOGLOMUS	0.660056056043753
203488	family	DICTYOGLOMACEAE	0.660056056043753
68297	phylum	DICTYOGLOMI	0.660056056043753
28219	genus	BRACHYMONAS	0.65896581460957
1121116	no rank	BRACHYMONAS CHIRONOMI DSM 19884	0.65896581460957
491919	species	BRACHYMONAS CHIRONOMI	0.65896581460957
44000	genus	CALDICELLULOSIRUPTOR	0.6511141873124
649638	no rank	TRUEPERA RADIOVICTRIX DSM 17093	0.64127567799025
332247	family	TRUEPERACEAE	0.64127567799025
332248	genus	TRUEPERA	0.64127567799025
332249	species	TRUEPERA RADIOVICTRIX	0.64127567799025
213121	family	DESULFOBULBACEAE	0.632507265463
693075	species	CALDISERICUM EXILE	0.628852808829214
511051	no rank	CALDISERICUM EXILE AZM16C01	0.628852808829214
67814	phylum	CALDISERICA	0.628852808829214
693073	family	CALDISERICACEAE	0.628852808829214
693071	class	CALDISERICIA	0.628852808829214
693074	genus	CALDISERICUM	0.628852808829214
693072	order	CALDISERICALES	0.628852808829214
2232	family	ARCHAEOGLOBACEAE	0.590864961582667
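
For what it's worth, the listing above can be regrouped by rank with a short script. The following is only a sketch, not a PhyloSift feature: it assumes the summary file is tab-delimited with the four columns shown in the header, and it assumes that probability masses of taxa at the same rank can be compared with each other. The function name `summarize_by_rank` and the `RANK_ORDER` list are made up for this example, and the resulting percentages are relative only to the taxa listed at that rank in the file.

```python
import csv
import io
from collections import defaultdict

# Ranks from broadest to narrowest. Rows with any other rank
# (e.g. "no rank", "subclass") are ignored in this sketch.
RANK_ORDER = ["superkingdom", "phylum", "class", "order", "family", "genus", "species"]


def summarize_by_rank(handle):
    """Group rows by Taxon_Rank and express each taxon's mass as a
    percentage of the total mass listed at that rank."""
    rows = csv.reader(handle, delimiter="\t")  # assumption: tab-delimited
    next(rows, None)  # skip the header line
    by_rank = defaultdict(list)
    for row in rows:
        if len(row) != 4:
            continue  # skip blank or malformed lines
        _taxon_id, rank, name, mass = row
        by_rank[rank].append((name, float(mass)))

    report = {}
    for rank in RANK_ORDER:
        # Largest probability mass first within each rank.
        taxa = sorted(by_rank.get(rank, []), key=lambda t: t[1], reverse=True)
        total = sum(mass for _, mass in taxa)
        if total > 0:
            report[rank] = [(name, 100.0 * mass / total) for name, mass in taxa]
    return report


# A few rows copied from the output above, used as sample input.
SAMPLE = (
    "Taxon_ID\tTaxon_Rank\tTaxon_Name\tProbability_Mass\n"
    "2323\tno rank\tUNCLASSIFIED BACTERIA\t67.7550808846286\n"
    "186801\tclass\tCLOSTRIDIA\t48.1818926615539\n"
    "28890\tphylum\tEURYARCHAEOTA\t43.8176742723694\n"
    "2\tsuperkingdom\tBACTERIA\t38.6765763967502\n"
    "74152\tphylum\tELUSIMICROBIA\t36.923412759582\n"
    "2157\tsuperkingdom\tARCHAEA\t29.8294179324969\n"
    "2759\tsuperkingdom\tEUKARYOTA\t15.8109796175687\n"
    "10239\tsuperkingdom\tVIRUSES\t3.1289035093059\n"
)

report = summarize_by_rank(io.StringIO(SAMPLE))
for rank, taxa in report.items():
    print(rank)
    for name, percent in taxa:
        print(f"  {name}\t{percent:.2f}%")
```

To run it on a real file, pass an open file handle for the summary file instead of the `io.StringIO(SAMPLE)` object.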