marbl / Krona

Interactively explore metagenomes and more from a web browser.
https://github.com/marbl/Krona/wiki

Import results from MAPseq #144

Open sarah872 opened 4 years ago

sarah872 commented 4 years ago

Is there a way to import the results from MAPseq?

Here is some example output:

# mapseq v1.2.6 (Mar 24 2020)
#query  dbhit   bitscore    identity    matches mismatches  gaps    query_start query_end   dbhit_start dbhit_end   strand      SILVA_138_SSURef_NR99_tax_silva.fasta.tax:taxlevel0 combined_cf score_cf    taxlevel1   combined_cf score_cf    taxlevel2   combined_cf score_cf    taxlevel3   combined_cf score_cf    taxlevel4   combined_cf score_cf    taxlevel5   combined_cf score_cf    taxlevel6   combined_cf score_cf    taxlevel7   combined_cf score_cf    taxlevel8   combined_cf score_cf    taxlevel9   combined_cf score_cf    taxlevel10  combined_cf score_cf    taxlevel11  combined_cf score_cf    taxlevel12  combined_cf score_cf    taxlevel13  combined_cf score_cf    taxlevel14  combined_cf score_cf    taxlevel15  combined_cf score_cf    taxlevel16  combined_cf score_cf    taxlevel17  combined_cf score_cf    taxlevel18  combined_cf score_cf    taxlevel19  combined_cf score_cf    
J00137:56:H7VGNBBXX:8:1110:17594:25333  EU768871.1.1745 151 1   151 0   0   0   151 1516    1667    -       Eukaryota   1   1   Amorphea    1   1   Obazoa  1   1   Opisthokonta    1   1   Holozoa 1   1   Choanozoa   1   1   Metazoa 1   1   Animalia    1   1   BCP clade   1   1   Bilateria   1   1   Protostomia 1   1   Ecdysozoa   1   1   Nematozoa   1   1   Nematoda    1   1   Chromadorea 0.94406933  0.9440693425827097  Desmodorida 0.94335705  0.9433570441893419  Robbea sp. 2 SB-2008    0.80237269  0.8023726869461876  NA  0   0   NA  0   0   NA  0   0   
J00137:56:H7VGNBBXX:8:1114:1052:24736   EU768871.1.1745 147 0.9867549538612366  149 2   0   0   151 1530    1681    +       Eukaryota   1   1   Amorphea    1   1   Obazoa  1   1   Opisthokonta    1   1   Holozoa 1   1   Choanozoa   1   1   Metazoa 1   1   Animalia    1   1   BCP clade   1   1   Bilateria   1   1   Protostomia 0.99967039  0.9996703803201165  Ecdysozoa   0.99967039  0.9996703803201165  Nematozoa   0.99966425  0.999664240195316   Nematoda    0.99966425  0.999664240195316   Chromadorea 0.99966425  0.999664240195316   Desmodorida 0.99834794  0.9983479477564231  Robbea sp. 2 SB-2008    0.79507071  0.7950707078653696  NA  0   0   NA  0   0   NA  0   0   
J00137:56:H7VGNBBXX:8:1105:8369:32947   Y16912.1.1846   149 1   149 0   0   0   149 546 695 +       Eukaryota   1   1   Amorphea    1   1   Obazoa  1   1   Opisthokonta    1   1   Holozoa 1   1   Choanozoa   1   1   Metazoa 1   1   Animalia    1   1   BCP clade   1   1   Bilateria   1   1   Protostomia 1   1   Ecdysozoa   1   1   Nematozoa   1   1   Nematoda    1   1   Chromadorea 1   1   Desmodorida 1   1   Catanema sp.    0.1720777   0.1720777057694312  NA  0   0   NA  0   0   NA  0   0   
J00137:56:H7VGNBBXX:8:1103:17614:11020  KU696458.1.1760 95  0.8571428656578064  132 18  4   0   151 131 284 -       Eukaryota   1   1   Amorphea    1   1   Obazoa  1   1   Opisthokonta    1   1   Holozoa 1   1   Choanozoa   1   1   Metazoa 1   1   Animalia    1   1   BCP clade   1   1   Bilateria   1   1   Protostomia 1   1   Ecdysozoa   1   0.9999999887745773  Arthropoda  0.65591925  0.6559192387686087  Chelicerata 0.65351921  0.6535192377154526  Arachnida   0.65351921  0.6535192377154523  Pseudoscorpiones    0.65253949  0.6525394887467022  Feaella callani 0.62707865  0.6270786690714641  NA  0   0   NA  0   0   NA  0   0   
J00137:56:H7VGNBBXX:8:1101:9557:2510    JF293045.1.1792 147 0.9867549538612366  149 2   0   0   151 1165    1316    -       Eukaryota   1   1   Amorphea    1   1   Obazoa  1   1   Opisthokonta    1   1   Holozoa 1   1   Choanozoa   1   1   Metazoa 1   1   Animalia    1   1   BCP clade   1   1   Bilateria   1   1   Protostomia 1   1   Lophotrochozoa  0.69828844  0.6982884496017068  Nemertea    0.35412875  0.3541287347179816  Anopla  0.35412875  0.3541287347179816  Heteronemertea  0.24119009  0.2411900918330049  Zygeupolia rubens   0.24119009  0.2411900918330049  NA  0   0   NA  0   0   NA  0   0   NA  0   0   
J00137:56:H7VGNBBXX:8:1108:21308:3530   FJ182217.1.1749 135 0.9470198750495911  143 8   0   0   151 1247    1398    +       Eukaryota   1   1   Amorphea    1   1   Obazoa  1   1   Opisthokonta    1   1   Holozoa 1   1   Choanozoa   1   1   Metazoa 1   1   Animalia    1   1   BCP clade   1   1   Bilateria   1   1   Protostomia 1   1   Ecdysozoa   0.99972749  0.9997274857689482  Nematozoa   0.99972749  0.9997274851880048  Nematoda    0.99972749  0.9997274851880048  Chromadorea 0.99972749  0.9997274851880048  Desmodorida 0.99972749  0.9997274851880048  Draconema japonicum 0.15942901  0.1594290201998433  NA  0   0   NA  0   0   NA  0   0   
J00137:56:H7VGNBBXX:8:1118:29072:46205  Y16912.1.1846   91  0.9252336621284485  99  8   0   44  151 1725    1832    -       Eukaryota   1   1   Amorphea    1   0.9999999995444129  Obazoa  1   0.9999999995444129  Opisthokonta    1   0.9999999995444129  Holozoa 1   0.9999999994839754  Choanozoa   1   0.9999999992422257  Metazoa 1   0.9999999994839754  Animalia    1   0.999999998847076   BCP clade   1   0.9999999999395626  Bilateria   1   0.9999999999395626  Protostomia 1   0.9999999999395626  Ecdysozoa   1   0.9999999999374327  Nematozoa   1   0.9999999998769953  Nematoda    1   0.9999999998769953  Chromadorea 1   0.9999999999367977  Desmodorida 1   0.9999999999367977  Catanema sp.    0.79223835  0.7922383475302879  NA  0   0   NA  0   0   NA  0   0   
J00137:56:H7VGNBBXX:8:1112:26748:19249  KY016382.1.1714 94  0.8714285492897034  122 15  3   13  151 188 327 +       Eukaryota   1   1   Amorphea    1   1   Obazoa  1   1   Opisthokonta    1   1   Holozoa 1   1   Choanozoa   1   1   Metazoa 1   1   Animalia    1   1   BCP clade   1   1   Bilateria   1   1   Protostomia 1   1   Ecdysozoa   1   1   Arthropoda  1   1   Chelicerata 1   1   Arachnida   1   1   Araneae 1   1   Calymmaria sp. CG231    0.12105156  0.1210515667662762  NA  0   0   NA  0   0   NA  0   0   

This can be transformed into an OTU count table (for 4 samples here):

# mapseq v1.2.6 (Mar 24 2020)
#TotalCounts:   5131    5109    4108    4065
    sample1 sample2 sample3 sample4
Eukaryota;Amorphea;Obazoa;Opisthokonta  4183    4135    3098    3068
Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales   11  8   10  7
Bacteria;Proteobacteria;Gammaproteobacteria;Arenicellales   349 359 388 391
Eukaryota;Archaeplastida;Chloroplastida;Charophyta  22  28  10  8
Bacteria;Proteobacteria;Alphaproteobacteria;uncultured  3   1   1   2
Bacteria;Proteobacteria;Alphaproteobacteria;Rhodospirillales    8   10  0   0
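
One possible route, sketched here rather than taken from the Krona or MAPseq documentation, is to reshape the OTU count table into the plain text format that KronaTools' `ktImportText` reads: one line per lineage, with the count first and each taxonomic level in its own tab-separated column. The function name `otu_table_to_krona` is hypothetical, and the sketch assumes the table is tab-separated with `;`-delimited lineages, as the example above appears to be.

```python
def otu_table_to_krona(table_text):
    """Convert an OTU count table into Krona text-import lines per sample.

    Assumptions (not confirmed by the MAPseq docs): columns are
    tab-separated, lines starting with '#' are comments, the first
    non-comment row holds the sample names, and lineages use ';'.
    Returns {sample_name: ["count<TAB>level1<TAB>level2...", ...]}.
    """
    samples = None
    per_sample = {}
    for line in table_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and '#' comment lines
        fields = line.split("\t")
        if samples is None:
            # Header row: the first cell is empty, the rest are sample names.
            samples = [f.strip() for f in fields if f.strip()]
            per_sample = {s: [] for s in samples}
            continue
        lineage = [level.strip() for level in fields[0].split(";")]
        counts = [c.strip() for c in fields[1:]]
        for sample, count in zip(samples, counts):
            if int(count) > 0:  # leave out lineages absent from this sample
                per_sample[sample].append("\t".join([count] + lineage))
    return per_sample


# Hypothetical usage: write one text file per sample.
# with open("otu_counts.tsv") as handle:
#     for sample, lines in otu_table_to_krona(handle.read()).items():
#         with open(sample + ".krona.txt", "w") as out:
#             out.write("\n".join(lines) + "\n")
```

The resulting files could then be passed to `ktImportText`, for example `ktImportText sample1.krona.txt sample2.krona.txt -o mapseq.krona.html`, which should give one dataset per sample. Because this format carries taxon names instead of NCBI taxonomy IDs, no lookup in the local taxonomy database is involved.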

evilvenom commented 2 years ago

I have the same question. I am able to visualize the results, but I get the following warning:

Loading taxonomy...
Importing ou.fa.mseq...
   [ WARNING ]  The following taxonomy IDs were not found in the local database and were set to root (if they were recently added to NCBI, use updateTaxonomy.sh to update the
                local database): 193679 348768 191100 293606 341638 157772 133753 959704 22158 182756 354612 276985 3386818 246481 331089 644917 192572 3723565 3016928 4200777
                3014082 299356 3487876 300355 193722 113708 3003844 2136916 4244333 605571 193853 783656 369526 3826175 20310 3779415 3574208 528801 349458 3059643 367838
                185428 2932557 2990150 543415 813944 332527 276996 3910237 3670060 113881 3806695 3856408 582026 199700 194734 2824248 173726 347082 818434 362086 708750
                4194837 192774 3060611 772282 146950 289318 291440 145815 1828413 3991007 178629 567263 775976 62969 575484 370104 158373 137209 293937 588557 1122011 146394
                509788 3069023 2996838 261813 203743 913867 208113 528645 333768 14030 351870 1103336 3910244 192684 97000 313521 562503 4039319 190959 2965637 290578 274970
                357187 560336 535200 294453 555555 793646 16032 306704 176954 261887 187780 1844740 3215571 193621 2877003 275693 1019769 193709 193629 585389 727537 334336
                297708 3242183 205763 146301 549871 3086325 187569 291794 1017556 176183 4290974 3579908 4094259 25842 3720452 194254 3272632 2935919 584484 13966 336863
                113956 3275934 358755 181850 196590 953695 299757 277179 298379 553740 2065799 171408 246042 349348 2987502 745431 336372 318949 24916 15125 324733 524878
                348065 178631 3950691 3015659 181427 544333 199584 1747203 3357820 14024 569522 3530697 180352 294480 297901 4232047 193763 168445 516575 15728 179663 4015948
                515584 302844 845678 417629 800677 3383663 562096 333166 176718 593422 296577 288428 936018 524390 581470 3744858 189407 558357 192865 190938 290824 3979106
                293360 315978 174943 3926480 291894 181121 150169 4261543 276682 562797 2944291 147470 3568712 186674 3055954 185961 191555 344133 320659 2365946 3811416 40124
                845612 3435564 351927 4154110 304221 1504042 562290 174738 193844 573390 3232988 277489 528350 268604 521784 3613745 3756485 369637 256874 302787 422148 107605
                3354673 235212 777794 211654 797762 2700497 583198 194761 346657 4289858 355533 319218 365364 276572 591354 274893 174611 19611 191651 620290 365958 274980
                537219 183604 591118 178015 148016 195252 176957 557978 469873 2075910 562157 777944 19521 525421 804071 176485 3924208 512919 211525 193865 197204 185868
                527468 2177668 25054 3441309 194559 2700197 328816 276044 329420 207053 193765 560141 40649 332225 3275562 524587 275984 296573 193573 804916 2283112 9701
                187747 245086 3222755 187429 187882 580090 3257594 567075 351928 576045 3280779 4261177 367889 588368 316032 287978 1047715 535481 179662 775631 570560 3745352
                365426 2617123 31573 2056702 3950693 362389 4091927 290116 204952 566285 3910240 187782 370041 213487 215738 3299921 525822 738788 296374 1109709 2818524
                210074 177749 124369 177150 233659 289886 122160 4102199 176862 297363 2963281 319708 294852 181949 237134 275472 845601 327931 261458 363264 259549 16549
                276580 145728 308322 3948375 214469 248126 3653227 514650 357741 760177 193831 15736 13968 179905 570243 212372 53032 4045886 590980 180468 2971525 3528448
                593377 16073 583256 163732 276386 186775 3117491 149286 3910247 582154 192779 198598 581252 196083 3746307 583848 2901965 196246 436901 294501 3574065 186561
                190066 949294 348406 522959 567816 341944 216593 294908 304834 524848 539732 330179 178462 4088118 325213 336513 189390 211958 366229 211542 212215 786081
                3949313 360268 3251731 2554809 267787 578623 3749784 176726 317065 2486734 578152 739971 206980 183885 591644 3880254 191276 3839362 573189 3747458 583656
                3907191 289867 180552 190124 275425 3754778 2874609 365997 213827 3272759 4144206 174831 71027 3327894 3326649 3648884 3301069 341724 315100
Writing taxonomy.krona.html...

I downloaded the latest taxdump from NCBI to check whether I was doing something wrong, but many IDs were still missing. Any help would be appreciated!

cc: @sarah872 @ondovb