wurmlab / sequenceserver

Intuitive graphical web interface for running BLAST bioinformatics tool (i.e. have your own custom NCBI BLAST site!)
https://sequenceserver.com
GNU Affero General Public License v3.0

Taxonomy information #183

Closed yeban closed 8 years ago

yeban commented 9 years ago

Taxonomy data is relevant to many analyses, for example inferring the taxonomic composition of samples in metagenomic studies.

To get taxonomy data from BLAST:

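  1. Format the database with makeblastdb's -taxid option (or -taxid_map, to assign a taxid per sequence).
  2. Download NCBI's taxonomy database (ftp://ftp.ncbi.nlm.nih.gov/blast/db/taxdb.tar.gz) and extract it into the current working directory (essentially, the taxdb files should be in the directory from which you run BLAST).
  3. Use -outfmt 6 (tabular) with taxonomy format specifiers such as staxids, sscinames and scomnames to get taxonomy information.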
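The three steps above could be scripted roughly as follows. This is only a sketch, not SequenceServer code: the helper names (`makeblastdb_command`, `blast_command`, `count_hits_per_taxon`), the chosen columns and the file names are assumptions made for illustration. The flags and the `staxids` / `sscinames` specifiers are standard BLAST+ ones, and the name columns are only resolved when BLAST can find taxdb.

```python
from collections import Counter

# Columns requested from BLAST; sscinames is only filled in when taxdb is found.
COLUMNS = ["qseqid", "sseqid", "evalue", "staxids", "sscinames"]


def makeblastdb_command(fasta, taxid, dbtype="nucl"):
    """Build the argument list that formats a database with a single taxid."""
    return ["makeblastdb", "-in", fasta, "-dbtype", dbtype, "-taxid", str(taxid)]


def blast_command(query, db, program="blastn"):
    """Build the argument list for a tabular BLAST run with taxonomy columns."""
    return [program, "-query", query, "-db", db,
            "-outfmt", "6 " + " ".join(COLUMNS)]


def count_hits_per_taxon(tabular_text):
    """Count hits per scientific name in tab-separated BLAST output."""
    counts = Counter()
    for line in tabular_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        row = dict(zip(COLUMNS, line.split("\t")))
        # A subject can carry several names, separated by ';'.
        for name in row["sscinames"].split(";"):
            counts[name] += 1
    return counts
```

Either command list can be passed to `subprocess.run(..., capture_output=True, text=True, check=True)` from the directory that holds taxdb, and the resulting `stdout` fed to `count_hits_per_taxon`.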

TODOs

yeban commented 9 years ago

I was thinking we could visualise taxonomy information as a tree.

                                            |---- S. invicta    (8)
                     |---- Formicidae (19) -|
                     |                      |---- C. floridanus (11)
Hymenoptera (26) ----|
                     |                      |---- B. terrestris (3)
                     |---- Apidae (7) ------|
                                            |---- A. mellifera  (4)
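A minimal sketch of how the numbers in such a tree could be computed, assuming the parenthesised values are hit counts and that each hit already comes with its lineage (root first). The function names `aggregate` and `render` are hypothetical, and the output is a plain indented listing rather than the drawing above.

```python
from collections import defaultdict


def aggregate(lineage_counts):
    """Sum hit counts for every node along each lineage (root first)."""
    totals = defaultdict(int)     # path from root -> total hits below it
    children = defaultdict(list)  # parent path -> child paths, in input order
    for lineage, hits in lineage_counts:
        for depth in range(len(lineage)):
            path = tuple(lineage[:depth + 1])
            totals[path] += hits
            parent = path[:-1]
            if path not in children[parent]:
                children[parent].append(path)
    return totals, children


def render(totals, children, parent=(), indent=0):
    """Return the tree as indented lines, one node per line."""
    lines = []
    for path in children.get(parent, []):
        lines.append("  " * indent + f"{path[-1]} ({totals[path]})")
        lines.extend(render(totals, children, path, indent + 1))
    return lines


# Species-level counts taken from the example tree.
hits = [
    (("Hymenoptera", "Formicidae", "S. invicta"), 8),
    (("Hymenoptera", "Formicidae", "C. floridanus"), 11),
    (("Hymenoptera", "Apidae", "B. terrestris"), 3),
    (("Hymenoptera", "Apidae", "A. mellifera"), 4),
]
print("\n".join(render(*aggregate(hits))))
```

Keying nodes by their full path rather than by name keeps identically named nodes in different branches apart.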

It may be possible to construct the lineage from a species name using taxdump.tar.gz from ftp://ftp.ncbi.nih.gov/pub/taxonomy/
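A rough sketch of that idea, assuming the usual taxdump layout: `names.dmp` maps tax IDs to names, `nodes.dmp` maps each tax ID to its parent, fields are separated by tab-pipe-tab, and the root node is its own parent. The function names are hypothetical, and the lookup needs the full scientific name ("Solenopsis invicta"), not the abbreviated form.

```python
def parse_row(line):
    """Split one .dmp row into its fields."""
    parts = line.rstrip("\n").split("\t|\t")
    parts[-1] = parts[-1].rstrip("\t|")  # drop the row terminator
    return parts


def load_names(lines):
    """Map scientific name -> tax_id and tax_id -> scientific name."""
    name_to_id, id_to_name = {}, {}
    for line in lines:
        tax_id, name, _unique, name_class = parse_row(line)[:4]
        if name_class == "scientific name":  # ignore synonyms, common names
            name_to_id[name] = tax_id
            id_to_name[tax_id] = name
    return name_to_id, id_to_name


def load_parents(lines):
    """Map tax_id -> parent tax_id from nodes.dmp rows."""
    parents = {}
    for line in lines:
        tax_id, parent_id = parse_row(line)[:2]
        parents[tax_id] = parent_id
    return parents


def lineage(species, name_to_id, id_to_name, parents):
    """Walk parent links from the species to the root; return names root first."""
    tax_id = name_to_id[species]
    path = [id_to_name[tax_id]]
    while parents[tax_id] != tax_id:  # the root is its own parent
        tax_id = parents[tax_id]
        path.append(id_to_name[tax_id])
    return path[::-1]
```

With the extracted dump, `load_names(open("names.dmp"))` and `load_parents(open("nodes.dmp"))` would supply the lookups. The real lineage contains many more intermediate ranks than the three levels drawn above, so a visualisation would probably keep only selected ranks.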

yeban commented 8 years ago

Discussing visualisation in #249.