bioinformatics-centre / kaiju

Fast taxonomic classification of metagenomic sequencing reads using a protein reference database
http://kaiju.binf.ku.dk
GNU General Public License v3.0

Kaiju NR #3

Closed by milkbugdoctor 8 years ago

milkbugdoctor commented 8 years ago

Is it possible to include all the eukaryotic sequences in the Kaiju database as well when building it from the NCBI NR FASTA file? Currently, convertNR seems to discard anything that is not prokaryotic or viral. I expect some of the metagenomes I am analyzing to contain fungi and protists (based on Kraken results), which I hope to classify using your wonderful tool.

pmenzel commented 8 years ago

The last commit already added support for fungi and microbial eukaryotes; it just was not announced on the website yet. Use the -e option of makeDB.sh, which includes all proteins belonging to the taxa listed in the file taxonlist.tsv. I also recommend using the -x option to filter low-complexity regions.

The webserver now also includes a selection for the NR+euk database.
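
For readers wondering what "proteins belonging to the taxa listed in taxonlist.tsv" means in practice, here is a conceptual sketch of that kind of filter. It is not Kaiju's actual code: the function names, the in-memory parent map, and the toy taxon IDs are all made up for illustration, and the real file format of taxonlist.tsv is not assumed here. The idea is that a protein is kept if its taxon, or any ancestor of its taxon, is on the list.

```python
from typing import Dict, Iterable, List, Set, Tuple


def in_listed_taxa(taxon_id: int, parents: Dict[int, int], listed: Set[int]) -> bool:
    """Return True if taxon_id or any of its ancestors is in `listed`."""
    seen: Set[int] = set()
    current = taxon_id
    while current not in seen:
        if current in listed:
            return True
        seen.add(current)
        parent = parents.get(current)
        if parent is None or parent == current:
            # Reached the root (or an unknown taxon) without a match.
            return False
        current = parent
    return False  # Guard against cycles in a malformed parent map.


def keep_proteins(
    proteins: Iterable[Tuple[str, int]],
    parents: Dict[int, int],
    listed: Set[int],
) -> List[str]:
    """Keep the IDs of proteins whose taxon falls under a listed taxon."""
    return [pid for pid, tax in proteins if in_listed_taxa(tax, parents, listed)]


# Toy taxonomy (hypothetical IDs, child -> parent):
#   1 (root) -> 10 (a listed group) -> 11 (species)
#   1 (root) -> 20 (an unlisted group) -> 21 (species)
toy_parents = {1: 1, 10: 1, 11: 10, 20: 1, 21: 20}
toy_listed = {10}
toy_proteins = [("protA", 11), ("protB", 21), ("protC", 10)]

kept = keep_proteins(toy_proteins, toy_parents, toy_listed)
print(kept)
```

With the toy data, protA and protC fall under the listed group and are kept, while protB is dropped. Walking up the parent chain means only the top-level groups need to be listed, not every species below them.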