bioinformatics-centre / kaiju

Fast taxonomic classification of metagenomic sequencing reads using a protein reference database
http://kaiju.binf.ku.dk
GNU General Public License v3.0

kaiju-makedb multiple taxa? #122

Closed charlottecc closed 5 years ago

charlottecc commented 5 years ago

Hi,

Is it possible to create a database containing the completely assembled and annotated reference genomes of Archaea, Bacteria, and viruses from the NCBI RefSeq database, as well as fungal RefSeq sequences? Or would I have to run my samples through two different databases to classify fungal reads too?

I'm interested in classifying fungal reads too, as they are quite prevalent in my samples.

Thank you in advance!

pmenzel commented 5 years ago

Hi, both options are possible.

1) Creating a combined reference database needs a bit of tinkering: first run kaiju-makedb once with the fungi option and once with the refseq option, then manually combine the resulting kaiju_db_*.faa files into one file (using cat), and finally run kaiju-mkbwt and kaiju-mkfmi on the combined file, as described here.

2) You can also run your reads through both databases separately and then combine the two output files using kaiju-mergeOutputs.
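
For reference, here is a rough sketch of both options as a shell script. It is not a tested recipe. The file locations (refseq/kaiju_db_refseq.faa, fungi/kaiju_db_fungi.faa, nodes.dmp in the working directory), the input name reads.fastq, and all output names are assumptions. The kaiju-mkbwt and kaiju-mergeOutputs flags are as I understand them from the Kaiju README, so check each program's help output before relying on them. The script only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Dry-run by default: each command is printed instead of executed.
# Set RUN=1 to execute for real (requires the Kaiju binaries on PATH).
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# Option 1: build both source databases, merge the protein files, index the result.
build_combined_db() {
    run kaiju-makedb -s refseq
    run kaiju-makedb -s fungi
    # Assumed locations of the two protein FASTA files written by kaiju-makedb.
    run sh -c 'cat refseq/kaiju_db_refseq.faa fungi/kaiju_db_fungi.faa > combined.faa'
    # Build the BWT, then the FM index (expected result: combined.fmi).
    run kaiju-mkbwt -n 5 -a ACDEFGHIKLMNPQRSTVWY -o combined combined.faa
    run kaiju-mkfmi combined
}

# Option 2: classify against each database separately, then merge the outputs.
classify_and_merge() {
    reads=$1   # hypothetical input file name
    run kaiju -t nodes.dmp -f refseq/kaiju_db_refseq.fmi -i "$reads" -o refseq.out
    run kaiju -t nodes.dmp -f fungi/kaiju_db_fungi.fmi -i "$reads" -o fungi.out
    # Both inputs are sorted by read name (column 2) before merging.
    run sort -k2,2 -o refseq.sorted.out refseq.out
    run sort -k2,2 -o fungi.sorted.out fungi.out
    # -c lowest is an assumed choice of conflict resolution; it needs nodes.dmp.
    run kaiju-mergeOutputs -i refseq.sorted.out -j fungi.sorted.out \
        -t nodes.dmp -c lowest -o merged.out
}

build_combined_db
classify_and_merge reads.fastq
```

With option 1 the reads only need to be classified once, against combined.fmi. Option 2 avoids rebuilding an index but classifies every read twice.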

charlottecc commented 5 years ago

Thank you very much for your reply, I will try your suggestions!