DerrickWood / kraken2

The second version of the Kraken taxonomic sequence classification system
MIT License

Classify 16S reads with NCBI's 16S database #424

Open LunavdL opened 3 years ago

LunavdL commented 3 years ago

Hi,

I have been using Kraken2 to classify my 16S reads with the Silva and Greengenes databases provided, which works great. I would like to compare the results to NCBI's standard 16S database as well.

I downloaded the taxonomy using: kraken2-build --download-taxonomy --db 16S_ribosomal_RNA

However, I am unsure how to proceed with adding the actual sequences. Downloading the full bacterial genomes (for example with kraken2-build --download-library bacteria --db $DBNAME) feels like overkill, as I only need the 16S fragments. Should I download all the FASTA files from the website and add them to the library manually?

Thanks, Luna

Piplopp commented 3 years ago

You can download the 16S bits through the TargetedLoci project on the NCBI FTP:

https://ftp.ncbi.nlm.nih.gov/refseq/TargetedLoci/Bacteria/

Then you can do something like this (according to the Kraken 2 documentation):

kraken2-build --add-to-library yourfile.fna --db 16S_ribosomal_RNA
kraken2-build --build --db 16S_ribosomal_RNA
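
Put together, the whole sequence might look like the sketch below. It is only an outline under a few assumptions that are not from this thread: wget and gunzip are installed, the FTP directory still serves a file named bacteria.16SrRNA.fna.gz, and the file has to be decompressed before it is added. The names build_16s_db, run and DRY_RUN are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: build a Kraken 2 database from NCBI's RefSeq 16S targeted loci.
# Assumptions: wget and gunzip are installed; the remote file name is unchanged.
set -euo pipefail

DB="${DB:-16S_ribosomal_RNA}"
URL="https://ftp.ncbi.nlm.nih.gov/refseq/TargetedLoci/Bacteria/bacteria.16SrRNA.fna.gz"

# With DRY_RUN=1 each command is printed instead of executed,
# so the plan can be reviewed before anything is downloaded.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

build_16s_db() {
  run wget -q -O bacteria.16SrRNA.fna.gz "$URL"
  run gunzip -f bacteria.16SrRNA.fna.gz    # assumed: the library step wants plain FASTA
  run kraken2-build --download-taxonomy --db "$DB"
  run kraken2-build --add-to-library bacteria.16SrRNA.fna --db "$DB"
  run kraken2-build --build --db "$DB"
}

# Nothing runs until the function is called, e.g.:
#   DRY_RUN=1 build_16s_db   # preview
#   build_16s_db             # actually download and build
```

The taxonomy step can be skipped if it has already been run for the same --db directory.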

magibc commented 2 years ago

Hi @Piplopp and @LunavdL,

I'm trying to do the same as you with the multi-FASTA file bacteria.16SrRNA.fna.gz downloaded from https://ftp.ncbi.nlm.nih.gov/refseq/TargetedLoci/Bacteria/. Isn't it necessary to rename each > header with kraken:taxid?
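
For context: as far as I understand the Kraken 2 manual, a sequence ID needs either an NCBI accession number or an explicit tag of the form |kraken:taxid|<id> appended to it. If explicit tags turn out to be needed, a rough sketch of the renaming could look like this. It assumes a hypothetical two-column table (accession, then taxid, tab-separated) whose accessions match the FASTA IDs exactly, version suffix included; the function name tag_fasta_with_taxid is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: append |kraken:taxid|<id> to FASTA sequence IDs, using a
# two-column accession<TAB>taxid table (hypothetical input, see above).
set -euo pipefail

tag_fasta_with_taxid() {
  local map="$1" fasta="$2"
  awk -F'\t' '
    FILENAME == ARGV[1] { taxid[$1] = $2; next }   # first file: load the mapping
    /^>/ {
      split($0, parts, " ")                        # parts[1] is ">ACCESSION"
      acc = substr(parts[1], 2)
      if (acc in taxid) {
        rest = substr($0, length(parts[1]) + 1)    # description, leading space kept
        print ">" acc "|kraken:taxid|" taxid[acc] rest
        next
      }
    }
    { print }                                      # sequence lines and unmapped headers
  ' "$map" "$fasta"
}

# Example call (file names are placeholders):
#   tag_fasta_with_taxid acc2taxid.tsv bacteria.16SrRNA.fna > bacteria.16SrRNA.tagged.fna
```

Headers without a match in the table are passed through unchanged, so they can be spotted afterwards by searching for header lines that lack kraken:taxid.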

With my Kraken version (Kraken 2, 2.0.8-beta) the command runs without error, but no sequences are added to my Kraken library.

I used the following commands:

kraken2-build --add-to-library bacteria.16SrRNA.fna --db 16S_ribosomal_RNA

Also, the tutorial at https://rsh249.github.io/bioinformatics/metagenomics2.html does not work for me either: the library does not grow after adding the FASTA file from the tutorial.

Thanks in advance for your help/hints.

Magí.