silva_nr99_v138_train_set.fa.gz harbours all sequences of Silva NR 99?

benjjneb / dada2

Accurate sample inference from amplicon data with single nucleotide resolution

GNU Lesser General Public License v3.0

Dear Benjamin,

First of all, thanks a lot for providing the DADA2 tutorial on assigning 16S rRNA-based taxonomy. I learned a lot and appreciate your work very much.

I have one question about the silva_nr99_v138_train_set.fa.gz file that is available on Zenodo. Does this file contain all the sequences in the current Silva v138 NR 99 release, or did you remove some to keep the file smaller? I am asking because around 20% of my sequences cannot be assigned at the genus level, and I am wondering whether this is due to unknown microbes in my sample or to the training-set file. As a second question: would you say the Silva NR file is generally fine for analysing microbial communities, or should I use the Silva Parc file instead?
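In case it helps frame the question, here is a rough sketch of how the contents of the training file could be checked directly: it counts the records and tallies how many headers name a genus. It assumes the headers are semicolon-separated ranks of the form `>Kingdom;Phylum;Class;Order;Family;Genus;`, which I have not verified for every record. The function names `summarize_training_fasta` and `fraction_with_genus` are my own, not part of DADA2.

```python
import gzip
from collections import Counter


def summarize_training_fasta(path):
    """Count records and the number of ranks named in each header.

    Assumption: `path` is a gzipped FASTA whose header lines look like
    '>Kingdom;Phylum;Class;Order;Family;Genus;' (semicolon-separated ranks).
    Returns (total_records, Counter mapping rank depth -> record count).
    """
    depth_counts = Counter()
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith(">"):
                # Drop the '>' and any empty field left by a trailing ';'.
                ranks = [r for r in line[1:].strip().split(";") if r]
                depth_counts[len(ranks)] += 1
    return sum(depth_counts.values()), depth_counts


def fraction_with_genus(depth_counts, genus_depth=6):
    """Fraction of records whose header reaches at least `genus_depth` ranks."""
    total = sum(depth_counts.values())
    if total == 0:
        return 0.0
    with_genus = sum(n for depth, n in depth_counts.items() if depth >= genus_depth)
    return with_genus / total
```

Calling `summarize_training_fasta("silva_nr99_v138_train_set.fa.gz")` on the downloaded file would give a record count that could be compared with the size of the Silva release, and `fraction_with_genus` would show how many reference sequences carry a genus label at all.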

BTW, I used a PacBio full-length 16S sequencing approach.

Thanks again for all your work!

Best regards,

Marcel

silva_nr99_v138_train_set.fa.gz harbours all sequences of Silva NR 99? #1162