Marcel2907 closed this issue 3 years ago
Does this file contain all microorganisms that are available in the current Silva V138 Nr99, or did you remove some sequences to reduce the file size?
We removed almost all the Eukaryota (save for a random sample of 100 to serve as an outgroup). Essentially all Bacteria/Archaea entries were kept.
I was wondering if this is happening because of the unknown microbes inside my sample or the Train_set file?
Could be. Another big factor is that a large number of entries only have taxonomy assigned down to the genus level, simply because bacterial systematics is still a ways away from naming all the species that exist.
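If you want to check the composition of the training file yourself, a rough sketch along these lines might help. It assumes the FASTA headers follow the semicolon-separated `>Kingdom;Phylum;...;Genus;` layout used for DADA2 training sets. The function names (`parse_header`, `tally_headers`, `tally_file`) are made up for this illustration and are not part of DADA2.

```python
import gzip
from collections import Counter

# Rank names in the order they are assumed to appear in each header.
RANKS = ["Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"]


def parse_header(header):
    """Split a '>Kingdom;Phylum;...;' header into its non-empty rank names."""
    return [t.strip() for t in header.lstrip(">").strip().split(";") if t.strip()]


def tally_headers(lines):
    """Count entries per kingdom and per deepest assigned rank."""
    kingdoms = Counter()
    depths = Counter()
    for line in lines:
        if not line.startswith(">"):
            continue  # skip sequence lines
        taxa = parse_header(line)
        if not taxa:
            continue  # header without any taxonomy
        kingdoms[taxa[0]] += 1
        depths[RANKS[min(len(taxa), len(RANKS)) - 1]] += 1
    return kingdoms, depths


def tally_file(path):
    """Tally a gzipped training FASTA (hypothetical helper for this sketch)."""
    with gzip.open(path, "rt") as handle:
        return tally_headers(handle)
```

Calling `tally_file("silva_nr99_v138_train_set.fa.gz")` on a local copy of the file should return two counters: one showing how many entries remain per kingdom, and one showing how many entries stop at each rank.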
Dear Benjamin,
First of all, thanks a lot for providing the DADA2 tutorial for assigning 16S rRNA-based taxonomy. I learned a lot and appreciate your work very much.
I have one question regarding the silva_nr99_v138_train_set.fa.gz file, which is available on Zenodo. Does this file contain all microorganisms that are available in the current Silva V138 Nr99, or did you remove some sequences to reduce the file size? I am asking because around 20% of my sequences cannot be assigned at genus level, and I was wondering if this is happening because of the unknown microbes inside my sample or the Train_set file? And as a second question, would you say that the NR file of Silva in general is fine for performing analysis of microbial communities, or should I use the Parc file of Silva?
BTW, I used a PacBio full-length 16S sequencing approach.
Thanks again a lot for all your work!
Best regards,
Marcel