GreenGenes v13.5 Bacteroides representative sequences (97% clusters), classified with the RDP classifier using the RDP database as the training dataset: https://gist.github.com/audy/11232187
[ ] Classify all GreenGenes sequences using the RDP classifier or Quikr trained on the RDP database. (Quikr isn't the right tool for this job.)
[x] Classify all GG sequences using Kraken against the RDP database.
[ ] Construct a new taxonomy table from the Kraken output (in progress: generating a taxid -> lineage file, then building a gg_id -> tax_id table).
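
A rough sketch of the table-building step, not the actual script used here. It assumes Kraken's standard tab-separated output (classified flag, sequence ID, taxid, length, LCA mapping) and a parent map like the one found in an NCBI-style nodes.dmp. The function names, the toy taxids, and the sample records are all made up for illustration.

```python
import io


def parse_kraken_output(handle):
    """Yield (gg_id, tax_id) for every classified sequence in Kraken output.

    Assumes columns: C/U flag, sequence ID, taxid, length, LCA mapping.
    """
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        status, seq_id, tax_id = fields[0], fields[1], int(fields[2])
        if status == "C":
            yield seq_id, tax_id


def build_lineage(tax_id, parents, names):
    """Walk parent links from tax_id up to the root; return names root-first."""
    lineage = []
    seen = set()
    while tax_id in parents and tax_id not in seen:
        seen.add(tax_id)  # guards against the root pointing at itself
        lineage.append(names.get(tax_id, str(tax_id)))
        tax_id = parents[tax_id]
    return list(reversed(lineage))


# Toy taxonomy with hypothetical taxids (not real NCBI or RDP identifiers).
parents = {1: 1, 10: 1, 20: 10, 30: 20}
names = {1: "root", 10: "Bacteria", 20: "Bacteroidetes", 30: "Prevotella"}

# Two hypothetical Kraken records: one classified, one unclassified.
sample = "C\t1001\t30\t1400\t30:1370\nU\t1002\t0\t1350\t0:1320\n"

# gg_id -> tax_id table
gg_to_taxid = dict(parse_kraken_output(io.StringIO(sample)))

# taxid -> lineage table, built only for taxids that were actually assigned
taxid_to_lineage = {
    tax_id: build_lineage(tax_id, parents, names)
    for tax_id in set(gg_to_taxid.values())
}

for gg_id, tax_id in gg_to_taxid.items():
    print(gg_id, tax_id, ";".join(taxid_to_lineage[tax_id]), sep="\t")
```

Unclassified sequences are dropped here; whether they should instead be kept with a placeholder lineage is a separate decision.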
Kind of important.
Apparently the GreenGenes taxonomy is incorrect for a certain group of Bacteria that are important in this study.
Apparently a lot of representative sequences classified as Bacteroides actually belong to different groups, Prevotella and Porphyromonadaceae (see https://groups.google.com/forum/#!topic/qiime-forum/tf-gdsLRM-w).