Open rosebaesj opened 2 years ago
empty -> can be treated as unassigned. For now, these should not be removed. https://forum.qiime2.org/t/how-to-deal-with-unassigned-taxa/2684
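A minimal sketch of the relabel-instead-of-drop idea above, in plain Python. The function name `relabel_empty`, the `Unassigned` placeholder string, and the example feature IDs and taxon strings are all hypothetical; this is not a QIIME 2 API call, only an illustration of keeping every feature while making empty labels explicit.

```python
def relabel_empty(taxonomy, placeholder="Unassigned"):
    """Return a copy of a feature-id -> taxon-string mapping in which
    empty or missing labels are replaced by a placeholder.

    No feature is dropped; only the label changes.
    """
    return {
        feature_id: taxon.strip() if taxon and taxon.strip() else placeholder
        for feature_id, taxon in taxonomy.items()
    }


# Hypothetical example data (not taken from a real classification run).
example = {
    "feat1": "d__Bacteria; p__Firmicutes",
    "feat2": "",
    "feat3": None,
}
relabeled = relabel_empty(example)
print(relabeled)
```

Keeping these features in the table means their read counts still contribute to totals and relative abundances, which would not be the case if they were filtered out.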
https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1009581&type=printable
The SILVA database exhibited the highest number of unique sequences (Fig 2B) and species labels among the databases. However, ~72% of species labels present in SILVA consist of unidentified, uncultured, or unknown organisms, and ~2.5% (excluding chloroplast and mitochondrial sequences) do not match the genus label, leaving only ~25% of sequences with meaningful species labels. Notably, this is because SILVA only curates the taxonomy to genus level but provides the “organism name” given to the sequence in the NCBI GenBank source data, and hence genus–species mismatches can occur.
The lack of species-rank curation in SILVA leads to poor optimal classification performance at the species level (Fig 3D), yielding a species-level F-measure of 0.73, far below the other 16S rRNA gene databases. By comparison, classification accuracy at the genus level is much higher for SILVA, consistent with the level of curation performed (Fig 3D).
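The three kinds of species label described in the excerpt (uninformative, genus mismatch, meaningful) can be illustrated with a rough sketch. The function `species_label_status`, the keyword tuple, and the example names are assumptions made for illustration; this is not the method used in the paper. Inputs are assumed to be plain names with any rank prefixes already stripped.

```python
# Keywords taken from the excerpt: unidentified, uncultured, unknown.
UNINFORMATIVE = ("unidentified", "uncultured", "unknown")


def species_label_status(genus, species):
    """Classify a species label relative to its genus label.

    Returns "uninformative", "genus_mismatch", or "meaningful".
    """
    label = (species or "").strip().lower()
    if not label or any(word in label for word in UNINFORMATIVE):
        return "uninformative"
    # Compare the first word of the species label with the genus label.
    first_word = label.split()[0]
    if genus and first_word != genus.strip().lower():
        return "genus_mismatch"
    return "meaningful"


# Hypothetical (genus, species) pairs for illustration only.
examples = [
    ("Lactobacillus", "Lactobacillus gasseri"),
    ("Lactobacillus", "uncultured bacterium"),
    ("Bacillus", "Paenibacillus polymyxa"),
]
statuses = [species_label_status(g, s) for g, s in examples]
print(statuses)
```

A check like this could be run over a taxonomy table to see how many species labels are usable before deciding whether to report results at species or genus level.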
Not sure. I posted a question on the QIIME forum.
What are the differences?