KwanLab / Autometa

Autometa: Automated Extraction of Genomes from Shotgun Metagenomes
https://autometa.readthedocs.io

Confidence value for taxon assignment #48

Open evanroyrees opened 4 years ago

evanroyrees commented 4 years ago

This is a requested feature for the taxon assignment phase of the autometa pipeline (Thanks @tderond for bringing this to my attention).

This would add one or more output columns, such as a confidence interval or significance value, to tell the end user how confident they should be in the assignment of a given contig to its taxon ID.

This could be done by aggregating counts from the majority_vote.py step, which looks through the LCA assignments before resolving to a taxid. The majority vote could report the number of ORFs in agreement vs. in disagreement with the assigned taxid, as well as the number of ORFs sharing a common ancestor vs. not.
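A rough sketch of what such support numbers might look like is below. This is not Autometa's majority_vote.py code. The function names (`lineage`, `vote_support`), the toy `PARENTS` map, and the output keys are all hypothetical. It also assumes that "sharing a common ancestor" means the ORF's LCA taxid lies on the same lineage as the assigned taxid (as an ancestor or a descendant), since every taxid trivially shares the root.

```python
from collections import Counter

# Hypothetical toy taxonomy for illustration: child taxid -> parent taxid.
# Taxid 1 is the root and is its own parent.
PARENTS = {1: 1, 2: 1, 1224: 2, 1236: 1224, 1239: 2}


def lineage(taxid, parents):
    """Return the set of taxids from `taxid` up to the root, inclusive."""
    seen = set()
    while taxid not in seen:
        seen.add(taxid)
        # Unknown taxids map to themselves, which ends the walk.
        taxid = parents.get(taxid, taxid)
    return seen


def vote_support(orf_taxids, assigned, parents):
    """Summarise how well per-ORF LCA taxids support a contig's assigned taxid.

    n_agree: ORFs whose LCA taxid equals the assigned taxid.
    n_same_lineage: ORFs whose LCA taxid is an ancestor or descendant of
        the assigned taxid (assumed reading of "sharing a common ancestor").
    """
    assigned_lineage = lineage(assigned, parents)
    counts = Counter(orf_taxids)
    n_total = sum(counts.values())
    n_agree = counts[assigned]
    n_same_lineage = sum(
        n
        for taxid, n in counts.items()
        if taxid in assigned_lineage or assigned in lineage(taxid, parents)
    )
    return {
        "n_orfs": n_total,
        "n_agree": n_agree,
        "n_disagree": n_total - n_agree,
        "n_same_lineage": n_same_lineage,
        "n_other_lineage": n_total - n_same_lineage,
        # Fractions are 0.0 for contigs with no ORF assignments.
        "agree_fraction": n_agree / n_total if n_total else 0.0,
        "lineage_fraction": n_same_lineage / n_total if n_total else 0.0,
    }


if __name__ == "__main__":
    # Five ORFs on one contig, with the contig assigned to taxid 1236.
    print(vote_support([1236, 1236, 1224, 1239, 2], assigned=1236, parents=PARENTS))
```

Each of these values could be written out as an extra column next to the contig's taxid. Raw counts are kept alongside the fractions because a fraction of 1.0 from a single ORF is much weaker evidence than the same fraction from twenty ORFs.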

jason-c-kwan commented 3 years ago

This is partly addressed in the NSF proposal, so while this is a good suggestion, it might be superseded by our new planned features.