cruizperez / MicrobeAnnotator

Pipeline for metabolic annotation of microbial genomes
Artistic License 2.0

How is completeness calculated? #90

Open Sidduppal opened 11 months ago

Sidduppal commented 11 months ago

Hey @cruizperez and @rotheconrad, thanks for this amazing tool. I have been using ko_mapper.py to calculate the completeness of modules. While taking a deep dive into one of the modules, I wasn't sure how the completeness is calculated. For example, module M00551 has the definition K05549+K05550+K05784 K05783. As I understand it, the completeness should be a fraction of 4, i.e. if one gene is present the module is 25% complete, if three genes are present it is 75% complete, etc. However, for the same module ko_mapper.py reports the completeness as a fraction of 6: the percentages are 16.67%, 33.34%, etc. I'm not sure what's happening. Any help would be appreciated. Thanks!
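To make the expectation concrete, here is a minimal sketch of the calculation I had in mind, where every distinct KO in the definition counts equally. The function name `ko_fraction_completeness` is hypothetical and is not part of MicrobeAnnotator; this is not how ko_mapper.py works internally, only the per-KO fraction I expected to see.

```python
import re


def ko_fraction_completeness(definition, present_kos):
    """Return the percentage of distinct KOs in a module definition that are present.

    Hypothetical helper: it ignores the step/complex structure of the
    definition and simply counts every distinct KO identifier once.
    """
    # KO identifiers are the letter K followed by five digits.
    module_kos = set(re.findall(r"K\d{5}", definition))
    if not module_kos:
        return 0.0
    found = module_kos & set(present_kos)
    return 100.0 * len(found) / len(module_kos)


M00551 = "K05549+K05550+K05784 K05783"

# One of the four KOs present.
print(ko_fraction_completeness(M00551, {"K05549"}))  # 25.0
# Three of the four KOs present.
print(ko_fraction_completeness(M00551, {"K05549", "K05550", "K05783"}))  # 75.0
```

With this counting, the only possible values for M00551 are multiples of 25%, which is why the multiples of 16.67% reported by ko_mapper.py surprised me.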