cruizperez / MicrobeAnnotator

Pipeline for metabolic annotation of microbial genomes
Artistic License 2.0

Fixes for two critical bugs in KOfam annotation step #95

Status: Closed (closed by ivagljiva 5 months ago)

ivagljiva commented 5 months ago

This PR addresses the bugs described in issue #94.

Bug 1: to ensure that the correct bit score threshold is used to distinguish strong hits from weak hits in the hmmer_filter() function, we've moved the code that obtains the current model's threshold inside the loop that iterates over each hit from the HMMER results. Bug 2: to ensure that a numerical comparison is used to identify the best match for a given gene in the best_match_selector() function, we've added explicit float conversions to the bit scores loaded from the intermediate file of HMM hits.
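For readers who want to see why both changes matter, here is a minimal, self-contained sketch. It is not the MicrobeAnnotator code: the functions `filter_hits` and `best_match`, the hit tuples, and the threshold values are all hypothetical and only mirror the logic described above. It relies on the fact that Python compares strings lexicographically, so a bit score read from a text file as `"95.0"` sorts above `"120.5"` unless it is converted to a number first.

```python
# Hypothetical hits: (gene, model, bit score as read from a text file).
hits = [
    ("gene1", "K00001", "120.5"),
    ("gene1", "K00002", "95.0"),
    ("gene2", "K00002", "30.0"),
]

# Hypothetical per-model bit score thresholds.
thresholds = {"K00001": 100.0, "K00002": 50.0}


def filter_hits(hits, thresholds):
    """Split hits into strong and weak using each hit's own model threshold."""
    strong, weak = [], []
    for gene, model, score in hits:
        # Look up the threshold inside the loop so it always matches
        # the model of the hit currently being examined.
        threshold = thresholds[model]
        if float(score) >= threshold:
            strong.append((gene, model, score))
        else:
            weak.append((gene, model, score))
    return strong, weak


def best_match(hits):
    """Return the highest-scoring (model, score) per gene, compared numerically."""
    best = {}
    for gene, model, score in hits:
        # Convert explicitly: comparing the raw strings would be lexicographic.
        score = float(score)
        if gene not in best or score > best[gene][1]:
            best[gene] = (model, score)
    return best


strong, weak = filter_hits(hits, thresholds)
print(strong, weak)
print(best_match(hits))
```

Without the `float()` call in `best_match`, the string comparison `"95.0" > "120.5"` evaluates to true, and the lower-scoring model would be picked for `gene1`.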

To confirm that these fixes resolve the bugs, we ran the code from this branch on our test genome from issue #94, Bradyrhizobium manausense BR3351 (NCBI RefSeq GCF_001440035.1). We analyzed the results with the scripts (also provided in issue #94) that count the number of false positives (from bug 1) and incorrect 'best matches' (from bug 2), respectively. In both cases, the number of errors with the fixed code was 0.