#!/usr/bin/env python
"""
# Original Source:
# https://github.com/cruizperez/MicrobeAnnotator/blob/master/microbeannotator/pipeline/ko_mapper.py
# Forked on 2023.10.18
If you use this software, please cite the original publication:
Ruiz-Perez, C.A., Conrad, R.E. & Konstantinidis, K.T.
MicrobeAnnotator: a user-friendly, comprehensive functional annotation pipeline for microbial genomes.
BMC Bioinformatics 22, 11 (2021). https://doi.org/10.1186/s12859-020-03940-5
########################################################################
# Author: Carlos A. Ruiz Perez
# Email: cruizperez3@gatech.edu
# Institution: Georgia Institute of Technology
# Version: 1.0.0
# Date: Nov 13, 2020
# Description: Maps protein KO information with their respective modules
# and calculates the completeness percentage of each module present.
########################################################################
"""
################################################################################
"""---0.0 Import Modules---"""
It makes sense for this module to be categorized as bifurcating: M00075
However, this module is also in the bifurcating list even though it seems fairly linear: M00832
Is there a rule-based scheme used to determine module categorization (e.g., regular, bifurcating, or structural)?
I would like to update the categorization with the 481 modules listed here: https://www.genome.jp/kegg/docs/module_statistics.html
Also, a disclaimer: I forked and modified your module completion ratio calculations in my VEBA package (with proper citations, of course!): https://github.com/jolespin/veba/blob/9532bee1a5e57dd0968e2e04b21abcaede7fcccf/bin/scripts/module_completion_ratios.py#L4
Prefixed with this in the script:
Also in the help menu:
Josh L. Espinoza's fork from MicrobeAnnotator. Please cite the following: https://doi.org/10.1186/s12859-020-03940-5
And of course in the publication which should be accepted soon: https://www.biorxiv.org/content/10.1101/2024.03.08.583560v2
I'd be happy to continue development on the KEGG module completion but would need some guidance on how you determine the module categories.
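To make the question concrete, here is a minimal sketch of why the category matters for the completion ratio. It assumes a regular module is scored as the percentage of steps with at least one KO present, and a bifurcating module as the score of its best branch. That scoring rule is an assumption for illustration and may not match what `ko_mapper.py` actually does. The function names are hypothetical, and the KO identifiers are arbitrary placeholders rather than the real definitions of M00075 or M00832.

```python
from typing import List, Set

# A step is a set of alternative KOs; observing any one of them satisfies the step.
Step = Set[str]


def linear_completeness(steps: List[Step], observed: Set[str]) -> float:
    """Percentage of steps that have at least one observed KO (assumed rule)."""
    if not steps:
        return 0.0
    hits = sum(1 for step in steps if step & observed)
    return 100.0 * hits / len(steps)


def bifurcating_completeness(branches: List[List[Step]], observed: Set[str]) -> float:
    """Score every branch as a linear path and keep the best one (assumed rule)."""
    return max((linear_completeness(b, observed) for b in branches), default=0.0)


# Placeholder data: these KOs do not describe any real KEGG module.
observed = {"K00001", "K00003"}
linear = [{"K00001", "K00002"}, {"K00003"}, {"K00004"}]
branches = [linear, [{"K00001"}, {"K00003"}]]

print(linear_completeness(linear, observed))         # 2 of 3 steps satisfied
print(bifurcating_completeness(branches, observed))  # second branch fully satisfied
```

Under these assumptions the same KO set scores differently depending on the category a module is assigned to, which is why a module like M00832 being listed as bifurcating changes its reported completeness.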