cruizperez / MicrobeAnnotator

Pipeline for metabolic annotation of microbial genomes
Artistic License 2.0

[Question] How are modules determined to be bifurcating or not? #97

Open jolespin opened 4 months ago

jolespin commented 4 months ago

It makes a lot of sense for this module to be categorized as bifurcating: M00075 (image)

However, this module is also in the bifurcating list even though it seems fairly linear: M00832 (image)

Is there a rule-based scheme used to determine module categorization (e.g., regular, bifurcating, or structural)?

I would like to update it with the 481 modules here: https://www.genome.jp/kegg/docs/module_statistics.html

Also, a disclaimer: I forked and modified your module completion ratio calculations in my VEBA package (with proper citations, of course!): https://github.com/jolespin/veba/blob/9532bee1a5e57dd0968e2e04b21abcaede7fcccf/bin/scripts/module_completion_ratios.py#L4

Prefixed with this in the script:

#!/usr/bin/env python

"""
# Original Source: 
# https://github.com/cruizperez/MicrobeAnnotator/blob/master/microbeannotator/pipeline/ko_mapper.py
# Forked on 2023.10.18

If you use this software, please cite the original publication: 
    Ruiz-Perez, C.A., Conrad, R.E. & Konstantinidis, K.T. 
    MicrobeAnnotator: a user-friendly, comprehensive functional annotation pipeline for microbial genomes. 
    BMC Bioinformatics 22, 11 (2021). https://doi.org/10.1186/s12859-020-03940-5

########################################################################
# Author:       Carlos A. Ruiz Perez
# Email:        cruizperez3@gatech.edu
# Intitution:   Georgia Institute of Technology
# Version:      1.0.0
# Date:         Nov 13, 2020

# Description: Maps protein KO information with their respective modules
# and calculates the completeness percentage of each module present.
########################################################################
"""

################################################################################
"""---0.0 Import Modules---"""

Also in the help menu: Josh L. Espinoza's fork from MicrobeAnnotator. Please cite the following: https://doi.org/10.1186/s12859-020-03940-5

And, of course, in the publication, which should be accepted soon: https://www.biorxiv.org/content/10.1101/2024.03.08.583560v2

I'd be happy to continue development on the KEGG module completion but would need some guidance on how you determine the module categories.
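
To make the question concrete, here is a minimal sketch of why the category matters for the completion ratio. It is not MicrobeAnnotator's actual code, and the rules are assumptions for illustration only: a "regular" module is treated as an ordered list of steps where each step is a set of interchangeable KOs, and a "bifurcating" module is treated as several alternative branches scored by the best one. The function names and the KO identifiers in the toy data are hypothetical.

```python
from typing import Iterable, List, Set

Steps = List[Set[str]]  # each step is a set of alternative KOs (assumed layout)


def linear_module_completeness(steps: Steps, genome_kos: Iterable[str]) -> float:
    """Percentage of steps with at least one of their alternative KOs present."""
    present = set(genome_kos)
    if not steps:
        return 0.0
    satisfied = sum(1 for alternatives in steps if alternatives & present)
    return 100.0 * satisfied / len(steps)


def bifurcating_module_completeness(branches: List[Steps],
                                    genome_kos: Iterable[str]) -> float:
    """Score each branch as a linear module and keep the best (assumed rule)."""
    present = set(genome_kos)
    return max(
        (linear_module_completeness(branch, present) for branch in branches),
        default=0.0,
    )


# Toy data with placeholder KO identifiers, not a real KEGG module definition.
toy_steps = [{"K00001", "K00002"}, {"K00003"}, {"K00004"}, {"K00005"}]
toy_branches = [
    [{"K00001"}, {"K00003"}],  # branch A
    [{"K00002"}, {"K00004"}],  # branch B
]
toy_genome = {"K00002", "K00004", "K99999"}

print(linear_module_completeness(toy_steps, toy_genome))         # 50.0
print(bifurcating_module_completeness(toy_branches, toy_genome)) # 100.0
```

Under these assumptions the same KO set can score very differently depending on the category assigned to a module, which is why a clear rule for categorizing the newer modules would help.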