Open GwennyGit opened 2 years ago
Further required improvements:
In a recent discussion we decided that adding SBO terms merely from the local database might be insufficient due to future SBO term changes. I found a client tool that allows programmatic access to the SBO terms with Python. At the moment I have some problems using it and have opened an issue at the repository, but once that is resolved this seems to be a good way of keeping everything up-to-date and maybe even avoiding a local database in the future.
In the SBOannotator, some changes to the original code of Elisabeth Fritze were implemented by @NantiaL. It seems that updating this SBOAnn version in refineGEMs is necessary. I started by transferring all new functions. At the moment they are not active in the main function; activating them would be the next step. One thing seemed a bit odd to me: the new function
def handleMultipleECs(react, ECNums):
    # if no EC number annotated in model
    if len(ECNums) == 0:
        react.setSBOTerm('SBO:0000176')
    else:
        # store first digits of all annotated EC numbers
        lst = []
        for ec in ECNums:
            lst.append(ec.split(".")[0])
        # if ec numbers are from different enzyme classes, based on first digit
        # no ambiguous classification possible
        if len(set(lst)) > 1:
            react.setSBOTerm("SBO:0000176")  # metabolic rxn
        # if ec numbers are from the same enzyme classes,
        # assign parent SBO term based on first digit in EC number
        else:
            # Oxidoreductases
            if "1" in set(lst):
                react.setSBOTerm("SBO:0000200")
            # Transferase
            elif "2" in set(lst):
                react.setSBOTerm("SBO:0000402")
            # Hydrolases
            elif "3" in set(lst):
                react.setSBOTerm("SBO:0000376")
            # Lyases
            elif "4" in set(lst):
                react.setSBOTerm("SBO:0000211")
            # Isomerases
            elif "5" in set(lst):
                react.setSBOTerm("SBO:0000377")
            # Ligases, proper SBO is missing from graph --> use one for modification of covalent bonds
            elif "6" in set(lst):
                react.setSBOTerm("SBO:0000182")
            # Translocases
            elif "7" in set(lst):
                react.setSBOTerm("SBO:0000185")
            # Metabolic reactions
            else:
                react.setSBOTerm("SBO:0000176")
seems to be less specific than the removed functions like
def checkMethylationViaEC(reac):
    """tests if reac is methylation by its EC-Code and sets SBO Term if true

    Args:
        reac (libsbml-reaction): libsbml reaction from sbml model
    """
    if len(getECNums(reac)) == 1:
        if getECNums(reac)[0].startswith('2.1.1'):
            reac.setSBOTerm('SBO:0000214')


def checkTransaminationViaEC(reac):
    """tests if reac is transamination by its EC-Code and sets SBO Term if true

    Args:
        reac (libsbml-reaction): libsbml reaction from sbml model
    """
    if len(getECNums(reac)) == 1:
        if getECNums(reac)[0].startswith('2.6.1'):
            reac.setSBOTerm('SBO:0000403')
Since I wrote neither the new version nor the older functions, I think this needs to be discussed. Maybe @NantiaL can help with this!
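To make the discussion concrete, here is a rough sketch of how the specific checks and the class-level fallback could coexist in a single lookup. It is not code from SBOannotator or refineGEMs: the table layout and the names `EC_PREFIX_TO_SBO`, `sbo_for_ec` and `sbo_for_ecs` are made up for illustration, and only the prefix/term pairs are copied from the functions quoted above. It returns the term instead of calling `setSBOTerm`, so it can be tried without a model.

```python
# Prefix -> SBO term. Every pair is copied from the functions quoted above;
# the table itself and the function names are assumptions for illustration.
EC_PREFIX_TO_SBO = {
    "2.1.1": "SBO:0000214",  # from checkMethylationViaEC
    "2.6.1": "SBO:0000403",  # from checkTransaminationViaEC
    "1": "SBO:0000200",      # Oxidoreductases
    "2": "SBO:0000402",      # Transferases
    "3": "SBO:0000376",      # Hydrolases
    "4": "SBO:0000211",      # Lyases
    "5": "SBO:0000377",      # Isomerases
    "6": "SBO:0000182",      # Ligases
    "7": "SBO:0000185",      # Translocases
}
FALLBACK_SBO = "SBO:0000176"  # metabolic reaction


def sbo_for_ec(ec):
    """Return the SBO term of the longest matching EC prefix."""
    parts = ec.split(".")
    # Try "a.b.c.d", then "a.b.c", ... down to the class digit "a".
    for n in range(len(parts), 0, -1):
        term = EC_PREFIX_TO_SBO.get(".".join(parts[:n]))
        if term is not None:
            return term
    return FALLBACK_SBO


def sbo_for_ecs(ec_numbers):
    """Pick one SBO term for a reaction annotated with several EC numbers."""
    terms = {sbo_for_ec(ec) for ec in ec_numbers}
    if len(terms) == 1:
        return terms.pop()
    # The EC numbers disagree on the specific term (or there are none):
    # use the class-level term if they at least share the first digit.
    classes = {ec.split(".")[0] for ec in ec_numbers}
    if len(classes) == 1:
        return EC_PREFIX_TO_SBO.get(classes.pop(), FALLBACK_SBO)
    return FALLBACK_SBO
```

One deliberate difference from the removed functions: they only fired when a reaction had exactly one EC number, whereas this sketch also assigns the specific term when several EC numbers agree on it. Whether that is desirable is part of what needs to be discussed.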
There is also a new function in SBOAnnotator called `call_for_EC_annotation`, which automatically adds EC numbers to reactions with BiGG identifiers. That would also be a good addition to `polish`. We should probably discuss this in issue draeger-lab/refinegems#58.
We need to discuss this further. Merging the branch for now so that we can work on the `io` module.
Here is an implementation for an OBO parser from the BioPython project that could be used to check the relationships between SBO terms. The latest OBO file for SBO can be obtained from GitHub. Please take a look at a similar Java implementation for SBO.
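For orientation, checking relationships between terms mostly comes down to reading the `is_a` lines of each `[Term]` stanza and walking them upwards. The sketch below uses only the standard library and is neither the linked parser nor the Java implementation. The stanzas are made-up `TOY:` entries, not real SBO terms, and `read_is_a` / `is_descendant` are hypothetical names. It also ignores OBO details such as trailing `{...}` modifiers and obsolete terms.

```python
import io

# Made-up stanzas in OBO syntax; the IDs and names are NOT real SBO entries.
TOY_OBO = """\
format-version: 1.2

[Term]
id: TOY:0000001
name: reaction

[Term]
id: TOY:0000002
name: group transfer
is_a: TOY:0000001 ! reaction

[Term]
id: TOY:0000003
name: methyl transfer
is_a: TOY:0000002 ! group transfer

[Typedef]
id: part_of
"""


def read_is_a(handle):
    """Map each [Term] id to the set of its direct is_a parents."""
    parents = {}
    current = None
    in_term = False
    for raw in handle:
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are of interest.
            in_term = line == "[Term]"
            current = None
        elif in_term and line.startswith("id:"):
            current = line[3:].strip()
            parents.setdefault(current, set())
        elif in_term and current and line.startswith("is_a:"):
            # Drop the trailing "! name" comment.
            parents[current].add(line[5:].split("!")[0].strip())
    return parents


def is_descendant(term, ancestor, parents):
    """True if `ancestor` is reachable from `term` via is_a links."""
    stack = list(parents.get(term, ()))
    seen = set()
    while stack:
        node = stack.pop()
        if node == ancestor:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return False


parents = read_is_a(io.StringIO(TOY_OBO))
```

With the real SBO OBO file opened in place of `io.StringIO(TOY_OBO)`, `is_descendant` could be used to check whether a specific term is a child of a more generic one before replacing it.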
The extensions of the SBO database as mentioned in the Tasks for this issue should be added to the `databases` module, as this module from now on handles all database-related functions. (See issue draeger-lab/refinegems#49 for more details on `databases`.)
The client that @famosab mentioned in comment https://github.com/draeger-lab/SBOannotator/issues/1 can now be used as the issue was resolved. So for now we know how to get new SBO terms for `sboann`. The next step to keep `sboann` up-to-date would be to determine a way to automatically map identifiers to the SBO terms. For EC numbers that might be easier than for other identifiers like BiGG IDs. However, even for EC numbers it would be important to establish a mapping rule, e.g. use the first number in the EC number to assign the SBO term, or something similar.
Added a function in util.py that rewrites the well-annotated SBO terms into lower-tier terms that memote accepts, to "fix" the memote score.
Currently it only "fixes" biochemical reactions.
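For readers who have not looked at util.py, the idea is roughly the following. This is a sketch, not the actual function: `FakeReaction` is a stand-in that mimics the libsbml `Reaction` methods `getSBOTermID` and `setSBOTerm`, `coarsen_biochem_sbo` is a hypothetical name, and the set of "specific" terms is simply collected from the functions quoted earlier in this issue. Whether each of them really counts as a biochemical reaction, and which terms memote accepts, should be checked against SBO and memote.

```python
class FakeReaction:
    """Stand-in mimicking the two libsbml Reaction methods used below."""

    def __init__(self, sbo):
        self._sbo = sbo

    def getSBOTermID(self):
        return self._sbo

    def setSBOTerm(self, sbo):
        self._sbo = sbo


# Terms taken from the functions quoted earlier in this issue; treating all
# of them as biochemical reactions is an assumption of this sketch.
SPECIFIC_BIOCHEM_TERMS = {
    "SBO:0000200", "SBO:0000402", "SBO:0000376", "SBO:0000211",
    "SBO:0000377", "SBO:0000182", "SBO:0000214", "SBO:0000403",
}
GENERIC_BIOCHEM_TERM = "SBO:0000176"  # metabolic reaction


def coarsen_biochem_sbo(reactions):
    """Replace specific biochemical terms by the generic one; return the count."""
    changed = 0
    for reaction in reactions:
        if reaction.getSBOTermID() in SPECIFIC_BIOCHEM_TERMS:
            reaction.setSBOTerm(GENERIC_BIOCHEM_TERM)
            changed += 1
    return changed
```

Transport terms such as SBO:0000185 are left alone here, matching the note that only biochemical reactions are handled so far.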
Feature: The database in the program SBOannotator could be updated automatically to ensure that the user always gets the newest SBO annotations for their model(s).
Possible Implementation:
... INSERT INTO commands and corresponding rows) and automatically adds entries via INSERT INTO
(Function B) → This function should extract all SBO terms, get the corresponding BiGG and EC IDs and put this all together to get a new 'updated' data.db file.
... data.sql file (Function B) and update data.db (→ initialise_database())
Supplement: The SBO repository might help in accessing the new SBO terms.
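A minimal sketch of the database side of this, using only `sqlite3` from the standard library. Everything specific here is an assumption: the table `ec_to_sbo`, its columns, and the function names `rebuild_database` and `add_mappings` are invented for illustration and do not reflect the real schema of data.sql or the signature of `initialise_database()`. The two sample rows reuse EC prefix/term pairs quoted earlier in this issue.

```python
import sqlite3

# Hypothetical stand-in for data.sql; table and column names are invented.
DATA_SQL = """
CREATE TABLE IF NOT EXISTS ec_to_sbo (
    ec_prefix TEXT PRIMARY KEY,
    sbo_term  TEXT NOT NULL
);
INSERT OR REPLACE INTO ec_to_sbo VALUES ('2.1.1', 'SBO:0000214');
INSERT OR REPLACE INTO ec_to_sbo VALUES ('2.6.1', 'SBO:0000403');
"""


def rebuild_database(db_path, sql_text):
    """Run an SQL script against db_path and return the open connection."""
    con = sqlite3.connect(db_path)
    con.executescript(sql_text)
    con.commit()
    return con


def add_mappings(con, rows):
    """Insert (ec_prefix, sbo_term) pairs, overwriting outdated entries."""
    with con:  # commits on success, rolls back on error
        con.executemany(
            "INSERT OR REPLACE INTO ec_to_sbo (ec_prefix, sbo_term) VALUES (?, ?)",
            rows,
        )


# ":memory:" keeps the example free of side effects; a file path such as
# "data.db" would be used for the real database.
con = rebuild_database(":memory:", DATA_SQL)
add_mappings(con, [("1", "SBO:0000200")])
```

Using `INSERT OR REPLACE` keyed on the identifier means the update can be re-run whenever new SBO terms are fetched without creating duplicate rows.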