SysBioChalmers / yeast-GEM

The consensus GEM for Saccharomyces cerevisiae
http://sysbiochalmers.github.io/yeast-GEM/
Creative Commons Attribution 4.0 International

KEGG reaction annotations #205

Closed by handancetin 4 years ago

handancetin commented 4 years ago

Description of the issue:

While checking the reactions with a confidence score lower than 3, I found that some reactions annotated with KEGG IDs do not match the corresponding entries in the KEGG Reactions database. The list is below; I separated the items with lines for readability.
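
The selection step described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: the record layout (`id`, `kegg_id`, `confidence`), the function name, and the example values are all assumptions made up for the sketch, not data from yeast-GEM.

```python
# Hypothetical reaction records; field names and values are illustrative only.
example_reactions = [
    {"id": "rxn_a", "kegg_id": "R00001", "confidence": 2},
    {"id": "rxn_b", "kegg_id": "", "confidence": 1},       # no KEGG annotation
    {"id": "rxn_c", "kegg_id": "R00002", "confidence": 3},  # score not below 3
    {"id": "rxn_d", "kegg_id": "R00003", "confidence": 0},
]


def low_confidence_with_kegg(reactions, threshold=3):
    """Return IDs of reactions that have a KEGG ID and a score below threshold."""
    selected = []
    for rxn in reactions:
        has_kegg = bool(rxn.get("kegg_id"))
        score = rxn.get("confidence")
        if has_kegg and score is not None and score < threshold:
            selected.append(rxn["id"])
    return selected


to_review = low_confidence_with_kegg(example_reactions)
```

Each ID in `to_review` would then be compared by hand against its KEGG entry.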

Please note that I am not 100% sure about my suggestions. Please treat the list below as questions.

List of Reactions:

I checked other databases for this reaction and found that MetaCyc and ModelSEED agree with KEGG. On the other hand, UniProt says the enzyme has a binding site for ATP, and another model in BiGG (iYL1228) includes ATP among the substrates. I wonder whether you added ATP to the substrates to create a demand for it, or how else you decided.


Note that the reactions are written in opposite directions. As with the previous one, different databases give different reactions: according to MetaCyc, the reaction should include ATP, whereas ModelSEED and BiGG say the opposite.


The reaction in the model is annotated with the wrong KEGG ID. It should be R00028 (see below), which has the same reaction and enzyme as the model.


The annotated enzyme has a page in KEGG; however, the reactions are not exactly the same.


r_2116 should be annotated with R00710 (reaction below), the same as r_0174 (the mitochondrial version of r_2116).


Both of these reactions have the KEGG ID R05725 (reaction below). They have the same enzyme and gene annotations. I believe the first one is an unintentional duplicate with a wrong metabolite.


Since Ferricytochrome b5 exists among the metabolites (used in reactions r_4196 and r_4197), and judging from the lower ID of r_0241, NADP should perhaps have been changed to Ferricytochrome b5. I am not sure about this reaction specifically, but it may have been overlooked when the metabolite was added to the model.


A finishing note:

While having a KEGG ID for each reaction is a big plus for fetching data automatically, these small differences can easily be overlooked as the model grows. I plan to check the reactions with a confidence score of 3 individually, so I will let you know soon. Sorry if the lines create a mess; my intentions were good.
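
As a rough sketch of how such differences could be flagged automatically, the snippet below compares two reaction equations written with KEGG compound IDs and reports whether they are the same, reversed, or different. It assumes equations in the form `C00001 + C00002 <=> C00008 + C00009`, ignores stoichiometric coefficients, and uses hypothetical function names; it is not part of yeast-GEM or of any KEGG tooling.

```python
def parse_equation(equation):
    """Split 'A + B <=> C + D' into two frozensets of compound IDs.

    Leading coefficients such as '2 C00001' are dropped (assumption:
    only the participant sets matter for this check).
    """
    left, right = equation.split("<=>")

    def side(text):
        ids = set()
        for term in text.split(" + "):
            tokens = term.split()
            if tokens:
                ids.add(tokens[-1])  # last token is the compound ID
        return frozenset(ids)

    return side(left), side(right)


def compare_equations(model_eq, kegg_eq):
    """Return 'same', 'reversed', or 'different' for two equations."""
    model_sides = parse_equation(model_eq)
    kegg_sides = parse_equation(kegg_eq)
    if model_sides == kegg_sides:
        return "same"
    if model_sides == (kegg_sides[1], kegg_sides[0]):
        return "reversed"
    return "different"


# Illustrative equations only (ATP hydrolysis written with KEGG compound IDs).
forward = "C00002 + C00001 <=> C00008 + C00009"
backward = "C00008 + C00009 <=> C00001 + C00002"
no_atp = "C00001 <=> C00009"
```

A "reversed" result corresponds to the opposite-direction case noted in the list, while "different" flags reactions whose participants disagree and need a manual look.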

I hope these can help.

Best regards, Handan


hongzhonglu commented 4 years ago

@HnCetin Thanks a lot. Very nice suggestions! It would be good to check all reactions with a KEGG ID in the current model to improve its quality. Do you plan to check reactions with other confidence scores?

handancetin commented 4 years ago

@hongzhonglu I plan to check reactions with KEGG IDs for other confidence scores, but I am not sure it will be helpful. Unless I have a way to decide which reactions to include in the model (for the reactions where different databases show different substrates), I probably will not continue. So far, I have been searching the literature on the reactions or enzymes for their binding sites. If you have any other suggestions, I will be happy to take them.

hongzhonglu commented 4 years ago

@HnCetin A reaction may be written with different formulas because the related enzyme could have multiple functions. In this case, I think we should refer to the gene function annotation in UniProt or SGD.

BenjaSanchez commented 4 years ago

@eiden309 have all issues here been addressed after merging #220?

eiden309 commented 4 years ago

@BenjaSanchez only issues related to KEGG annotations (r_0198, r_0911, r_4245, r_2116, r_4254 and r_4255) were resolved in #220.

These are the changes made after reviewing the comments by @HnCetin and discussing with @edkerk and @feiranl (I will include these in #220 as well):

| Reactions | Notes |
| --- | --- |
| r_0198 | Edited as suggested - changed model.rxnKEGGID to R00028 |
| r_0911 | With reference to module M00048, there is another pathway R07404 and R07405 which requires ATP as a substrate. R04209 has been removed from r_0911 since it is an incorrect reaction annotation. However, this remains an issue as there is no way of incorporating 2 rxnKEGGIDs (R07404 and R07405) into the model |
| r_4245 | R09645 removed from the model since it cannot be found in KEGG, as mentioned. |
| r_2116 | Edited as suggested - changed model.rxnKEGGID to R00710 |
| r_4254 | Changed model.rxnKEGGID to R05724 and model.rxnName to nitric oxide, NADH2:oxygen oxidoreductase |
| r_4255 | Changed model.rxnName to nitric oxide, NADPH2:oxygen oxidoreductase |
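
On the r_0911 note about two KEGG IDs for one reaction: one conceivable workaround, sketched here purely as an assumption and not as something the model or its toolboxes are known to support, is to keep several IDs in a single delimited string and split them when reading. The separator and both helper names are hypothetical.

```python
def split_kegg_ids(field, sep=";"):
    """Split a delimited annotation string into a list of KEGG reaction IDs."""
    return [part.strip() for part in field.split(sep) if part.strip()]


def join_kegg_ids(ids, sep=";"):
    """Join KEGG reaction IDs into one string, dropping duplicates in order."""
    return sep.join(dict.fromkeys(ids))


# Hypothetical combined annotation for a reaction covered by two KEGG entries.
combined = join_kegg_ids(["R07404", "R07405", "R07404"])
ids = split_kegg_ids(combined)
```

Whether downstream tools would accept such a field is untested here; any script that expects exactly one ID per reaction would need the same splitting step.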

The remaining reactions were reviewed but have not been modified (I will amend these next week):

| Reactions | Notes |
| --- | --- |
| r_0774 and r_0775 | Should include ATP/ADP in the reactions since MNXR101909 and P39683 have also annotated reactions with ATP/ADP |
| r_0241 | Should edit as suggested - change NADPH/NADP(+) to Ferrocytochrome b5/Ferricytochrome b5 |