SysBioChalmers / Human-GEM

The generic genome-scale metabolic model of Homo sapiens
https://sysbiochalmers.github.io/Human-GEM-guide/
Creative Commons Attribution 4.0 International

Accessing protein isoforms within Human1 #804

Closed nnursimulu closed 8 months ago

nnursimulu commented 8 months ago

Dear Human-GEM Team,

I am interested in incorporating information about protein isoforms into Human1. Could you please tell me how I can access the links between reactions and protein isoforms?

Thank you very much.

Best, Nirvana

haowang-bioinfo commented 8 months ago

Dear @nnursimulu

Thanks for your interest in Human-GEM. Isoform information has not been included in the model so far.

Could you please specify your intention: do you want to add isoform information to Human-GEM, or to access it?

Best wishes Hao

nnursimulu commented 8 months ago

Dear Hao,

Thank you for your response. I am looking at the Human1 2020 publication in Science Signaling, which mentions that transcript- and protein-reaction rules are available to facilitate integration of such information, drawing on "Framework and resource for more than 11,000 gene transcript-protein-reaction associations in human metabolism" (Ryu et al., PNAS 2017). The latter paper appears to account for different protein isoforms.

Could you please clarify how the Human GEMs are contextualized when there are isoforms?

Thank you very much.

Best regards, Nirvana

JonathanRob commented 8 months ago

Great question, Nirvana.

To enable integration with different protein isoforms (or transcript splice variants), we have provided columns in the genes.tsv file that map each gene in the model to its corresponding proteins (ENSP IDs) and transcripts (ENST IDs).
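As a rough illustration of how such a gene-to-protein lookup could be built, here is a minimal Python sketch. It is not part of the Human-GEM repository. The column names (`genes`, `geneENSTID`, `geneENSPID`), the `;` separator between multiple IDs, and the helper `load_gene_map` are all assumptions made for illustration, and the table is a made-up excerpt, so check the actual genes.tsv header before reusing it.

```python
import csv
import io

# Hypothetical excerpt standing in for genes.tsv.
# Column names and the ";" separator are assumptions, not confirmed here.
GENES_TSV = (
    "genes\tgeneENSTID\tgeneENSPID\n"
    "geneA\ttxA1;txA2\tproteinA1;proteinA2\n"
    "geneB\ttxB1;txB2;txB3\tproteinB1;proteinB2;proteinB3\n"
)


def load_gene_map(handle, gene_col="genes", target_col="geneENSPID", sep=";"):
    """Return {gene ID: [target IDs]} read from a tab-separated table."""
    mapping = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        # Split the multi-valued cell and drop empty entries.
        targets = [t.strip() for t in row[target_col].split(sep) if t.strip()]
        mapping[row[gene_col]] = targets
    return mapping


gene_to_protein = load_gene_map(io.StringIO(GENES_TSV))
print(gene_to_protein["geneA"])  # ['proteinA1', 'proteinA2']
```

Passing `target_col="geneENSTID"` would give the analogous gene-to-transcript lookup under the same assumptions.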

The model contains a mapping from reactions to genes in the grRules field. By default, the genes are given as Ensembl gene IDs. One can translate the model to a different ID type (e.g., ENSP IDs) using the translateGrRules function in this repo. This converts the genes and grRules fields so that the model uses the new target identifier. The model can then be integrated with data of the target ID type in a GEM contextualization algorithm (e.g., tINIT) to determine which reactions to keep based on the gene or protein expression levels.

Note that the translateGrRules function converts reaction-gene rules by treating isoforms as isozymes. For example, if geneA maps to proteinA1 and proteinA2, and geneB maps to proteinB1, proteinB2, and proteinB3, then the grRule below would be translated as follows:

Original: geneA AND geneB
Translated: (proteinA1 OR proteinA2) AND (proteinB1 OR proteinB2 OR proteinB3)
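The isoforms-as-isozymes substitution described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the repository's translateGrRules implementation. The function `translate_rule`, the mapping dictionary, and the choice to leave unmapped genes unchanged are assumptions of this sketch.

```python
import re

# Hypothetical gene-to-protein mapping taken from the example above.
gene_to_protein = {
    "geneA": ["proteinA1", "proteinA2"],
    "geneB": ["proteinB1", "proteinB2", "proteinB3"],
}


def translate_rule(rule, mapping):
    """Replace each gene ID in a rule with an OR-group of its mapped IDs."""

    def expand(match):
        token = match.group(0)
        if token.lower() in ("and", "or"):
            return token  # keep logical operators as they are
        targets = mapping.get(token, [])
        if not targets:
            return token  # sketch-specific choice: leave unmapped genes unchanged
        if len(targets) == 1:
            return targets[0]  # a single isoform needs no parentheses
        return "(" + " OR ".join(targets) + ")"

    # Identifiers are runs of word characters or dots; parentheses and
    # whitespace in the rule are not matched and so pass through untouched.
    return re.sub(r"[\w.]+", expand, rule)


print(translate_rule("geneA AND geneB", gene_to_protein))
# (proteinA1 OR proteinA2) AND (proteinB1 OR proteinB2 OR proteinB3)
```

Because every isoform of a gene is joined by OR, the translated rule is satisfied as soon as any one isoform of each required gene is present, which is what treating isoforms as isozymes means here.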

Hopefully this helps clarify somewhat!

nnursimulu commented 8 months ago

Thank you very much for clarifying!