SysBioChalmers / yeast-GEM

The consensus GEM for Saccharomyces cerevisiae
http://sysbiochalmers.github.io/yeast-GEM/
Creative Commons Attribution 4.0 International

Feat-stoichiometry coefficients for subunits in the enzyme complex #297

Open feiranl opened 2 years ago

feiranl commented 2 years ago

Description of the issue:

hongzhonglu commented 2 years ago

Maybe store this information in a TSV file first? @feiranl @edkerk

edkerk commented 2 years ago

That's a good idea. It's not straightforward to decide where this information should be stored. In the long run it should ideally be kept in the model file, and not just as a separate TSV. But this raises three issues:

This is not of the highest priority, but it would be good to start a discussion on how to overcome these issues, especially the first two.

hongzhonglu commented 2 years ago

Once we have the curated datasets, could we put them under this folder: https://github.com/SysBioChalmers/yeast-GEM/tree/main/data/databases?
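To make the TSV suggestion concrete, here is a minimal sketch of what such a file and a reader for it might look like. The column names (`complex_id`, `gene`, `copies`) and all identifiers and values are hypothetical placeholders, not curated data and not an agreed format.

```python
import csv
import io
from collections import defaultdict

# Hypothetical TSV content: column names, identifiers and copy numbers are
# placeholders for illustration only, not curated data.
EXAMPLE_TSV = (
    "complex_id\tgene\tcopies\n"
    "complexA\tgeneA\t2\n"
    "complexA\tgeneB\t1\n"
    "complexB\tgeneC\t4\n"
)


def read_subunit_stoichiometry(handle):
    """Return {complex_id: {gene: copies}} from a tab-separated file handle."""
    table = defaultdict(dict)
    for row in csv.DictReader(handle, delimiter="\t"):
        table[row["complex_id"]][row["gene"]] = int(row["copies"])
    return dict(table)


# io.StringIO stands in for an open file under data/databases.
stoichiometry = read_subunit_stoichiometry(io.StringIO(EXAMPLE_TSV))
print(stoichiometry["complexA"])
```

One row per complex-subunit pair keeps the file easy to diff in version control, which may matter if the data is curated over time.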

mihai-sysbio commented 2 years ago
  • Stoichiometry coefficients are missing in current GEM but available in Complex Portal and PDB database.

I'm really unsure which coefficients this discussion is about. Could you give an example?

edkerk commented 2 years ago

The number of copies of each protein subunit that make up a protein complex. It is not always just 1 copy of each subunit. See for instance pyruvate dehydrogenase: https://www.ebi.ac.uk/complexportal/complex/CPX-3207

This would for instance make a difference if the model would be turned into an ec-model, but there are likely also other use cases.

mihai-sysbio commented 2 years ago

This would for instance make a difference if the model would be turned into an ec-model, but there are likely also other use cases.

Thanks for the example @edkerk. Wouldn't this then be better handled by a future version of GECKO instead? I think it would make more sense for this information to follow the same structure regardless of the model.

edkerk commented 2 years ago

But that only holds if GECKO is the only purpose. GECKO should be modified to be able to deal with such information anyway (it currently assumes 1 copy per subunit), but should that information only be provided in GECKO, or distributed as part of the generic yeast-GEM, also ready for other applications?
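As a rough illustration of why the copy number matters, the sketch below compares the total mass of a complex under the 1-copy-per-subunit assumption with the mass when copy numbers are supplied. This is not GECKO code; the function name, subunit names, molecular weights, and copy numbers are all hypothetical.

```python
def complex_molecular_weight(subunit_mw, copies=None):
    """Sum subunit molecular weights (kDa), weighted by copy number.

    Subunits missing from `copies` are counted once, which mirrors the
    1-copy-per-subunit assumption mentioned above.
    """
    copies = copies or {}
    return sum(mw * copies.get(gene, 1) for gene, mw in subunit_mw.items())


# Hypothetical values for illustration only.
subunit_mw = {"geneA": 50.0, "geneB": 30.0}
copies = {"geneA": 2, "geneB": 4}

mw_one_copy = complex_molecular_weight(subunit_mw)
mw_with_copies = complex_molecular_weight(subunit_mw, copies)
print(mw_one_copy, mw_with_copies)
```

With these placeholder numbers the complex is 80 kDa under the 1-copy assumption and 220 kDa with the copy numbers applied, so any calculation that scales with enzyme mass would change accordingly.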

hongzhonglu commented 2 years ago

generic yeast-GEM, also ready for other applications

It seems there are no direct applications of this information in yeast-GEM?

mihai-sysbio commented 2 years ago

I'm very hesitant to "copy" external data in any repository, unless it would be useful for a good chunk of the userbase.

also ready for other applications?

I see (but I don't know how often this would be the case). Perhaps what can be stored in this repository is a script that fetches the data via an API, and maybe does some reformatting, but not the data itself, since the data will get stale by default and require work to keep up to date.
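The reformatting step of such a script could look roughly like the sketch below. It assumes, without having verified it against the current Complex Portal export, that participants arrive as a single string of the form `identifier(copies)|identifier(copies)` and that a count of 0 means the stoichiometry is unknown. The function name and the identifiers are hypothetical, and the fetching step is left out.

```python
import re

# Assumed token shape: an identifier followed by a copy number in parentheses.
_PARTICIPANT = re.compile(r"^(?P<id>[^()|]+)\((?P<copies>\d+)\)$")


def parse_participants(field):
    """Parse 'idA(2)|idB(1)' into {'idA': 2, 'idB': 1}.

    A count of 0 is mapped to None, on the assumption that 0 marks an
    unknown stoichiometry rather than an absent subunit.
    """
    result = {}
    for token in field.split("|"):
        match = _PARTICIPANT.match(token.strip())
        if match is None:
            raise ValueError(f"Unrecognised participant token: {token!r}")
        count = int(match["copies"])
        result[match["id"]] = count if count > 0 else None
    return result


# Hypothetical identifiers for illustration only.
parsed = parse_participants("proteinA(2)|proteinB(1)|proteinC(0)")
print(parsed)
```

Raising on unrecognised tokens, rather than skipping them, makes upstream format changes visible as soon as the script is re-run.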

feiranl commented 2 years ago

I agree with Mihail. But here it is not just a "copy". We get the stoichiometry information from Complex Portal and from the PDB database. The script extracts protein structures from the PDB database, maps the subunits through sequence alignment, and then determines the stoichiometry.