SysBioChalmers / Human-GEM

The generic genome-scale metabolic model of Homo sapiens
https://sysbiochalmers.github.io/Human-GEM-guide/
Creative Commons Attribution 4.0 International

Difference Between YAML and SBML Files #681

Devlin-Moyer closed 1 year ago

Devlin-Moyer commented 1 year ago

MAR01044 has two subsystems ("Heme synthesis" and "Porphyrin metabolism") in model/Human-GEM.yml (on both the main and develop branches), but only one subsystem ("Porphyrin metabolism") in model/Human-GEM.xml. It appears to be the only reaction that currently has two subsystems, and it appears to have had both since July 2021. I only noticed it now because I got the latest model/Human-GEM.yml from the develop branch, tried saving it as a .mat file using Cobrapy, and got an error about reaction subsystems being lists.
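A discrepancy like this could be checked for mechanically. The following is a minimal sketch, assuming the subsystem annotations from each file have already been extracted into plain dictionaries keyed by reaction ID. The function name `subsystem_mismatches`, the dictionaries, and the `MAR_EXAMPLE` entry are hypothetical stand-ins for illustration and are not part of Human-GEM or Cobrapy.

```python
def subsystem_mismatches(yaml_subsystems, sbml_subsystems):
    """Return {reaction_id: (yaml_set, sbml_set)} for reactions whose
    subsystem annotations differ between the two inputs.

    Values may be a string, a list of strings, or None/"" (no subsystem).
    """
    def as_set(value):
        # Normalise every representation to a frozenset of names.
        if value is None or value == "":
            return frozenset()
        if isinstance(value, str):
            return frozenset([value])
        return frozenset(value)

    mismatches = {}
    for rxn_id in sorted(set(yaml_subsystems) | set(sbml_subsystems)):
        from_yaml = as_set(yaml_subsystems.get(rxn_id))
        from_sbml = as_set(sbml_subsystems.get(rxn_id))
        if from_yaml != from_sbml:
            mismatches[rxn_id] = (from_yaml, from_sbml)
    return mismatches


# Hypothetical extracted annotations; only MAR01044 mirrors the case above.
yaml_view = {
    "MAR01044": ["Heme synthesis", "Porphyrin metabolism"],
    "MAR_EXAMPLE": ["Example subsystem"],  # made-up entry
}
sbml_view = {
    "MAR01044": "Porphyrin metabolism",
    "MAR_EXAMPLE": "Example subsystem",  # made-up entry
}

result = subsystem_mismatches(yaml_view, sbml_view)
for rxn_id, (from_yaml, from_sbml) in result.items():
    print(rxn_id, sorted(from_yaml), sorted(from_sbml))
```

Comparing sets rather than raw values means a one-element list and a bare string count as equal, so only real differences in content are reported.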

It is very easy for me to work around this, but since MAR01044 appears to be the only reaction with two subsystems, and Cobrapy seems to assume (at least sometimes) that all reactions have exactly one subsystem, I figured y'all might want to change it.

Code to reproduce aforementioned error:

import cobra
sbml_model = cobra.io.read_sbml_model('model/Human-GEM.xml')
[r for r in sbml_model.reactions if len(r.subsystem) != 1] # empty list
cobra.io.save_matlab_model(sbml_model, 'sbml_test.mat') # works

yaml_model = cobra.io.load_yaml_model('model/Human-GEM.yml')
[r for r in yaml_model.reactions if len(r.subsystem) != 1] # list containing the cobra.Reaction object for MAR01044 and no others
cobra.io.save_matlab_model(yaml_model, 'yaml_test.mat') # TypeError: unhashable type: 'list'

Devlin-Moyer commented 1 year ago

On further investigation, it looks like Cobrapy always makes reaction.subsystem a list when reading a YAML file, even if every reaction has only one subsystem. So the failure to write the model as a .mat file is strictly a Cobrapy issue, but the discrepancy between the Human-GEM file formats is still a problem independently of that.
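One possible workaround on the Cobrapy side is to collapse list-valued subsystems into single strings before exporting. The following is a minimal sketch, not a confirmed fix. It assumes that each reaction object exposes `id` and a reassignable `subsystem` attribute. The helper name `flatten_subsystems`, the separator, and the `MAR_EXAMPLE` entry are hypothetical, and `SimpleNamespace` objects stand in for cobra.Reaction so that the block runs without Cobrapy installed.

```python
from types import SimpleNamespace


def flatten_subsystems(reactions, sep="; "):
    """Replace list-valued `subsystem` attributes with one joined string.

    Returns the IDs of reactions that had more than one subsystem, so the
    information lost by joining can be reviewed afterwards.
    """
    multi = []
    for rxn in reactions:
        subsystem = rxn.subsystem
        if isinstance(subsystem, (list, tuple)):
            if len(subsystem) > 1:
                multi.append(rxn.id)
            rxn.subsystem = sep.join(subsystem)
    return multi


# Stand-in reaction objects (assumption: mimics what the YAML reader yields).
demo_reactions = [
    SimpleNamespace(id="MAR01044",
                    subsystem=["Heme synthesis", "Porphyrin metabolism"]),
    SimpleNamespace(id="MAR_EXAMPLE",  # made-up entry
                    subsystem=["Example subsystem"]),
]

multi_ids = flatten_subsystems(demo_reactions)
print(multi_ids)
print([rxn.subsystem for rxn in demo_reactions])
```

With a real model this would presumably be called as flatten_subsystems(yaml_model.reactions) before cobra.io.save_matlab_model. Whether that avoids the TypeError depends on the Cobrapy version and is not confirmed in this thread.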

mihai-sysbio commented 1 year ago

As you correctly note @Devlin-Moyer, MAR01044 is a special case. Before implementing a fix, it would be good to revisit #356, where the idea of associating a reaction with multiple subsystems was discussed.

haowang-bioinfo commented 1 year ago

fixed in #703