opencobra / cobrapy

COBRApy is a package for constraint-based modeling of metabolic networks.
http://opencobra.github.io/cobrapy/
GNU General Public License v2.0

ModelSEED SBML imported model has zero genes #896

Closed: RRohani closed this issue 5 years ago

RRohani commented 5 years ago

Apologies if this is actually a trivial error on my part - it's my first time playing around with COBRA / metabolic models.

I am using ModelSEED to generate an SBML model of Vibrio natriegens and then importing it with COBRApy. According to the ModelSEED website, the model has 923 genes, but the imported model has 0. I suspect an issue with how the model is encoded or (less likely) with how the file is parsed. I have looked through the source code but haven't been able to pin down the problem.

My code

import cobra
model = cobra.io.read_sbml_model('1219067.16.smbl')
print(len(model.genes))

This prints 0.

Example reaction with a gene association:

<reaction id="R_rxn05221_c0" name="Na_plus_Proline_L_symporter_c0" reversible="true">
<notes>
<html:p>GENE_ASSOCIATION:(fig|1219067.16.peg.3176 or fig|1219067.16.peg.3854 or fig|1219067.16.peg.3103)</html:p>
<html:p>PROTEIN_ASSOCIATION:(fig|1219067.16.peg.3176 or fig|1219067.16.peg.3854 or fig|1219067.16.peg.3103)</html:p>
</notes>
<listOfReactants>
<speciesReference species="M_cpd00129_e0" stoichiometry="1"/>
<speciesReference species="M_cpd00971_e0" stoichiometry="1"/>
</listOfReactants>
<listOfProducts>
<speciesReference species="M_cpd00129_c0" stoichiometry="1"/>
<speciesReference species="M_cpd00971_c0" stoichiometry="1"/>
</listOfProducts>
<kineticLaw>
    <math xmlns="http://www.w3.org/1998/Math/MathML">
            <ci> FLUX_VALUE </ci>
    </math>
    <listOfParameters>
        <parameter id="LOWER_BOUND" value="-1000" name="mmol_per_gDW_per_hr"/>
        <parameter id="UPPER_BOUND" value="1000" name="mmol_per_gDW_per_hr"/>
        <parameter id="OBJECTIVE_COEFFICIENT" value="0"/>
        <parameter id="FLUX_VALUE" value="0.0" name="mmol_per_gDW_per_hr"/>
    </listOfParameters>
</kineticLaw>
</reaction>

Many thanks - and again sorry if it's my mistake

:-)

Midnighter commented 5 years ago

Welcome to cobrapy :slightly_smiling_face: It seems that the exported SBML uses a rather outdated encoding. Gene associations should no longer be stored in the <notes>, and the flux bounds and objective coefficient are stored as parameters in a <kineticLaw>. Both should be encoded using the SBML fbc package these days.
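
To confirm that the gene information is present in the file but sits in the legacy notes, a minimal sketch like the following can pull the gene IDs out of a reaction's GENE_ASSOCIATION note using only the Python standard library. The helper name `genes_from_notes` and the `SAMPLE` string are hypothetical, and the sketch assumes the `html` prefix is bound to the XHTML namespace in the real file. It is a diagnostic only, not how cobrapy parses SBML.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed copy of the reaction from the report. The xmlns:html declaration is
# an assumption; in a full SBML file it would normally sit on an ancestor element.
SAMPLE = """<reaction xmlns:html="http://www.w3.org/1999/xhtml" id="R_rxn05221_c0">
  <notes>
    <html:p>GENE_ASSOCIATION:(fig|1219067.16.peg.3176 or fig|1219067.16.peg.3854 or fig|1219067.16.peg.3103)</html:p>
  </notes>
</reaction>"""

PREFIX = "GENE_ASSOCIATION:"


def genes_from_notes(reaction_element):
    """Return the set of gene IDs named in legacy GENE_ASSOCIATION notes."""
    genes = set()
    for element in reaction_element.iter():
        text = (element.text or "").strip()
        if not text.startswith(PREFIX):
            continue
        rule = text[len(PREFIX):]
        # Split on whitespace and parentheses, then drop the boolean operators.
        for token in re.split(r"[\s()]+", rule):
            if token and token.lower() not in {"and", "or"}:
                genes.add(token)
    return genes


reaction = ET.fromstring(SAMPLE)
genes = genes_from_notes(reaction)
print(sorted(genes))
```

Running the same helper over every reaction in the exported file and taking the union of the results should give a gene count that can be compared with the number reported by ModelSEED.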

So it's not your fault but the SBML is not up to par. There are two alternatives:

  1. Use KBase, the successor to ModelSEED, to generate your draft model.
  2. Take a look at https://github.com/biosustain/iVnat where our group has done some further curation on a draft from KBase :wink:
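
Whichever option is used, a quick way to check whether a given SBML file uses the fbc package is to look at the namespaces it declares. The sketch below is a rough check under stated assumptions: the function name `declares_fbc` and the two tiny sample documents are hypothetical, and the check only looks for a namespace URI beginning with the fbc stem `http://www.sbml.org/sbml/level3/version1/fbc/`. It does not validate the model.

```python
import io
import xml.etree.ElementTree as ET

# Common stem of the SBML Level 3 fbc package namespaces (version1, version2).
FBC_NS_STEM = "http://www.sbml.org/sbml/level3/version1/fbc/"


def declares_fbc(source):
    """Return True if the document declares an fbc package namespace.

    `source` is a filename or a binary file-like object.
    """
    for _event, (_prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        if uri.startswith(FBC_NS_STEM):
            return True
    return False


# Hypothetical minimal documents, used only to exercise the check.
LEGACY = (
    b'<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" version="1">'
    b'<model id="m"/></sbml>'
)
MODERN = (
    b'<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" '
    b'xmlns:fbc="http://www.sbml.org/sbml/level3/version1/fbc/version2" '
    b'level="3" version="1" fbc:required="false"><model id="m"/></sbml>'
)

legacy_has_fbc = declares_fbc(io.BytesIO(LEGACY))
modern_has_fbc = declares_fbc(io.BytesIO(MODERN))
print(legacy_has_fbc, modern_has_fbc)
```

Passing the path of an exported model instead of a `BytesIO` object applies the same check to a real file.
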
RRohani commented 5 years ago

@Midnighter Great, thanks so much! Will take a look at both :)

Midnighter commented 5 years ago

This issue looks resolved from my perspective. Please feel free to re-open if this continues to be a problem for you.