opencobra / cobrapy

COBRApy is a package for constraint-based modeling of metabolic networks.
http://opencobra.github.io/cobrapy/
GNU General Public License v2.0

Model valid but cannot be loaded #1343

Closed HASaunders closed 1 year ago

HASaunders commented 1 year ago

Question

I would like to load the following model: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4081344/bin/supp_pp.114.235358_235358S2._Arabidopsis_core_model_in_Systems_Biology_MarkUp_Language.zip

When running:

import cobra

model_path = "ArabidopsisCoreModel.xml"
model = cobra.io.read_sbml_model(model_path)

I get the following error:

*** cobra.io.sbml.CobraSBMLError: Something went wrong reading the SBML model. Most likely the SBML model is not valid. Please check that your model is valid using the `cobra.io.sbml.validate_sbml_model` function or via the online validator at https://sbml.org/validator_servlet/ .
        `(model, errors) = validate_sbml_model(filename)`
If the model is valid and cannot be read please open an issue at https://github.com/opencobra/cobrapy/issues .

When I run the model through https://sbml.org/validator_servlet/, some warnings are raised, but the model is considered valid.

Please could you let me know why I am unable to load the model?

Thank you in advance!

cdiener commented 1 year ago

Hi, it looks like the model has slightly odd GPRs (gene-protein-reaction rules) that trip up the parser:

Use of GENE ASSOCIATION or GENE_ASSOCIATION in the notes element is discouraged, use fbc:gpr instead: <Reaction R_PSII_h "photosystem II">
Uppercase AND/OR found in rule '2*(ATCG00020 AND ATCG00680 AND ATCG00280 AND ATCG00270 AND ATCG00580 AND ATCG00570 AND ATCG00710 AND ATCG00080 AND ATCG00550 AND ATCG00070 AND ATCG00560 AND ATCG00220 AND ATCG00700 AND (AT5G66570 OR AT3G50820) AND AT1G06680 AND (AT4G21280 OR AT4G05180) AND AT1G79040 AND AT1G44575 AND ATCG00690 AND AT3G21055 AND AT2G30570 AND AT2G06520 AND AT1G67740 AND ATCG00300)'.
invalid syntax. Perhaps you forgot a comma?

The issue here is the "2*" prefix, which is not a valid part of a GPR. The validator did not complain because the GPRs are stored in the notes field (which is not recommended), and notes are always skipped during validation. COBRApy will still try to parse them, and that is where the error occurs. Unfortunately, there are many such occurrences in the model, so it will be a bit hard to fix.
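
One possible cleanup, as a rough sketch only: strip the numeric multipliers from the rules in the file text before loading it. The helpers strip_gpr_multipliers and clean_sbml_notes below are hypothetical and not part of COBRApy. The sketch assumes that each rule is stored in the notes as "GENE_ASSOCIATION: <rule>" (or "GENE ASSOCIATION: <rule>") followed by a closing tag, and that multipliers always appear as an integer followed by "*". Dropping the multiplier discards whatever stoichiometric meaning the model authors intended, and other problems in the model may remain.

```python
import re

# An integer multiplier such as "2*" at a word boundary. Digits inside gene
# IDs like AT5G66570 are not matched because no "*" follows them.
_MULTIPLIER = re.compile(r"\b\d+\s*\*\s*")

# Assumed note layout: "GENE_ASSOCIATION: <rule>" running up to the next "<".
_NOTE_RULE = re.compile(r"(GENE[ _]ASSOCIATION:\s*)([^<]*)")


def strip_gpr_multipliers(rule: str) -> str:
    """Remove 'N*' multipliers from a single GPR string (hypothetical helper)."""
    return _MULTIPLIER.sub("", rule)


def clean_sbml_notes(sbml_text: str) -> str:
    """Apply strip_gpr_multipliers to every rule found in the notes text."""
    return _NOTE_RULE.sub(
        lambda m: m.group(1) + strip_gpr_multipliers(m.group(2)),
        sbml_text,
    )
```

To try it, read ArabidopsisCoreModel.xml as text, pass the contents through clean_sbml_notes, write the result to a new file, and load that file with cobra.io.read_sbml_model. It is worth diffing the two files first to confirm that only the intended rules changed.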

cdiener commented 1 year ago

Closing for now because that is an issue of invalid GPRs. Feel free to reopen if you believe that is not accurate.