mmundy42 / cobrababel

MetaNetX metabolites numeric fields #2

Closed uludag closed 7 years ago

uludag commented 7 years ago

Calling cobra.io.write_sbml_model with models generated by cobrababel.create_metanetx_universal_model() returns the following error:

`TypeError: in method 'Species_setCharge', argument 2 of type 'int'`

The error appears to be fixed if we cast the charge field value to int:

        charge = fields[field_names['Charge']]
        charge = int(charge) if len(charge) > 0 and charge != "NA" else None

The mass field is also numeric:

        mass = fields[field_names['Mass']]
        mass = float(mass) if len(mass) > 0 else None
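The two casts above could be wrapped in small helpers, as in the minimal sketch below. The names `parse_charge` and `parse_mass` are hypothetical (they are not part of cobrababel), and treating "NA" as missing for the mass field is an assumption; the snippet above only guards mass against an empty string.

```python
def parse_charge(raw):
    """Return the charge as an int, or None if the field is blank or 'NA'."""
    return int(raw) if len(raw) > 0 and raw != "NA" else None


def parse_mass(raw):
    """Return the mass as a float, or None if the field is blank.

    Assumption: 'NA' is also treated as missing here, mirroring the charge field.
    """
    return float(raw) if len(raw) > 0 and raw != "NA" else None
```

With these helpers, a blank or "NA" field yields `None` instead of a string, so a value passed on as the charge is always an `int` or `None`.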

--mahmut

mmundy42 commented 7 years ago

I'm working on adding a test to validate that this problem is solved, but I'm not able to reproduce the TypeError. Can you provide sample code that reproduces the problem, and also let me know which versions of cobra, lxml, and python-libsbml you have installed?

I was able to write the MetaNetX universal model, although validate_sbml_model() detected a problem:

    >>> mnx = cobrababel.create_metanetx_universal_model()
    >>> cobra.io.write_sbml_model(mnx, 'cobrababel/test/mnx.xml')
    >>> cobra.io.sbml3.validate_sbml_model('cobrababel/test/mnx.xml')
    (None, {'validator': [], 'other': ["invalid literal for int() with base 10: 'NA'"], 'SBML errors': [], 'warnings': []})

I'm using cobra 0.8.1, lxml 3.8.0, and python-libsbml 5.15.0.
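The message listed under 'other' is the one Python itself produces when `int()` is given the string 'NA'. That suggests a charge of 'NA' reached an `int()` conversion somewhere during validation, although this is an inference and has not been traced through the cobra source. A minimal sketch, using a hypothetical helper `try_int`:

```python
def try_int(text):
    """Return (value, None) on success, or (None, error message) on failure."""
    try:
        return int(text), None
    except ValueError as exc:
        # int() raises ValueError for non-numeric strings such as 'NA'.
        return None, str(exc)
```

Calling `try_int("NA")` returns the same message text that validate_sbml_model() reported.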

uludag commented 7 years ago

I was using the same code you used, except that I was setting the use_fbc_package option to False, as in the following example:

    cobra.io.write_sbml_model(mnx, "metanetx.xml", use_fbc_package=False)

If I do not set the use_fbc_package option, my write_sbml_model calls return without error.

I have the same versions of the cobra and lxml packages, but my python-libsbml package is a slightly earlier version (5.13.0).

uludag commented 7 years ago

With the numeric type cast for the charge attribute in place, I no longer get the `invalid literal for int() with base 10: 'NA'` error. However, I get "not alphanumeric" warnings for metabolites with parentheses in their formula strings.

The COBRApy Formula class docs note that "a legal formula string contains only letters and numbers":

http://cobrapy.readthedocs.io/en/latest/_modules/cobra/core/formula.html
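To find the affected metabolites ahead of time, a check along the following lines might help. This is a sketch that mirrors the "only letters and numbers" wording quoted above, not cobrapy's actual parser; the function name `is_alphanumeric_formula` and the sample strings in the usage note are hypothetical.

```python
import re

# Matches strings made up solely of ASCII letters and digits.
_ALNUM = re.compile(r"[A-Za-z0-9]+")


def is_alphanumeric_formula(formula):
    """Return True if the formula contains only letters and numbers."""
    return _ALNUM.fullmatch(formula) is not None
```

For example, `is_alphanumeric_formula("C6H12O6")` is `True`, while a string containing parentheses, such as `"(C2H4)n"`, gives `False`.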

mmundy42 commented 7 years ago

Yes, I just found the "not alphanumeric" warnings too. Other biochemistry databases have the same issue (for example, ModelSEED). I haven't found a good solution, so I just leave the formula as provided by the source system. Another side effect is that cobrapy can't determine the mass balance of reactions that use the metabolite. Any thoughts?

uludag commented 7 years ago

My knowledge of biochemistry and chemistry is limited, so I can't comment at this time.

I saw the warning earlier when I was using another Python package (cameo): the function I was calling made a cobrapy call in turn, which returned an invalid formula error (the formula had parentheses).

I know a colleague on the MetaNetX team. I will ask for their comments and share anything I hear here.

mmundy42 commented 7 years ago

I ran into some problems with python-libsbml 5.15.0 that slowed me down today. I reverted to 5.13.0 and made progress. I updated the code that creates a MetaNetX Metabolite to the following:

        metabolite = Metabolite(id=fields[field_names['MNX_ID']],
                                name=fields[field_names['Description']],
                                formula=fields[field_names['Formula']])
        charge = fields[field_names['Charge']]
        metabolite.charge = int(charge) if len(charge) > 0 and charge != 'NA' else None
        mass = fields[field_names['Mass']]
        if len(mass) > 0:
            metabolite.notes['mass'] = float(mass)
        metabolite.notes['InChI'] = fields[field_names['InChI']] \
            if len(fields[field_names['InChI']]) > 0 else 'NA'
        metabolite.notes['SMILES'] = fields[field_names['SMILES']] \
            if len(fields[field_names['SMILES']]) > 0 else 'NA'
        metabolite.notes['source'] = fields[field_names['Source']] \
            if len(fields[field_names['Source']]) > 0 else 'NA'
        metabolite.notes['InChIKey'] = fields[field_names['InChIKey']] \
            if len(fields[field_names['InChIKey']]) > 0 else 'NA'

For the "mass" note, it makes sense to me to include it only if there is a valid value. This is similar to the charge field, but the difference is that mass is not an attribute of a Metabolite object. For the other notes, the MetaNetX file is inconsistent when the value is unknown: sometimes the field is left blank and sometimes it is set to "NA". The code sets the value to "NA" in both cases.
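The four repeated conditional assignments could also be written as a loop. The sketch below assumes `fields` is a list of column values and `field_names` maps column names to list indexes, as the indexing in the code suggests; `NOTE_COLUMNS` and `build_notes` are hypothetical names, not part of cobrababel.

```python
# Maps each note key to the MetaNetX column it is read from.
NOTE_COLUMNS = {
    'InChI': 'InChI',
    'SMILES': 'SMILES',
    'source': 'Source',
    'InChIKey': 'InChIKey',
}


def build_notes(fields, field_names):
    """Return a notes dict in which blank values are replaced by 'NA'."""
    notes = {}
    for note_key, column in NOTE_COLUMNS.items():
        value = fields[field_names[column]]
        # A blank field and an explicit 'NA' both end up as 'NA'.
        notes[note_key] = value if len(value) > 0 else 'NA'
    return notes
```

The returned dict could then be merged into the metabolite's notes with `metabolite.notes.update(...)`, with the mass note still added separately only when a value is present.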

Let me know if that seems reasonable and I'll finish the updates and push out a new release.

uludag commented 7 years ago

All looks fine. After updating metanetx.py with the above changes, my write_sbml_model calls do not generate any errors. --mahmut