opencobra / cobrapy

COBRApy is a package for constraint-based modeling of metabolic networks.
http://opencobra.github.io/cobrapy/
GNU General Public License v2.0

MAT loading error from spaces in compartment names #919

Open zakandrewking opened 5 years ago

zakandrewking commented 5 years ago

Problem description

In BiGG Models, we have some compartment names with spaces in them:

http://bigg.ucsd.edu/models/iCHOv1/metabolites/h_im

COBRApy (v0.17) can read JSON and SBML files generated with this data. But if one tries to round-trip to a .mat file, an error occurs:

In [7]: m = cobra.io.load_json_model('bigg/iCHOv1.json')
In [8]: cobra.io.save_matlab_model(m, 'iCHOv1.mat')
In [9]: m2 = cobra.io.load_matlab_model('iCHOv1.mat')

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-9-02bf596db094> in <module>
----> 1 m2 = cobra.io.load_matlab_model('iCHOv1.mat')

/usr/local/lib/python3.7/site-packages/cobra/io/mat.py in load_matlab_model(infile_path, variable_name, inf)
     81     if variable_name is not None:
     82         return from_mat_struct(data[variable_name], model_id=variable_name,
---> 83                                inf=inf)
     84     for possible_name in possible_names:
     85         try:

/usr/local/lib/python3.7/site-packages/cobra/io/mat.py in from_mat_struct(mat_struct, model_id, inf)
    225         except (IndexError, ValueError):
    226             pass
--> 227         model.add_metabolites([new_metabolite])
    228     new_reactions = []
    229     coefficients = {}

/usr/local/lib/python3.7/site-packages/cobra/core/model.py in add_metabolites(self, metabolite_list)
    450             if met.id not in self.constraints:
    451                 constraint = self.problem.Constraint(
--> 452                     Zero, name=met.id, lb=0, ub=0)
    453                 to_add += [constraint]
    454

/usr/local/lib/python3.7/site-packages/optlang/glpk_interface.py in __init__(self, expression, sloppy, *args, **kwargs)
    162     def __init__(self, expression, sloppy=False, *args, **kwargs):
    163         _glpk_validate_id(kwargs.get("name", "GoodName"))
--> 164         super(Constraint, self).__init__(expression, sloppy=sloppy, *args, **kwargs)
    165         if not sloppy:
    166             if not self.is_Linear:

/usr/local/lib/python3.7/site-packages/optlang/interface.py in __init__(self, expression, lb, ub, indicator_variable, active_when, *args, **kwargs)
    675         self.lb = lb
    676         self.ub = ub
--> 677         super(Constraint, self).__init__(expression, *args, **kwargs)
    678         self.__check_valid_indicator_variable(indicator_variable)
    679         self.__check_valid_active_when(active_when)

/usr/local/lib/python3.7/site-packages/optlang/interface.py in __init__(self, expression, name, problem, sloppy, *args, **kwargs)
    414             name = str(name)
    415
--> 416         self._validate_optimization_expression_name(name)
    417
    418         super(OptimizationExpression, self).__init__(*args, **kwargs)

/usr/local/lib/python3.7/site-packages/optlang/interface.py in _validate_optimization_expression_name(name)
    385                 raise ValueError(
    386                     'Names cannot contain whitespace characters. "%s" contains whitespace character "%s".' % (
--> 387                         name, char)
    388                 )
    389

ValueError: Names cannot contain whitespace characters. "h_im[intermembrane space of mitochondria]" contains whitespace character " ".

Dependency Information

System Information
==================
OS          Darwin
OS-release  18.7.0
Python      3.7.4

Package Versions
================
cobra                       0.17.0
depinfo                     1.5.1
future                      0.17.1
numpy                       1.17.2
optlang                     1.4.4
pandas                      0.25.1
pip                         19.1.1
python-libsbml-experimental 5.18.0
ruamel.yaml                 0.16.5
setuptools                  33.1.1
six                         1.12.0
swiglpk                     4.65.0
wheel                       0.33.4

cdiener commented 5 years ago

So cobrapy does not allow spaces in IDs but does allow them in names (descriptions). Do you know if MATLAB models make a distinction between ID and description/name? It looks like they don't, which is why the MATLAB reader uses the names as the IDs. There are basically two options:

  1. raise an error on read due to invalid IDs
  2. make the names compatible (for instance substituting underscores for spaces) and log that occurrence

It seems like 2 might be what you prefer, but it has some edge cases (what if "my compartment" and "my_compartment" are both in the model?).
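
The two options can be sketched in plain Python. This is only an illustration under stated assumptions, not cobrapy's reader: `check_ids` and `sanitize_ids` are hypothetical helper names, they work on a plain list of ID strings rather than on a .mat struct, and raising on a collision is just one possible way to handle the edge case.

```python
import logging
import re

logger = logging.getLogger(__name__)
_WHITESPACE = re.compile(r"\s+")


def check_ids(raw_ids):
    """Option 1 (hypothetical helper): raise if any ID contains whitespace."""
    bad = [i for i in raw_ids if any(ch.isspace() for ch in i)]
    if bad:
        raise ValueError(f"IDs contain whitespace: {bad!r}")


def sanitize_ids(raw_ids):
    """Option 2 (hypothetical helper): map each raw ID to a whitespace-free ID.

    Runs of whitespace become a single underscore and every change is logged.
    Raises ValueError if two distinct raw IDs would end up with the same ID.
    """
    mapping = {}
    seen = {}  # sanitized ID -> raw ID that produced it
    for raw in raw_ids:
        safe = _WHITESPACE.sub("_", raw.strip())
        if safe in seen and seen[safe] != raw:
            raise ValueError(
                f"ID collision: {raw!r} and {seen[safe]!r} both map to {safe!r}"
            )
        if safe != raw:
            logger.warning("Replaced whitespace in ID %r -> %r", raw, safe)
        seen[safe] = raw
        mapping[raw] = safe
    return mapping


ids = ["h_im[intermembrane space of mitochondria]", "h_c"]
print(sanitize_ids(ids))
```

Raising on a collision keeps the rename from silently merging two metabolites or compartments; appending a numeric suffix instead would also work but makes the resulting IDs depend on input order.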

zakandrewking commented 5 years ago

I'm not familiar enough with the needs in MATLAB / COBRA Toolbox to provide much direction here.

If necessary, we can restrict compartment names in BiGG to not include spaces, but that could be confusing going forward since other descriptive names allow spaces.

zoey-rw commented 2 years ago

I believe I'm facing a similar error from loading a .mat file with a space in a variable name, rather than a compartment name. If there are any short-term workarounds I'd try them out - any ideas?

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-9-6e190386ff75> in <module>
----> 1 iJN = cobra.io.load_matlab_model(join(model_dir, "IndiMeSH/Cb_iJN746_rGEM.mat"))

~/.local/lib/python3.8/site-packages/cobra/io/mat.py in load_matlab_model(infile_path, variable_name, inf)
     81             variable_name = possible_names[0]
     82     if variable_name is not None:
---> 83         return from_mat_struct(data[variable_name], model_id=variable_name, inf=inf)
     84     for possible_name in possible_names:
     85         try:

~/.local/lib/python3.8/site-packages/cobra/io/mat.py in from_mat_struct(mat_struct, model_id, inf)
    251             pass
    252         new_reactions.append(new_reaction)
--> 253     model.add_reactions(new_reactions)
    254     set_objective(model, coefficients)
    255     coo = scipy_sparse.coo_matrix(m["S"][0, 0])

~/.local/lib/python3.8/site-packages/cobra/core/model.py in add_reactions(self, reaction_list)
    708 
    709         # from cameo ...
--> 710         self._populate_solver(pruned)
    711 
    712     def remove_reactions(self, reactions, remove_orphans=False):

~/.local/lib/python3.8/site-packages/cobra/core/model.py in _populate_solver(self, reaction_list, metabolite_list)
   1002         for reaction in reaction_list:
   1003             if reaction.id not in self.variables:
-> 1004                 forward_variable = self.problem.Variable(reaction.id)
   1005                 reverse_variable = self.problem.Variable(reaction.reverse_id)
   1006                 self.add_cons_vars([forward_variable, reverse_variable])

~/.local/lib/python3.8/site-packages/optlang/symbolics.py in __new__(cls, name, **kwargs)
    127             obj = sympy.Symbol.__new__(cls, str(uuid.uuid1()))
    128 
--> 129             obj.name = name
    130             obj._assumptions = FactKB(_assume_rules)
    131             obj._assumptions._tell('commutative', True)

~/.local/lib/python3.8/site-packages/optlang/gurobi_interface.py in name(self, value)
    170     def name(self, value):
    171         internal_var = self._internal_variable
--> 172         super(Variable, Variable).name.fset(self, value)
    173         if internal_var is not None:
    174             internal_var.setAttr('VarName', value)

~/.local/lib/python3.8/site-packages/optlang/interface.py in name(self, value)
    193     @name.setter
    194     def name(self, value):
--> 195         self.__validate_variable_name(value)
    196         old_name = getattr(self, 'name', None)
    197         self._name = value

~/.local/lib/python3.8/site-packages/optlang/interface.py in __validate_variable_name(name)
    147         for char in name:
    148             if char.isspace():
--> 149                 raise ValueError(
    150                     'Variable names cannot contain whitespace characters. "%s" contains whitespace character "%s".' % (
    151                         name, char)

ValueError: Variable names cannot contain whitespace characters. "clpn120 transport" contains whitespace character " ".

akaviaLab commented 2 years ago

One question about that, which may need @matthiaskoenig's input: how does the SBML standard deal with spaces in various places? Could we do the same? I doubt SBML would allow spaces in variable names.

cdiener commented 2 years ago

No, it does not allow that. So that should throw an error or be fixed with a warning.

akaviaLab commented 2 years ago

Would SBML allow spaces if they are escaped, like "\ ", or not even that?

cdiener commented 2 years ago

Nope. Tools usually substitute the character's ASCII code, so "first second" becomes "first32second" (32 is the ASCII code for a space). But substituting underscores might be okay.
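
A minimal sketch of that ASCII-code substitution, for comparison with the underscore approach. `encode_whitespace` is a hypothetical helper, not a function from cobrapy or libSBML, and the optional `delimiter` argument is an assumption added here for illustration.

```python
def encode_whitespace(name, delimiter=""):
    """Replace each whitespace character with its decimal character code.

    With the default empty delimiter this gives the "first32second" style.
    A non-empty delimiter (an assumption, not taken from the thread) wraps
    the code so it cannot run into neighbouring digits.
    """
    return "".join(
        f"{delimiter}{ord(ch)}{delimiter}" if ch.isspace() else ch
        for ch in name
    )


print(encode_whitespace("first second"))        # first32second
print(encode_whitespace("clpn120 transport"))   # clpn12032transport
print(encode_whitespace("first second", "__"))  # first__32__second
```

The second call shows the weakness of the bare form: when an ID already ends in digits, the inserted code is no longer distinguishable from the original text, so the substitution cannot be reliably undone.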