AMICI-dev / AMICI

Advanced Multilanguage Interface to CVODES and IDAS
https://amici.readthedocs.io/
Other
108 stars 30 forks source link

request for advice on unit definitions #1323

Closed thomassligon closed 3 years ago

thomassligon commented 3 years ago

I have a large model that produced plausible results in the MATLAB version of AMICI, but steady state did not succeed fully, so preequilibration was done by integrating to 10^5 seconds. All initial conditions and ranges for parameters are based on biologically plausible values (as far as possible). I have been able to fix the problem in Copasi by using a more explicit definition of units (for one substance).

I would like to know if these unit definitions are correct, or at least plausible. My goal is to have correct unit definitions in SBML that determine the behavior in both AMICI and Copasi. I edit the model in Cell Designer so that it is compatible with Minerva, which is also used in the community involved with this model.

Now that the model is running with PETab and pyPESTO, I have serious integration problems in AMICI, so I took a look at integration in COPASI (attachment LAT-old). I can see a number of species that look like unbounded species in the plot, but they are defined with production and degradation, so this should not happen. In working on fixing this problem, I want to review all unit definitions in the model. Previously, I didn’t worry about units in Copasi, but now I want to understand how SBML, Cell Designer, and Copasi define and use unit definitions. In doing so, I have a number of questions I am trying to answer:

I would like to use units of particle number for genes, and maybe for everything else, including proteins. But in any case, the main thing is to know exactly how it is used in Copasi and AMICI.

In Copasi, I have Model / Details / Quantity Unit = mole or # and I want to know whether it appears in SBML or only the Copasi model. I try Copasi export SBML and I see that this changes SBML from

<unitDefinition metaid="substance" id="substance" name="substance">
<listOfUnits>
<unit metaid="CDMT01988" kind="mole"/>
</listOfUnits>

to

      <unitDefinition id="substance" name="substance">
        <listOfUnits>
          <unit kind="item" exponent="1" scale="0" multiplier="1"/>
        </listOfUnits>

With this change, Cell Designer now reports UnitDefinitions substance with units = (1 item)^1 and Copasi recognizes it as Model / Details / Quantity Unit = #.

Will this change how AMICI integrates the model?

In Cell Designer I have UnitDefinitions for substance, volume, area, length, and time, and substance is set to (1 mole)^1.

Should I change the definition of substance, and if so, to what, or should I add a new UnitDefinition?

I have changed the definition of substance as above. I have created a new UnitDefinition for general rates (those with units of s^-1)

<unitDefinition metaid="rateDeg" id="rateGeneral" name="rateGeneral">
<listOfUnits>
<unit metaid="CDMT01993" kind="second" exponent="-1"/>
</listOfUnits>

and applied it to the default RNA degradation rate <parameter metaid="kdegRDef" id="kdegRDef" name="kdegRDef" value="2.8e-005" units="rateGeneral"/> I also applied it to the other degradation rates and the other reactions for LAT, including transcription, translation, transport, and phosphorylation.

For the definition of the gene LAT, in Cell Designer, I have, among other things, quantity type = Amount, substanceUnits = <undefined>, and hasOlnySubstanceUnits = true. In SBML, I currently have <species metaid="s2997" id="s2997" name="LAT" compartment="c5" initialAmount="2" hasOnlySubstanceUnits="true" boundaryCondition="true" charge="0" constant="true"> If I use Cell Designer to change substanceUnits to unit substance, SBML changes to <species metaid="s2997" id="s2997" name="LAT" compartment="c5" initialAmount="2" substanceUnits="substance" hasOnlySubstanceUnits="true" boundaryCondition="true" charge="0" constant="true">

Should I explicitly define substanceUnits for all species?

With these changes applied to LAT, I no longer have something that looks unbounded. The attached plots show LAT gene, Copasi LAT(nucleus), s2997 LAT mRNA nuc, Copasi LAT_2(nucleus), s2998 LAT mRNA mast cell, Copasi LAT_3, s2994 LAT protein, Copasi LAT_2(mast cell), s2795 LAT-P protein, Copasi LAT(mast cell), s2009

I think this is a big improvement. If it looks good to you, I will make these changes everywhere in the model, and also define new rate units for reactions that are more complex than just s^-1.

Lat-old.pdf Lat-1.pdf Lat-2.pdf Lat-3.pdf

matthiaskoenig commented 3 years ago

@ thomassligon Units in SBML are just annotation and do not influence the model behavior in any way. I.e. independent of units integrator will give the same results. Units are mainly useful for unit validation of expressions, e.g. that your rate expressions have the correct units for your species changes and for documentation, i.e. to provide the meta-information what unit is meant. All SBML simulators should give the same numerical results independent of units. The main issue can be that you use SBML features which are not supported by all integrators.

This sounds more like a numerical integration issue with the model. One solution can be to increase the strictness of the absolute and relative tolerance of integration.

In summary: units do not have anything to do with numerical results in SBML, your issue is most likely your model, i.e. the math and expressions in your model.

FFroehlich commented 3 years ago

@ thomassligon Units in SBML are just annotation and do not influence the model behavior in any way. I.e. independent of units integrator will give the same results. Units are mainly useful for unit validation of expressions, e.g. that your rate expressions have the correct units for your species changes and for documentation, i.e. to provide the meta-information what unit is meant. All SBML simulators should give the same numerical results independent of units. The main issue can be that you use SBML features which are not supported by all integrators.

This sounds more like a numerical integration issue with the model. One solution can be to increase the strictness of the absolute and relative tolerance of integration.

In summary: units do not have anything to do with numerical results in SBML, your issue is most likely your model, i.e. the math and expressions in your model.

Correct in principle. What is important though is that you use appropriate units scales:

It looks like your model is has species that have amounts in the order of 1e26, which likely comes from multiplying with the avogadro constant somewhere. This is problematic with the default absolute tolerances (probably something like 1e-8), as they are absurdly restrictive, which likely leads to integration errors. This can be fixed by changing the integrator tolerances, however if your model also includes other amounts that are in the order of lets say 10 particles, adequate integration is unlikely since AMICI does not support setting species specific integration tolerances. Accordingly, changing the highly abundant species to something like [mole] and changing initial concentrations as well as other expressions where that species occur will have a substantial effect on integration performance.

Ideally you want to use units in your model such that initial values are in the order of 1. Being off by 2-3 orders of magnitude won't be a problem, but initial values spanning 10+ orders of magnitude will likely lead to problems during numerical integration.

matthiaskoenig commented 3 years ago

Just to comment on this:

FFroehlich commented 3 years ago

Just to comment on this:

  • one can easily have state variable dependent tolerances in CVODE by passing an absolute and relative tolerance vector

Correct, we just didn't implement support for it.

  • You can still have a ton of issues if you have very large (e.g. bioreactors) or very small (realistic cell volumes). Depending on the used tool state variables on species are either defined in amount or concentrations, so even if it fixes things for AMICI it could break things in other integrators which define state variables differently. So you have to set your initial values of species (and dynamic compartments and parameters) according to the tool encoding. I.e. if AMICI is handling things in amounts internally your initialAmounts have to be within 2-3 orders of magnitude, if AMICI is handling state variables as concentrations your inititialConcentrations have to be within 2-3 order of magnitude.

AMICI is simulating amounts for everything that has the HasOnlySubstanceUnits set to True and concentrations for everything else, as documented in https://amici.readthedocs.io/en/latest/generated/amici.sbml_import.SbmlImporter.html#amici.sbml_import.SbmlImporter.sbml2amici

thomassligon commented 3 years ago

Thanks to both of you, Matthias and Fabian, for the very useful discussion. However, I have trouble believing that the issues I got in AMICI are "only" due to a numerical integration issue, even though my change to <unitDefinition metaid="substance" id="substance" name="substance"> changes things on the order of Avogadro's number. The Copasi plots I included show a complete change in the behavior of the reaction. What previously looked like an unbounded species now looks like normal production/degradation leading to a steady state. For example, if production used Avogradro's number but degradation didn't, the species would be unbounded without numerical issues coming into play. The errors I was getting in AMICI look like this:

[Warning] AMICI:NaN: AMICI encountered a NaN value at index 0/286 in fxdot!
[Warning] AMICI:NaN: AMICI encountered a NaN value at index 0/688 in p!
[Warning] AMICI:NaN: AMICI encountered a NaN value at index 55/469 in w!
[Warning] AMICI:CVODES:CVode:OTHER: AMICI ERROR: in module CVODES in function CVode : The right-hand side routine failed at the first call.
[Warning] AMICI:simulation: AMICI simulation failed:
Steady state computation failed. First run of Newton solver failed: No convergence was achieved. Simulation to steady state failed. Second run of Newton solver failed: No convergence was achieved.

This looks very much like what I got previously when preequilibration failed and we passed NaN values in the sensitivity to the next integration step. This is why I decided to use Copasi to fix the steady-state calculation before returning to PETab/pyPESTO/AMICI.

I think I once learned that units in SBML are just annotation, and then forgot that at some time. But, in any case, I absolutely need to know which units are used in the reaction, since the values otherwise mean nothing. So, the least I can do is review all units I am using and annotate the model accordingly. I will also reread the HasOnlySubstanceUnits documentation in SBML and AMICI.

FFroehlich commented 3 years ago

Thanks to both of you, Matthias and Fabian, for the very useful discussion. However, I have trouble believing that the issues I got in AMICI are "only" due to a numerical integration issue, even though my change to <unitDefinition metaid="substance" id="substance" name="substance"> changes things on the order of Avogadro's number. The Copasi plots I included show a complete change in the behavior of the reaction. What previously looked like an unbounded species now looks like normal production/degradation leading to a steady state. For example, if production used Avogradro's number but degradation didn't, the species would be unbounded without numerical issues coming into play. The errors I was getting in AMICI look like this:

[Warning] AMICI:NaN: AMICI encountered a NaN value at index 0/286 in fxdot!
[Warning] AMICI:NaN: AMICI encountered a NaN value at index 0/688 in p!
[Warning] AMICI:NaN: AMICI encountered a NaN value at index 55/469 in w!
[Warning] AMICI:CVODES:CVode:OTHER: AMICI ERROR: in module CVODES in function CVode : The right-hand side routine failed at the first call.
[Warning] AMICI:simulation: AMICI simulation failed:
Steady state computation failed. First run of Newton solver failed: No convergence was achieved. Simulation to steady state failed. Second run of Newton solver failed: No convergence was achieved.

This looks very much like what I got previously when preequilibration failed and we passed NaN values in the sensitivity to the next integration step. This is why I decided to use Copasi to fix the steady-state calculation before returning to PETab/pyPESTO/AMICI.

I think I once learned that units in SBML are just annotation, and then forgot that at some time. But, in any case, I absolutely need to know which units are used in the reaction, since the values otherwise mean nothing. So, the least I can do is review all units I am using and annotate the model accordingly. I will also reread the HasOnlySubstanceUnits documentation in SBML and AMICI.

Oh, thats an important point, Avogadro's Number is a special case of a unit, since the unit is defined as (6.02214179 · 10^23) · dimensionless, so it is a constant times a unit and not just a unit. Using Avogadro as a unit over dimensionless or item will therefore substantially influence the simulation results in every toolbox. It should also be of note that therefore SBML unit checking will NOT warn you about messing up the units, since they are all dimensionless anyways.

FFroehlich commented 3 years ago

I would generally not recommend using Avogadro as a unit, and would rather recommend using mole. Beyond the reasons discussed above (tolerances as well as unit checking), having such large numbers in the model can easily lead to issues since multiplication, division etc of large numbers is generally problematic due to finite precision of floating point operations.

thomassligon commented 3 years ago

In the SBML documentation (sbml-level-3-version-2-release-2-core.pdf), 4.2 Model, I can see that model can contain a substanceUnits attribute, but I cannot find that in my model, and I cannot find out how to create it in Cell Designer or Copasi. The SBML documentation 4.2.2 The substanceUnits attribute states

If the Model does not define a value for this attribute, then there is no unit to inherit, and all species that do not specify individual substanceUnits attribute values then have no declared units for their quantities.

As I reported earlier, my model contains

  <model
    <listOfUnitDefinitions>
      <unitDefinition id="substance" name="substance">
        <listOfUnits>
          <unit kind="item" exponent="1" scale="0" multiplier="1"/>
        </listOfUnits>

Should I set the substanceUnits attribute directly in the model key using an XML editor instead of Cell Designer? (I plan to specify substanceUnits explicitly in all species, but it would be nice to have a default value.)

thomassligon commented 3 years ago

I have done some more testing with Copasi and I can see that the one global change I made, changing the unitDefinition of substance from mole to item, leads to a big improvement in the way the plots look in Copasi. The steady-state calculation in Copasi and AMICI is not working yet, but that was not to be expected. Thanks again for the help.