draeger-lab / ModelPolisher

ModelPolisher accesses the BiGG Models knowledgebase to annotate SBML models.

Correct the use of units and unit definitions #8

Open draeger opened 8 years ago

draeger commented 8 years ago

In input files, reaction fluxes are usually given in millimole per gram dry weight per hour (mmol / gDW / h).

The given unit needs to be decomposed into its parts:

Declare the default units on the model for each unit category: extentUnits, substanceUnits, timeUnits.

We cannot set any of the size units on the model though, because gDW is neither a volume, nor area, nor length unit. Hence, all compartments must be individually declared to be in gDW.

It is important that the given unit is decomposed programmatically rather than replaced by a predefined unit. If an input model comes with a deviating unit, say nanomole per gDW per day, this unit must be decomposed analogously rather than being overwritten with the default unit.
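
To make the idea of programmatic decomposition concrete, here is a minimal Python sketch. It is not ModelPolisher code. It assumes the unit arrives as a string of the form `<prefix>mol / gDW / <time>`, and it maps gDW onto the SBML base unit `gram`, which is an assumption borrowed from the example model quoted later in this thread. The function name `decompose` and both lookup tables are hypothetical.

```python
import re

# Assumed lookup tables; extend as needed for other prefixes or time spellings.
SI_PREFIX_SCALE = {"": 0, "m": -3, "u": -6, "n": -9, "p": -12, "f": -15}
TIME_IN_SECONDS = {"s": 1, "sec": 1, "min": 60, "h": 3600, "hr": 3600,
                   "d": 86400, "day": 86400}


def decompose(unit_string):
    """Split '<prefix>mol / gDW / <time>' into SBML-style unit parts.

    Returns a list of dicts with the keys kind, exponent, scale, multiplier.
    Raises ValueError if the string does not match the expected pattern.
    """
    tokens = [token.strip() for token in unit_string.split("/")]
    if len(tokens) != 3:
        raise ValueError(f"expected three parts, got: {unit_string!r}")
    substance, mass, time = tokens

    match = re.fullmatch(r"(m|u|n|p|f)?mol", substance)
    if match is None:
        raise ValueError(f"unrecognised substance unit: {substance!r}")
    if mass != "gDW":
        raise ValueError(f"unrecognised mass unit: {mass!r}")
    if time not in TIME_IN_SECONDS:
        raise ValueError(f"unrecognised time unit: {time!r}")

    prefix = match.group(1) or ""
    return [
        {"kind": "mole", "exponent": 1,
         "scale": SI_PREFIX_SCALE[prefix], "multiplier": 1},
        # Assumption: gDW is represented as plain gram.
        {"kind": "gram", "exponent": -1, "scale": 0, "multiplier": 1},
        {"kind": "second", "exponent": -1, "scale": 0,
         "multiplier": TIME_IN_SECONDS[time]},
    ]


for part in decompose("nmol / gDW / day"):
    print(part)
```

With this approach, a deviating unit such as nmol / gDW / day keeps its own scale (-9) and time multiplier (86400) instead of being silently replaced by the mmol / gDW / h default.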

draeger commented 8 years ago

At the moment, it is not entirely clear what the correct units in COBRA models would be. This issue is being discussed amongst the SBML editors. As an intermediate solution, ModelPolisher version 1.2 defines compartment units as dimensionless if these are undefined and uses mmol/gDW as substanceUnits for all species, even though this is currently not supported by the libSBML validator.
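
For readers who want to see what this intermediate solution amounts to at the XML level, the following Python sketch applies the same two rules to a toy SBML Level 3 Version 1 document using only the standard library. It is an illustration, not the ModelPolisher implementation. The toy model, the function name `apply_interim_units`, and the unit id `mmol_per_gDW` are assumptions.

```python
import xml.etree.ElementTree as ET

SBML_NS = "http://www.sbml.org/sbml/level3/version1/core"
NS = {"sbml": SBML_NS}

# Hypothetical toy model: one compartment without units, one with units.
MODEL = f"""<sbml xmlns="{SBML_NS}" level="3" version="1">
  <model id="toy_model">
    <listOfCompartments>
      <compartment id="c" constant="true"/>
      <compartment id="e" constant="true" units="litre"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="M_a_c" compartment="c" hasOnlySubstanceUnits="true"
               boundaryCondition="false" constant="false"/>
    </listOfSpecies>
  </model>
</sbml>"""


def apply_interim_units(root, substance_units="mmol_per_gDW"):
    """Set undefined compartment units to dimensionless and
    assign the given substance units to every species."""
    for compartment in root.iterfind(".//sbml:compartment", NS):
        if compartment.get("units") is None:
            compartment.set("units", "dimensionless")
    for species in root.iterfind(".//sbml:species", NS):
        species.set("substanceUnits", substance_units)
    return root


root = apply_interim_units(ET.fromstring(MODEL))
units = {c.get("id"): c.get("units")
         for c in root.iterfind(".//sbml:compartment", NS)}
print(units)  # {'c': 'dimensionless', 'e': 'litre'}
```

Compartments that already declare units are left untouched. A real model would also need a unit definition with the id used for `substanceUnits`, which this sketch omits.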

matthiaskoenig commented 7 years ago

Hi Andreas, if there is a solution for this, please let me know. I keep running into the same issue and it is unclear to me how to handle it. M

draeger commented 7 years ago

@matthiaskoenig this is still not fully solved. The upcoming HARMONY meeting could be a good occasion to discuss this topic within a broader circle.

mephenor commented 4 years ago

Is there any news regarding this issue?

draeger commented 4 years ago

@matthiaskoenig it seems we should put this on our agenda for discussion at HARMONY 2020.

matthiaskoenig commented 4 years ago

@draeger I am not sure if this is resolved in libsbml-experimental. I will check what is going on there with some examples (I also want to identify the exact validation rule that fails, along with its error message). I can do this on the weekend, and will bring this up during HARMONY if there is no solution at the moment.

Schmoho commented 2 years ago

The most important issue I see with this is that JSBML - and if I'm not mistaken SBML itself - does not have a notion of a growth unit (definition).

That is, any deviating units, e.g. "nmol / gDW / sec" would need to be identified some other way.

We could do some best effort parsing here, but right now I cannot see a "correct" solution to the issue of identifying and decomposing growth units.

draeger commented 2 years ago

The genome-scale metabolic model of Corynebacterium glutamicum could serve as an example model that libSBML validates correctly. It uses multiple unit definitions and also declares the sizes of compartments and their default units.

Schmoho commented 2 years ago

For reference, this is the relevant data from said bacterium:

    <listOfUnitDefinitions>
      <unitDefinition id="hour" metaid="meta_hour" name="hour">
        <annotation>
          <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:bqbiol="http://biomodels.net/biology-qualifiers/">
            <rdf:Description rdf:about="#meta_hour">
              <bqbiol:is>
                <rdf:Bag>
                  <rdf:li rdf:resource="https://identifiers.org/UO:0000032" />
                </rdf:Bag>
              </bqbiol:is>
            </rdf:Description>
          </rdf:RDF>
        </annotation>
        <listOfUnits>
          <unit exponent="1" kind="second" multiplier="3600" scale="0" />
        </listOfUnits>
      </unitDefinition>
      <unitDefinition id="fL" metaid="meta_fL" name="femto litres">
        <annotation>
          <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:bqbiol="http://biomodels.net/biology-qualifiers/">
            <rdf:Description rdf:about="#meta_fL">
              <bqbiol:is>
                <rdf:Bag>
                  <rdf:li rdf:resource="https://identifiers.org/UO:0000104" />
                </rdf:Bag>
              </bqbiol:is>
            </rdf:Description>
          </rdf:RDF>
        </annotation>
        <listOfUnits>
          <unit exponent="1" kind="litre" multiplier="1" scale="-3" />
        </listOfUnits>
      </unitDefinition>
      <unitDefinition id="mmol_per_gDW" name="millimoles per gram dry weight">
        <listOfUnits>
          <unit exponent="1" kind="mole" multiplier="1" scale="-3" />
          <unit exponent="-1" kind="gram" multiplier="1" scale="0" />
        </listOfUnits>
      </unitDefinition>
      <unitDefinition id="mmol_per_gDW_per_hr" name="millimoles per gram dry weight per hour">
        <listOfUnits>
          <unit exponent="1" kind="mole" multiplier="1" scale="-3" />
          <unit exponent="-1" kind="gram" multiplier="1" scale="0" />
          <unit exponent="-1" kind="second" multiplier="3600" scale="0" />
        </listOfUnits>
      </unitDefinition>
    </listOfUnitDefinitions>
    <listOfCompartments>
      <compartment size="NaN" spatialDimensions="3" id="c" name="cytosol" constant="true"/>
      <compartment size="NaN" spatialDimensions="3" id="p" name="periplasm" constant="true"/>
      <compartment size="NaN" spatialDimensions="3" id="e" name="extracellular space" constant="true"/>
    </listOfCompartments>

draeger commented 2 years ago

Also the declaration of default units in the model:

extentUnits="mmol_per_gDW" substanceUnits="mmol_per_gDW" timeUnits="hour" volumeUnits="fL"
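
One way to sanity-check these declarations is to confirm that extent units divided by time units reduce to the same base units as `mmol_per_gDW_per_hr`. The Python sketch below does this for a trimmed copy of the unit definitions quoted above (annotations and the `fL` definition are left out). It relies on the SBML rule that each `unit` element stands for (multiplier × 10^scale × kind)^exponent. The helper names `reduce_definition` and `divide` are hypothetical, and this is a hand-rolled check, not what the libSBML validator does internally.

```python
import math
import xml.etree.ElementTree as ET

# Trimmed copy of the unit definitions from the thread (annotations removed).
UNIT_DEFS = """
<listOfUnitDefinitions>
  <unitDefinition id="hour">
    <listOfUnits>
      <unit exponent="1" kind="second" multiplier="3600" scale="0" />
    </listOfUnits>
  </unitDefinition>
  <unitDefinition id="mmol_per_gDW">
    <listOfUnits>
      <unit exponent="1" kind="mole" multiplier="1" scale="-3" />
      <unit exponent="-1" kind="gram" multiplier="1" scale="0" />
    </listOfUnits>
  </unitDefinition>
  <unitDefinition id="mmol_per_gDW_per_hr">
    <listOfUnits>
      <unit exponent="1" kind="mole" multiplier="1" scale="-3" />
      <unit exponent="-1" kind="gram" multiplier="1" scale="0" />
      <unit exponent="-1" kind="second" multiplier="3600" scale="0" />
    </listOfUnits>
  </unitDefinition>
</listOfUnitDefinitions>
""".strip()


def reduce_definition(unit_definition):
    """Return (factor, {kind: exponent}) for one unitDefinition element."""
    factor = 1.0
    dims = {}
    for unit in unit_definition.iter("unit"):
        exponent = float(unit.get("exponent"))
        multiplier = float(unit.get("multiplier"))
        scale = int(unit.get("scale"))
        factor *= (multiplier * 10.0 ** scale) ** exponent
        kind = unit.get("kind")
        dims[kind] = dims.get(kind, 0.0) + exponent
    return factor, dims


def divide(numerator, denominator):
    """Divide two reduced units, dropping kinds whose exponent cancels."""
    factor = numerator[0] / denominator[0]
    dims = dict(numerator[1])
    for kind, exponent in denominator[1].items():
        dims[kind] = dims.get(kind, 0.0) - exponent
    return factor, {k: e for k, e in dims.items() if e != 0}


root = ET.fromstring(UNIT_DEFS)
reduced = {ud.get("id"): reduce_definition(ud)
           for ud in root.iter("unitDefinition")}

rate = divide(reduced["mmol_per_gDW"], reduced["hour"])  # extent / time
flux = reduced["mmol_per_gDW_per_hr"]
consistent = rate[1] == flux[1] and math.isclose(rate[0], flux[0])
print(consistent)
```

Both sides reduce to mole¹ · gram⁻¹ · second⁻¹ with the same numeric factor, so in this model the flux unit is consistent with the declared extentUnits and timeUnits.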