cellml / libcellml

Repository for libCellML development.
https://libcellml.org
Apache License 2.0
17 stars 21 forks source link

Component::setMath() doesn't set the cellml namespace #708

Open kerimoyle opened 4 years ago

kerimoyle commented 4 years ago

There's a difference in behaviour between parsing a model's math and setting it through the API. This test prints two different MathML strings, and the "fixed" one doesn't have the cellml namespace added, so generates all kinds of libxml2 validation errors. I think it should be treated consistently between the two - either specifying the namespace on the model or on the MathML, but not depending on whether it's done through the parser or the component->setMath function.

The parser reads this file (note location of namespaces):

<?xml version='1.0' encoding='UTF-8'?>
<model name="my_model" xmlns="http://www.cellml.org/cellml/2.0#" xmlns:cellml="http://www.cellml.org/cellml/2.0#">
    <component name="my_component">
        <variable name="x" units="dimensionless"/>
        <math xmlns="http://www.w3.org/1998/Math/MathML">
            <apply>
                <eq/>
                <ci>x</ci>
                <cn cellml:units="dimensionless">1</cn>
            </apply>
            <apply>
                <eq/>
                <ci>x</ci>
                <cn cellml:units="dimensionless">3</cn>
            </apply>
        </math>
    </component>
</model>
TEST(Validator, KRM)
{
    auto validator = libcellml::Validator::create();
    auto parser = libcellml::Parser::create();
    auto model = parser->parseModel(fileContents("analyser/overconstrained.cellml"));

    validator->validateModel(model); 
    printIssues(validator);   // No validation errors.

    auto analyser = libcellml::Analyser::create();
    analyser->analyseModel(model);
    printIssues(analyser);
    printModel(model);

This prints:

Error 1: 
Variable 'x' in component 'my_component' is computed more than once.
MODEL: 'my_model'
    UNITS: 0 custom units
    COMPONENTS: 1 components
        [1]: my_component
            VARIABLES: 1 variables
                [1]: x [dimensionless]
          Maths in the component is:
<math xmlns="http://www.w3.org/1998/Math/MathML" xmlns:cellml="http://www.cellml.org/cellml/2.0#">
  <apply>
    <eq/>
    <ci>x</ci>
    <cn cellml:units="dimensionless">1</cn>
  </apply>
  <apply>
    <eq/>
    <ci>x</ci>
    <cn cellml:units="dimensionless">3</cn>
  </apply>
</math>
    // Adding in a "fixed" mathml string:
    std::string mathml =
        "<math xmlns=\"http://www.w3.org/1998/Math/MathML\">\n"
        "  <apply>\n"
        "    <eq/>\n"
        "    <ci>x</ci>\n"
        "    <cn cellml:units=\"dimensionless\">3</cn>\n"
        "  </apply>\n"
        "</math>\n";
    model->component(0)->setMath(mathml);

    printModel(model);

Prints:

MODEL: 'my_model'
    UNITS: 0 custom units
    COMPONENTS: 1 components
        [1]: my_component
            VARIABLES: 1 variables
                [1]: x [dimensionless]
          Maths in the component is:
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply>
    <eq/>
    <ci>x</ci>
    <cn cellml:units="dimensionless">3</cn>
  </apply>
</math>

... with validation errors of:

Error 1: LibXml2 error: Namespace prefix cellml for units on cn is not defined. Error 2: CellML identifiers must contain one or more basic Latin alphabetic characters. Error 3: Math cn element with the value '3' does not have a valid cellml:units attribute. Error 4: W3C MathML DTD error: Namespace prefix cellml for units on cn is not defined. Error 5: W3C MathML DTD error: No declaration for attribute cellml:units of element cn.

nickerso commented 4 years ago

Parsing a model you have all the contextual information to be able to set all the required namespaces (https://github.com/cellml/libcellml/blob/develop/src/parser.cpp#L523). When a user calls setMath() directly, it is up to them to make sure the required namespaces are defined. Pretty sure that should be in the docs...or it needs to be added :)

kerimoyle commented 4 years ago

Hmmm ... that would mean that calling getMath() and then setMath() would break it? Shouldn't calling getMath() just return the unadulterated string, rather than the one with the namespace added by the parser?

nickerso commented 4 years ago

getMath() returns the current math string - so get-then-set after parsing would be fine. I don't see an issue with the parser setting the namespaces in the math string when it parses a model, that is also what any user code should be doing. Otherwise there is no way to know what prefix has been used in the namespaces - you definitely can't and shouldn't assume, for example, that a cellml namespace prefix has been defined and is the CellML 2.0 namespace (as you are doing in your "fixed" MathML string).