SED-ML / sed-ml

Simulation Experiment Description Markup Language (SED-ML)
http://sed-ml.org
Extend data model to keep track of data types of model and algorithm parameter changes #70

Open jonrkarr opened 3 years ago

jonrkarr commented 3 years ago

Presently, ChangeAttribute.newValue has a data type of string. It would be helpful to keep track of the true data type (e.g., Boolean, integer, float, or string). This would facilitate semantic interpretation of SED-ML, separate from specific simulation tools. Right now, because the data types are known only to the simulation tools that interpret SED-ML (and the simulation tools don't clearly communicate this information), it can be difficult to know what the true data type is.

Examples of where this is relevant include:

In BioSimulators, we've introduced a related concept of the default value of an algorithm parameter. The purpose of this is to provide users with a graphical user interface that displays this default and allows them to choose a different value. For numeric parameters, this would display a slider. For enumerated parameters, this would display a select box. This also requires us to keep track of the data type of each default parameter value.

luciansmith commented 3 years ago

I agree that this can be an issue, but I am unsure what your proposed solution is. Do you want the spec itself changed in some way? Or did you want the library changed? Or something else?

My first thoughts are that this is most appropriately solved at the tool level. But there might be common routines that could be put into libsedml, such as 'parse this string and tell me its most likely data type'.
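A minimal sketch of what such a "most likely data type" routine might look like, written in Python for illustration. The name `guess_type` and the returned type labels are hypothetical; nothing here is an existing libSEDML function.

```python
def guess_type(value: str) -> str:
    """Return a best-guess type label for a string value (heuristic only)."""
    text = value.strip()
    # Treat the literal words true/false as Boolean.
    if text.lower() in ("true", "false"):
        return "boolean"
    # Try the narrower numeric type first, then the broader one.
    try:
        int(text)
        return "integer"
    except ValueError:
        pass
    try:
        float(text)
        return "float"
    except ValueError:
        # Anything that is not Boolean or numeric falls back to string.
        return "string"
```

A guess like this is inherently ambiguous: the string "1" is reported as an integer here, although a tool might intend it as a Boolean flag, a float, or an identifier.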

jonrkarr commented 3 years ago

I'm proposing to add this to the specification. Something like <changeAttribute ... newValueType="xs:integer"/>. Tools that build on top of the specification (e.g., libSED-ML) would be able to take advantage of this.
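To illustrate how a tool could consume such an attribute, here is a minimal Python sketch. It assumes the proposed `newValueType` attribute (not part of the current SED-ML specification) and a small, hypothetical converter table covering a few XML Schema type names; the function name `typed_new_value` and the example target XPath are also made up for illustration.

```python
import xml.etree.ElementTree as ET


def _parse_boolean(text: str) -> bool:
    """Parse the XML Schema boolean lexical forms."""
    if text in ("true", "1"):
        return True
    if text in ("false", "0"):
        return False
    raise ValueError(f"not an xs:boolean: {text!r}")


# Hypothetical mapping from XML Schema type names to Python converters.
CONVERTERS = {
    "xs:boolean": _parse_boolean,
    "xs:integer": int,
    "xs:double": float,
    "xs:string": str,
}


def typed_new_value(change_xml: str):
    """Deserialize newValue using the (proposed) newValueType attribute."""
    elem = ET.fromstring(change_xml)
    raw = elem.get("newValue")
    # Assumption: fall back to string when no type is declared.
    type_name = elem.get("newValueType", "xs:string")
    return CONVERTERS[type_name](raw)


example = """<changeAttribute
    target="/sbml:sbml/sbml:model/sbml:listOfParameters/sbml:parameter[@id='k1']/@value"
    newValue="10" newValueType="xs:integer"/>"""
value = typed_new_value(example)
```

With an explicit type, the value is deserialized without any guessing, and an undeclared type falls back to the current string behavior.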

The data types are currently known, but only to the individual simulation tools. E.g., individual tools know how to deserialize values of their own parameters. But in the general case, if libSED-ML (or another tool) tried to guess these data types, it could be wrong. To enable interpretation, validation, manipulation, etc. separate from the individual simulation tools, it would be helpful to know these types.

For algorithm parameters, one could argue that the type information should be captured by KiSAO rather than SED-ML. Arguably, annotation with a particular KiSAO term should convey a particular data type. However, this would likely be more work for developers to support.

jonrkarr commented 3 years ago

After reading further, I realized that the type of parameter/@value is number. I didn't notice this before because libSED-ML treats this as a string. This is even more constraining. There are many instances in which parameter values are not numeric-valued. Supporting a broader range of data types will be essential to broader use of SED-ML.

fbergmann commented 3 years ago

There are two different things here: the manipulations of the model document based on modifying the XML tree (using AddXML, RemoveXML, ChangeXML to add / remove / replace elements, and ChangeAttribute / ComputeChange to change the value of the attribute itself). The Parameter from the listOfParameters in the ComputeChange is only there to appear in the mathematical equation. As such, its having a double value is, I believe, not a hindrance. (In libSEDML, the parameter value is a double for those.)

As for the algorithm parameter, we have changed the interpretation there in L1V4 so that the value can be a string, hoping that the KiSAO term declares how to interpret that string. Additionally, L1V4 allows nesting of algorithm parameters. As libSEDML (from the L1V4 branch, from which I build the Python bindings) supports L1V4, the algorithm parameters are present there, and their value type is set to string.

jonrkarr commented 3 years ago

Glad to hear the data type of parameter values has been changed to string. Strings are much more flexible in practice, although the string type doesn't capture the semantic meaning of the type of each parameter. In addition to this change, I think it could be helpful to allow annotation of types, for example using the XML Schema data types.