Code generator: add support for external model parameters

agarny commented 4 years ago

Right now, we can generate code for a model where model parameters can be one of:

Variable of integration;
State;
Constant;
Computed constant; and
Algebraic.

However, there may be cases where we don't want some model parameters to be computed and, instead, want their value to be retrieved from elsewhere. A typical case is when we want to apply a voltage clamp protocol to a model of cardiac electrophysiology. In that case, we don't want to compute the membrane potential (i.e. via the typical dV/dt equation) and instead provide its value at a given time.

When it comes to the Analyser class, we could achieve this by updating our processModel() method. It currently takes a model as a parameter, but we could also pass it an optional list of variables for which we want to ignore the computation. Then when it comes to the generated code, we would have a callback in place of the equation that would have normally computed the model parameter. Will need to think a bit more about it, but I imagine that only non-VOI variables could become "external" variables.

@nickerso, we are going to need this for SPARC at some point, so I am going to work on this after issue #499. Otherwise, I remember us discussing this issue at some point, but I can't recall the details, so feel free to feed into this issue, if you remember anything. 🙂

nickerso commented 4 years ago

An initial implementation could be as you describe, but rather than a callback (which would be dreadfully slow in a simulation, right?) the user would be able to give a list of variables that would essentially be "constant" as far as the generator is concerned. These constants would be a new array and the user just provides that array as needed to the current generated methods with everything in the right positions.
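
A minimal sketch of this array-based alternative, for illustration only. The extra `externals` parameter, the `EXTERNAL_V` index and the toy rate equation are assumptions, not the signature that the generator actually produces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: a single external value, the clamped voltage V. */
enum { EXTERNAL_V = 0, EXTERNAL_COUNT = 1 };

/* Sketch of a generated method that reads externals as if they were
   constants. The caller must refresh `externals` before each call. */
void computeRates(double voi, const double *states, double *rates,
                  const double *externals)
{
    (void) voi; /* not needed by this toy equation */

    assert(externals != NULL);

    double V = externals[EXTERNAL_V];

    /* Toy equation: the state relaxes towards the externally supplied V. */
    rates[0] = (V - states[0]) / 2.0;
}
```

With this design, the caller owns the update schedule: the array has to be refreshed every time the solver asks for the rates, which is the point raised in the next comment about solvers such as CVODE.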

agarny commented 4 years ago

A callback would just be like calling a math function, so not that bad, I would think.

> These constants would be a new array and the user just provides that array as needed to the current generated methods with everything in the right positions.

How would that work exactly? Especially with a solver like CVODE?

agarny commented 4 years ago

@hsorby, @kerimoyle and @nickerso: a head start on how I am currently thinking about this issue. Say that a CellML file defines a variable x whose evaluation depends on the prior evaluation of two other variables, y and z, then the analysis will entirely ignore y and z and consider x as being managed.

Does that make sense to you guys?

kerimoyle commented 4 years ago

Hi @agarny! That doesn't quite make sense to me ... where an internal variable (i.e. one of the form which we currently have, which is computed and analysed etc.) depends on others (as in x = y + z), then it's status quo - we already handle that in the generator? If we had another category of variables (maybe EXTERNAL_CONSTANT or MEASUREMENT or EMPIRICAL or whatever; is this what you mean by "managed"?) then they should be treated like we currently treat our CONSTANT and COMPUTED_CONSTANT types by the analyser? i.e. check that you can find the value and that's it?

agarny commented 4 years ago

If we have x = y + z and we tell the analyser that x is an external model parameter then x would indeed be considered as some kind of constant by the analyser. However, its type wouldn't be CONSTANT or even COMPUTED_CONSTANT, but most likely EXTERNAL. As for the y and z, they would simply be dropped (assuming that they were only used to compute x).

Then, when it comes to the generator, it would generate something like the following for x:

variables[3] = externalVariable(voi, 3);

i.e. call a callback method, passing it the value of the VOI (for which we want the value of x) and the index of the variable of interest (i.e. 3 for x in this example).

kerimoyle commented 4 years ago

I would expect that the analyser would return an over-constrained error if you had an equation with x = y + z as well as x = external constant? That's two sources of its value? And are the externals going to be restricted to VOI ... if so, why?

agarny commented 4 years ago

> I would expect that the analyser would return an over-constrained error if you had an equation with x = y + z as well as x = external constant? That's two sources of its value?

x = y + z would be dropped in favour of x = external value. So, only one source.

> And are the externals going to be restricted to VOI ... if so, why?

I would say so. At the end of the day, you want to compute the model and for this you need to know the value of x at a given "time" (i.e. value of the VOI). What people do in the callback method is completely up to them (e.g. lookup table with linear interpolation, compute a value based on whatever).
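
As an illustration of the "lookup table with linear interpolation" option, here is a minimal sketch of such a callback. The sample times, the sample values and the index 3 (taken from the example above) are assumptions made for the sake of the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical recorded data: sample times (VOI) and matching values. */
static const double TIMES[] = {0.0, 10.0, 20.0};
static const double VALUES[] = {-80.0, 0.0, -80.0};

#define SAMPLE_COUNT (sizeof(TIMES) / sizeof(TIMES[0]))
#define X_INDEX 3 /* index of x in the example above */

double externalVariable(double voi, size_t index)
{
    assert(index == X_INDEX); /* only one external in this sketch */
    (void) index;

    /* Clamp to the first/last sample outside the recorded range. */
    if (voi <= TIMES[0]) {
        return VALUES[0];
    }

    if (voi >= TIMES[SAMPLE_COUNT - 1]) {
        return VALUES[SAMPLE_COUNT - 1];
    }

    /* Find the first sample at or after voi, then interpolate linearly
       between it and the previous sample. */
    size_t i = 1;

    while (voi > TIMES[i]) {
        ++i;
    }

    double weight = (voi - TIMES[i - 1]) / (TIMES[i] - TIMES[i - 1]);

    return VALUES[i - 1] + weight * (VALUES[i] - VALUES[i - 1]);
}
```

A linear scan is enough for a handful of samples; a longer recording would probably call for a binary search or for remembering the last interval used.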

kerimoyle commented 4 years ago

... how does that work with your example of specifying a voltage in a clamp experiment then? Sorry, I'm quite confused ...

agarny commented 4 years ago

The voltage clamp protocol would be implemented in the callback method.
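
For instance, a single-step protocol could be sketched as follows. The holding and step potentials, the step timing and the index 3 are all assumptions; computeVariables() is reduced here to the one line that matters.

```c
#include <assert.h>
#include <stddef.h>

#define V_INDEX 3 /* hypothetical index of the membrane potential */

/* Hypothetical step protocol: hold at -80 mV, step to 0 mV for
   10 <= voi < 20 (in the model's time unit), then return to -80 mV. */
double externalVariable(double voi, size_t index)
{
    assert(index == V_INDEX);
    (void) index;

    return ((voi >= 10.0) && (voi < 20.0)) ? 0.0 : -80.0;
}

/* Sketch of the generated code: the equation that would normally compute
   the membrane potential is replaced by a call to the callback. */
void computeVariables(double voi, double *variables)
{
    variables[V_INDEX] = externalVariable(voi, V_INDEX);
}
```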

kerimoyle commented 4 years ago

Hmmm ... it still feels too restrictive to me. Is there a fundamental difficulty in allowing the callback to be based on any variable? In essence, the callback is just another way of writing resets, just applied (probably) more frequently, and (perhaps) in a continuous way? So unless there's something really tricky about allowing any kind of variable, then I would vote for that ... I'd also still expect the analyser to notify the user where some information would be dropped (as above); perhaps not an error, but at least a warning.

agarny commented 4 years ago

> Hmmm ... it still feels too restrictive to me. Is there a fundamental difficulty in allowing the callback to be based on any variable?

We could certainly make our call to the callback method look something like:

variables[3] = externalVariable(voi, states, rates, variables, 3);

However, this would be done from our computeRates() and computeVariables() methods, which means that we probably ought to "compute" all the EXTERNAL variables before computing anything else. Indeed, the contents of rates and variables change as a result of calling computeRates() and computeVariables(). So, we don't want externalVariable() to be called with the contents of rates and variables being a mix of values at time t and t+dt.
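
A sketch of that ordering, under assumptions: the callback is passed in as a function pointer, there are only two variables (an external X and an algebraic Y that depends on it), and `clampToState` is a made-up user callback.

```c
#include <assert.h>
#include <stddef.h>

typedef double (*ExternalVariable)(double voi, const double *states,
                                   const double *rates,
                                   const double *variables, size_t index);

enum { X = 0, Y = 1, VARIABLE_COUNT = 2 };

/* Hypothetical user callback: x is twice the first state. */
double clampToState(double voi, const double *states, const double *rates,
                    const double *variables, size_t index)
{
    (void) voi;
    (void) rates;
    (void) variables;

    assert(index == (size_t) X);
    (void) index;

    return 2.0 * states[0];
}

void computeVariables(double voi, const double *states, const double *rates,
                      double *variables, ExternalVariable externalVariable)
{
    /* 1. Externals first: nothing has been overwritten yet, so the
          callback sees arrays that all belong to the same time point. */
    variables[X] = externalVariable(voi, states, rates, variables, X);

    /* 2. Then everything that depends on the externals. */
    variables[Y] = variables[X] + 1.0;
}
```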

> In essence, the callback is just another way of writing resets, just applied (probably) more frequently, and (perhaps) in a continuous way?

That wasn't the original intent, but I guess it could be used for that purpose indeed.

> So unless there's something really tricky about allowing any kind of variable, then I would vote for that ...

I am certainly fine with that, providing all EXTERNAL variables are computed before computing rates and other variables.

> I'd also still expect the analyser to notify the user where some information would be dropped (as above); perhaps not an error, but at least a warning.

Yes, that was my plan. I agree that generating an error would be wrong, but I don't think that generating a warning is good either. A hint wouldn't be right either. I would be in favour of adding another type of issue: INFORMATION.

nickerso commented 4 years ago

Couple of random comments:

Not sure what is meant by completely ignoring y and z? Even if not used in computing anything, they could still be data generator targets in SED-ML, so they'd still be available, right?

Presumably the user of this is going to have access to the full state of the model in their callback function, so I'd go with that externalVariable method only providing the "index" of the required variable and document it such that the user knows what is expected. Or if there is a concern that the user-provided arrays for the current state of the model are not updated for the specific integration point required, then I'd be including the state in the callback method as well as the variable of integration. Otherwise as @kerimoyle points out, this would be restricted to the simplest use case and wouldn't help, for example, if you were setting a value based on the current concentration of something cool rather than just loading a series of data over time...

kerimoyle commented 4 years ago

And another thought ... we don't have code generation for resets yet, right? Because we don't know how an implementation might need those data? How is this any different? Or, put another way, if we can "do" external data sources, surely we can "do" resets in the same way since they're a tiny subset of this arrangement?

agarny commented 4 years ago

> Not sure what is meant by completely ignoring y and z? Even if not used in computing anything, they could still be data generator targets in SED-ML, so they'd still be available, right?

Good point. I clearly hadn't thought about this! Ok, so we won't drop anything at all, just recategorise things, if needed. IOW, here, y and z would be left unchanged and x would be recategorised as an EXTERNAL variable.
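
The recategorisation could be pictured as follows. This is only a sketch: the `VariableType` enum, the `Variable` struct and `markExternal()` are hypothetical names, not the analyser's API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical variable categories, with EXTERNAL added to the five
   listed at the top of this issue. */
typedef enum {
    VARIABLE_OF_INTEGRATION,
    STATE,
    CONSTANT,
    COMPUTED_CONSTANT,
    ALGEBRAIC,
    EXTERNAL
} VariableType;

typedef struct {
    const char *name;
    VariableType type;
} Variable;

/* Recategorise a variable as EXTERNAL. Returns 1 on success and 0 if the
   request is refused (the variable of integration cannot be external).
   Other variables, such as y and z, are left untouched. */
int markExternal(Variable *variable)
{
    assert(variable != NULL);

    if (variable->type == VARIABLE_OF_INTEGRATION) {
        return 0;
    }

    variable->type = EXTERNAL;

    return 1;
}
```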

> Presumably the user of this is going to have access to the full state of the model in their callback function, so I'd go with that externalVariable method only providing the "index" of the required variable and document it such that the user knows what is expected. Or if there is a concern that the user-provided arrays for the current state of the model are not updated for the specific integration point required, then I'd be including the state in the callback method as well as the variable of integration. Otherwise as @kerimoyle points out, this would be restricted to the simplest use case and wouldn't help, for example, if you were setting a value based on the current concentration of something cool rather than just loading a series of data over time...

Ok, thanks to @kerimoyle, we agree that it would be good for the callback method to, somehow, get access to the states, rates and variables arrays. Now, when it comes to the callback method, it may or may not be possible to get access to those arrays. When it comes to C code generation, the only way to get access to those arrays (if they are not passed to the callback method) is to declare them as global arrays, which is not neat.

So, once again, we don't want to force the user to do this or that, or even assume this or that. So, an always-working solution is to pass those arrays to the callback method.
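
A sketch of what a user callback could look like when the arrays are passed in. The rule itself (switch to 1.0 once a concentration stored in states[0] exceeds 1.0) and the index 3 are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define X_INDEX 3 /* hypothetical index of the external variable */

/* Everything the callback needs arrives through its parameters, so no
   global arrays are required and several model instances can share it. */
double externalVariable(double voi, const double *states,
                        const double *rates, const double *variables,
                        size_t index)
{
    (void) voi;
    (void) rates;
    (void) variables;

    assert(index == X_INDEX);
    (void) index;

    /* Hypothetical rule based on the current concentration in states[0]. */
    return (states[0] > 1.0) ? 1.0 : 0.0;
}
```

Because the callback only reads what it is handed, the same function works for two simulations running side by side, each with its own states array.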

> And another thought ... we don't have code generation for resets yet, right?

No, we don't... yet.

> Because we don't know how an implementation might need those data? How is this any different? Or, put another way, if we can "do" external data sources, surely we can "do" resets in the same way since they're a tiny subset of this arrangement?

@hsorby (especially) and @MichaelClerx worked on this at HARMONY earlier in March. I honestly can't recall the details, so I will leave it to @hsorby to describe "our" reset strategy.

cellml / libcellml

Code generator: add support for external model parameters #627