gdudziuk opened 4 years ago
An update. As long as this implementation relies on the form (Eq. 2) of the target function, it is important that the std_j do not depend on the dynamic parameters r nor on the scale parameter s, neither explicitly nor implicitly, e.g. via model observables. This excludes error models with relative measurement noise, e.g. std(q1,q2,s,r) = sqrt(q1^2 + q2^2*y(s,r)^2), even if the error parameters q1 and q2 are constant.
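To make this concrete, here is a small self-contained NumPy sketch (not D2D code; all numbers are invented for illustration). With constant errors, the closed-form optimal scale from (Eq. 4) agrees with a brute-force minimization of the objective; with relative measurement noise of the above form, the minimizer shifts away from that closed-form value:

```python
import numpy as np

# Invented toy data: measurements Y at model values x, true scale near 2.
Y = np.array([2.3, 3.6, 6.1])
x = np.array([1.0, 2.0, 3.0])
q1, q2 = 0.2, 0.1
s_grid = np.linspace(1.5, 2.5, 100001)

# Case 1: constant errors std_j = q1. The closed form (Eq. 4)
# matches a brute-force search over s.
J_const = [np.sum(((Y - s * x) / q1) ** 2) for s in s_grid]
s_const_grid = s_grid[np.argmin(J_const)]
s_closed = np.sum(Y * x) / np.sum(x ** 2)  # (Eq. 4); a constant std cancels
assert abs(s_const_grid - s_closed) < 1e-3

# Case 2: relative noise, std_j(s) = sqrt(q1^2 + q2^2*(s*x_j)^2).
# The std now depends on the scale, so the derivation behind (Eq. 4)
# no longer applies, and the true minimizer moves away from it.
def J_rel(s):
    var = q1 ** 2 + q2 ** 2 * (s * x) ** 2
    return np.sum((Y - s * x) ** 2 / var)

s_rel_grid = s_grid[np.argmin([J_rel(s) for s in s_grid])]
assert abs(s_rel_grid - s_closed) > 0.01
```

This is the same computation that motivates enforcing experimental errors: once std_j depends on s (or r), (Eq. 4) is no longer the true inner optimum.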
So I need to write another commit to fix this. I will enforce the use of experimental errors if the hierarchical optimization is enabled. Such a solution seems restrictive, but any error model whose values depend neither on s nor on r and whose error parameters are constant can always be represented as concrete error values for particular replicates, i.e. can be represented as experimental errors. So in fact such a solution is not more restrictive than necessary. I will commit in a moment. (Edit: Done, cf. commit 1b7fb04c)
Dear gdudziuk, thanks for your contribution. At first glance, it looks very valuable. I also deeply appreciate your efforts in documentation. We will check the code during the next days.
Dear gdudziuk, sorry for my delayed response. I started testing and found the following problem: you (sometimes) do not consider that there might be several models, i.e. ar.model needs an index ar.model(m). So you always need a loop like `for m=1:length(ar.model), ar.model(m) ...`
Example: lines 24-25 in arInitHierarchical.m:
`xSyms = cellfun(@sym, ar.model.x, 'UniformOutput', false);`
`zSyms = cellfun(@sym, ar.model.z, 'UniformOutput', false);`
Could you please update the code accordingly? Thanks in advance. Best, Clemens.
Another question: I did the following:
```
arCopyBenchmarkModels('Swa')
cd Swameye_PNAS2003\
Setup
arInitHierarchical
arDisableHierarchical % Switch back to the "normal mode"
```
There is a warning that I don't understand and obviously no scaling parameters were found. Why?
```
>> arInitHierarchical
Warning: Unable to prove 'offset_tSTAT == 0'.
  In symengine
  In sym/isAlways (line 42)
  In arInitHierarchical (line 80)
Warning: Unable to prove 'offset_pSTAT == 0'.
  In symengine
  In sym/isAlways (line 42)
  In arInitHierarchical (line 80)
Warning: No scale parameters found for hierarchical optimization.
  In arInitHierarchical (line 224)

>> arDisableHierarchical % Switch back to the "normal mode"
Error using arDisableHierarchical (line 18)
Hierarchical optimization is not enabled. You have not called arInitHierarchical
or there are no scale parameters suitable for hierarchical optimization in your
observables.
```
You (sometimes) do not consider that there might be several models, i.e. ar.model needs an index ar.model(m).
Right. Fixed in commit a35790f1. Thank you for pointing this out.
In addition I have found another small issue - I will fix it this week. I will also take a look at the model of Swameye and let you know what's going on there.
I have played a bit more with the example models shipped with D2D and discovered some issues in my code. They are already fixed. From the model "Becker_Science2010" I have learned that D2D allows repeated measurements in the data sheets (which I should have figured out earlier; see commit 7feb387f), and from the model "Swameye_PNAS2003" I have learned that there may also be missing measurements in the data sheets (see commit 14319ec5). I hope that hierarchical optimization now works correctly with all possible data.
Concerning the warning "Unable to prove 'some_parameter == 0'.", this is a warning raised by the symbolic function isAlways, which is used in arInitHierarchical. Since this warning was occurring during normal operation of arInitHierarchical, I have decided to switch it off (see commit 050f1146).
Concerning the warning "No scale parameters found for hierarchical optimization.", this is correct since there are no such parameters in the model "Swameye_PNAS2003". At the moment, my implementation supports hierarchical optimization only for observables of form scale*xz, and the model of Swameye has observables of form scale*xz + offset, which is a different case.
I suspect that this may be a problem for you. I can see that many D2D examples involve not only scales but also offsets, so you probably would like to have the hierarchical optimization generalized to cover setups involving offsets. But if it is acceptable for you to have the simple implementation as it is now, I can provide a toy model which demonstrates the use of hierarchical optimization in the present state of development.
Of course, for test purposes you can get hierarchical optimization to work with the model of Swameye by manually removing the offset parameters from that model.
So, what do you think, @clemenskreutz?
I have implemented a simple form of hierarchical optimization in D2D and I would like to propose to pull this feature upstream.
The method of hierarchical optimization is described in detail in (Weber et al., 2011) and (Loos et al., 2018). Let me however illustrate this method for the following simplified parameter estimation problem:

min_p J_1(p),    J_1(p) = sum_j ( (Y_j - y(p,t_j)) / std_j )^2,    (Eq. 1)

where Y_j and std_j represent experimental measurements and associated standard deviations given on an arbitrary scale, and y(p,t_j) is the model observable corresponding to the measurements Y_j. Assume that the observable y is of form y(p,t) = s*x(r,t), where x(r,t) denotes the model species (or some derived variable) corresponding to the measurements Y_j, s is a scale parameter to be estimated along with the dynamic parameters r, and p = (r,s). Then, the above optimization problem becomes

min_{r,s} J_2(r,s),    J_2(r,s) = sum_j ( (Y_j - s*x(r,t_j)) / std_j )^2.    (Eq. 2)

The core observation is that the function J_2 is quadratic and positive definite w.r.t. the scale s, assuming that at least some of x(r,t_j) are nonzero. Thus, given parameters r, the optimal scale s = s(r) is uniquely determined and the optimization problem (Eq. 2) is equivalent to

min_r J_3(r),    J_3(r) = J_2(r, s(r)).    (Eq. 3)

The optimal scale s(r) may be expressed explicitly. To do so, it is enough to resolve the condition grad_s J_2(r,s) = 0 w.r.t. s, which gives:

s(r) = sum_j ( Y_j * x(r,t_j) / std_j^2 ) / sum_j ( x(r,t_j)^2 / std_j^2 ).    (Eq. 4)

Having this, we may also explicitly compute the gradient of J_3 w.r.t. r (assuming that we can compute the sensitivities of x w.r.t. r) and use it to feed any gradient-based optimization algorithm.

To sum up, thanks to the specific form of the observable y, namely y = s*x, we can replace the optimization problem (Eq. 2) with the equivalent problem (Eq. 3), which has fewer parameters to optimize. This is the method of hierarchical optimization. From my experience (and as one could expect), this can significantly improve the optimization speed. The trade-off is that we need to assume a specific form of the observable y. Nevertheless, I think that the case of y = s*x is common enough to implement hierarchical optimization for it.

Theoretically, the method outlined above may be generalized to more complex observables, like e.g. y = s*x + b, where b denotes an offset parameter. Such observables perhaps cover the vast majority of practical applications. Having y = s*x + b, we proceed analogously as above, namely instead of (Eq. 2) we have

min_{r,s,b} J(r,s,b),    J(r,s,b) = sum_j ( (Y_j - s*x(r,t_j) - b) / std_j )^2,

and we need to compute both the optimal scale s = s(r) and the optimal offset b = b(r). The problem with such a generalization is very pragmatic: the formulas for s(r) and b(r) become much more complex than in the simple case of y = s*x discussed above. For this reason I have implemented the hierarchical optimization just for the latter simple case and, to be honest, I think I will not implement the generalization to y = s*x + b
soon.

Usage
There is a function arInitHierarchical which detects all observables of form y = s*x and enables the "hierarchical mode" of optimization. To get it to work, just put arInitHierarchical before arFit (or arFitLHS or anything like that). There is also a function arDisableHierarchical which switches back to the "normal mode". So an example setup with hierarchical optimization may look like this:

The optimal scales s(r) are computed by the function arCalcHierarchical, which is called from arCalcRes and arCalcSimu. As a result, any D2D function that internally calls arSimu can benefit from the "hierarchical mode". For example, in the following snippet the scale parameters (if any) will be automatically fine-tuned prior to generating the plots.
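As an aside, the algebra behind the inner step, i.e. (Eq. 4) and the gradient of J_3, can be illustrated with a short self-contained NumPy sketch (not D2D code; the toy model x(r,t) = exp(-r*t) and all numbers are invented). Note that because grad_s J_2 vanishes at s = s(r), the gradient of J_3 w.r.t. r needs only the sensitivities of x (the envelope theorem):

```python
import numpy as np

# Invented toy setup: model x(r, t) = exp(-r*t), data Y_j on an
# arbitrary scale, known standard deviations std_j.
t = np.array([0.0, 0.5, 1.0, 2.0])
r_true, s_true = 0.8, 3.0
Y = s_true * np.exp(-r_true * t) + np.array([0.02, -0.05, 0.03, 0.01])
std = np.array([0.1, 0.1, 0.2, 0.2])

def x(r):        # model "species"
    return np.exp(-r * t)

def dx_dr(r):    # sensitivity of x w.r.t. r
    return -t * np.exp(-r * t)

def s_opt(r):    # closed-form optimal scale (Eq. 4)
    xr = x(r)
    return np.sum(Y * xr / std**2) / np.sum(xr**2 / std**2)

def J2(r, s):    # (Eq. 2)
    return np.sum(((Y - s * x(r)) / std)**2)

def J3(r):       # (Eq. 3)
    return J2(r, s_opt(r))

# 1) s_opt(r) indeed minimizes J2(r, .) for fixed r:
r0 = 1.1
s_grid = np.linspace(0.1, 10.0, 100001)
s_best = s_grid[np.argmin([J2(r0, s) for s in s_grid])]
assert abs(s_best - s_opt(r0)) < 1e-3

# 2) Gradient of J3 w.r.t. r. Since dJ2/ds = 0 at s = s_opt(r), only
# the explicit r-dependence survives:
#   dJ3/dr = -2 * sum_j (Y_j - s*x_j) * s * dx_j/dr / std_j^2.
def dJ3_dr(r):
    s = s_opt(r)
    return np.sum(-2.0 * (Y - s * x(r)) * s * dx_dr(r) / std**2)

h = 1e-6
fd = (J3(r0 + h) - J3(r0 - h)) / (2 * h)
assert abs(fd - dJ3_dr(r0)) < 1e-4
```

In D2D the sensitivities of x would presumably come from the simulation rather than a hand-written dx_dr; the algebra is the same.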
Limitations

The implementation in this pull request strongly relies on the specific form (Eq. 2) of the optimization problem. Consequently, any D2D feature that modifies J_2 as in (Eq. 2) into something else cannot be used with the hierarchical optimization, at least until suitable generalizations are implemented. Up to what I know, these interfering features are:

- error models in which the standard deviations std_j depend on the dynamic parameters r or the scale parameters s, even implicitly (cf. the comment below),
- some of the features implemented in arCollectRes which modify the target function J_2.

With these features enabled, the current implementation of the hierarchical optimization works incorrectly since the optimal scales s(r) are in such a case different than in (Eq. 4). The function arCalcHierarchical raises an error if any of these features is in use. I have done my best to make the above list complete. However, I'm a newcomer to D2D, so there may be some other features that I don't know yet and that may interfere with the hierarchical optimization by modifying the merit function J_2. So if anybody finds that there are some more features like that in D2D, please let me know.

Further development
The possible directions of generalization of the method are:

- generalization to observables of form y = s*x + b, possibly with some restricting assumptions.
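If someone picks this up, it might help that for y = s*x + b the inner problem is still closed-form: for fixed r (and std_j independent of the optimized parameters), minimizing over (s, b) is a weighted linear regression, i.e. a 2x2 linear system of normal equations. A self-contained NumPy sketch of that idea (not D2D code; data and weights are invented):

```python
import numpy as np

# Invented toy data: Y roughly 2*x + 1, with known standard deviations.
x = np.array([0.5, 1.0, 2.0, 4.0])
Y = np.array([2.1, 3.0, 5.2, 8.9])
std = np.array([0.1, 0.1, 0.2, 0.2])
w = 1.0 / std**2

def J(s, b):  # weighted least squares with scale and offset
    return np.sum(w * (Y - s * x - b)**2)

# Normal equations for min_{s,b} J(s, b):
#   [ sum w x^2   sum w x ] [ s ]   [ sum w x Y ]
#   [ sum w x     sum w   ] [ b ] = [ sum w Y   ]
A = np.array([[np.sum(w * x**2), np.sum(w * x)],
              [np.sum(w * x),    np.sum(w)]])
rhs = np.array([np.sum(w * x * Y), np.sum(w * Y)])
s_opt, b_opt = np.linalg.solve(A, rhs)

# The gradient of J vanishes at (s_opt, b_opt) ...
g_s = -2.0 * np.sum(w * x * (Y - s_opt * x - b_opt))
g_b = -2.0 * np.sum(w * (Y - s_opt * x - b_opt))
assert abs(g_s) < 1e-8 and abs(g_b) < 1e-8

# ... and J is strictly larger at nearby points (J is strictly convex
# in (s, b) as long as the x_j are not all equal).
for ds, db in [(0.01, 0.0), (-0.01, 0.0), (0.0, 0.01), (0.0, -0.01)]:
    assert J(s_opt, b_opt) < J(s_opt + ds, b_opt + db)
```

So the 2x2 system itself is cheap; as noted above, the real work would be propagating s(r) and b(r) and their derivatives through the rest of the machinery.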