Prediction Profile Likelihoods estimates negative state values

Data2Dynamics / d2d

a modeling environment tailored to parameter estimation in dynamical systems

https://github.com/Data2Dynamics/d2d

57 stars 28 forks source link

Prediction Profile Likelihoods estimates negative state values #148

Closed elbaraim closed 4 years ago

elbaraim commented 4 years ago

Dear d2d team,

I have been recently using the routines for prediction profile likelihoods in one of my models. I got negative lower bounds for some of the model states I am evaluating, which values should be zero (as predicted by MCMC sampling). These are the settings I am using:

PPL_options('Integrate',false)
state_indexes = find(cellfun(@isempty,strfind(ar.model.xNames,'cummulative')));
arPPL(1,1,state_indexes,linspace(1,35,21),0);

As you can see down below, model uncertainties are higher predicted by MCMC sampling but never negative. While this is not the case for PPLs.

Could you help me with this?
Am I doing something wrong?

These are the results I obtain by PPL calculation: Screenshot from 2020-03-02 10-55-30

These are the results I obtain by MCMC sampling: Screenshot from 2020-03-02 10-50-20

Thank you in advance for the help.

litwiti commented 4 years ago

Hey @elbaraim,

the same issue actually arises in the wiki example, Computation of integration based prediction bands, if you look at the plots.

It seems like the prediction bands you calculated are actually obtained from validation confidence intervals, not from prediction confidence intervals, see Likelihood based observability analysis and confidence intervals for predictions of dynamic models for details on this topic. In short: Prediction Confidence Intervals are a measure of uncertainty for model predictions, while Validation Confidence Intervals additionally include the uncertainty of a possible measurement. These are generated from Prediction Profiles, which take the model prediction as argument and from Validation Profiles, which take a data point as argument.

I am no expert on the prediction band functionalities, but I would make this educated guess: Since you have set PPL_options('Integrate',false) , prediction bands are not generated by continuous integration but by calculating the prediction/validation confidence intervals at different time points. Validation profiles have the property that evaluation for negative data values does not constitute a problem, since data only enters the objective function over the squared residuals. Hence, validation confidence intervals can possibly have negative boundaries, even if this makes no sense for model predictions. However, prediction confidence intervals respect bounds given by the model structure.

The default option whether to use validation or prediction confidence intervals in the prediction band algorithm is the validation mode. Since you have not set this explicitly, I would assume that you have calculated validation confidence intervals. In principle, this should be fixed by setting the option PPL_options('Integrate',false,'doPPL',true). But since prediction profiles are derived from validation profiles, their calculation can be possibly problematic.

Thus, if the prediction band looks odd for some time points using the prediction mode, you can check whether the prediction profile crossed the confidence threshold with arPlotPPL. If it did not cross the confidence thresholds to both sides of the optimum, you can recalculate your prediction bands with the additional options

PPL_options('Integrate',false,'doPPL',true,'xstd',sigma,'n_steps_profile',N),

where N is the number of points used to sample a profile and sigma is the standard deviation of the validation data point. Setting sigma to smaller values makes prediction and validation profiles more similar, since they should in theory coincide if sigma goes to zero. However, setting it too close to zero can lead to technical issues.

I hope this suffices to produce sensible results in your case!

elbaraim commented 4 years ago

Thank you so much for the help! I will give it a try and let you know how it goes :+1:

elbaraim commented 4 years ago

Hi @litwiti

I confirm, it works now as expected. Thank you again for the support.

Regarding the number of steps required for the profile, I have seen that:

when choosing a large number, e.g. 1000, which I need for at least one of my model states (specifically due to the lower bound) the profile calculation continues even if the likelihood threshold is already achieved (for instance on the upper bound of this specific state I just mentioned).

I attach here an screenshot of the result: Screenshot from 2020-03-04 10-46-36

You can see that the lower bound still needs a larger number of steps, while for the upper bound is already sufficient. However the upper bound is calculated for all defined steps (instead of, e.g., automatically stopping when the threshold is reached).

Note: I am using default xstd value.

This substantially increases the computation time.

Is there an easy way to stop the calculation once the threshold is reached? e.g. via some special arguments?

Thank you in advance (again) :smile:

litwiti commented 4 years ago

Hey @elbaraim,

I'm currently revising arPPL and the related functions and there seem to be some limits on which cases can be handled reasonably well by the algorithm. Unfortunately, the case you presented is problematic in the current implementation.

First of all, to respond to your issue, the upper bound calculation should be cancelled once the threshold is exceeded by some value of about 1 on the PPL-scale. Thus, your proposal of stopping if the threshold is exceeded is already implemented and should also be functional by default. Since the prediction profile looks unnaturally straight to the right side of the minimum, I suspect that plotting the profile just connects two points on the right side, yielding one straight line, such that there aren't actually many sampled points on this side. I actually managed to construct an example by use of arPPL where you can see this effect:

PPL_example

Since increasing step size proved to be not particularly efficient, I would propose to rely on 'xstd' as recommended earlier. Decreasing this value improved the sampling above to this result:

PPL_example_fixed

Mind the different scale. As you can see, the prediction profile is sampled adequately now. Setting xstd at least close to the magnitude of the prediction profile width should hopefully achieve this in your case too. So what I imagine could work would be:

Plot the prediction profiles for cases (times, state) which seem to be sampled poorly. Estimate the profile width sigma by eye (I used a sigma of 0.001 in my example. Remember: Smaller values should work better, but could be numerically unstable).
Use PPL_options with your previous arguments and additionally use the name-value pair 'xstd',sigma which you estimated for one specific time point and state. Call arPPL for this time point and state only.
Check whether the profile looks good now and repeat step 2 with another sigma if not.
Repeat this for all problematic cases, if feasible.

I can not guarantee that this will work, but it could possibly help. Since this is not an especially user-friendly fix, the function is currently under revision. But if you'd like to try this anyway, feedback would facilitate the development process immensely.

elbaraim commented 4 years ago

Sure I will give it a try. I am happy to help! And I will let you know how it goes. Thanks for the advise and support.

elbaraim commented 4 years ago

Hi @litwiti

As a first try for some of the model states (and time points) that I want to profile, it worked now following your suggestion on adjusting the xstd value. However some other points still struggle which I think I should re-run the setting with a higher number of steps and then do the tuning that you suggested. This will take me longer time to test and I still want to provide with a quick feedback now.

Some things I encountered which may be of interest for you:

When trying to rerun the arPPL for a specific state and time point using the new xstd value I got the following error: The prediction profile you want to compute for t=2.700000e+00 and state 5 already exists. If you want to overwrite them, delete the value at ar.model.data/condition.ppl.ub_fit(_vpl) The prediction profile you want to compute for t=4.400000e+00 and state 5 already exists. If you want to overwrite them, delete the value at ar.model.data/condition.ppl.ub_fit(_vpl)
First thing is that ar.model.condition.ppl.ub_fit does not exist as a field in the ar struct.
```
>> ar.model.condition.ppl.ub_fit
Reference to non-existent field 'ub_fit'.
```

The work-around is solved by setting to nan (i.e. replacing the Inf values) the corresponding entries in the following:

ar.model.condition.ppl.ppl_ub_threshold_profile(1:4,4)=nan;
ar.model.condition.ppl.ppl_ub_threshold_profile(1:4,4)=nan;

Then the function will run with no problem. Maybe this could be automatically implemented.

Regarding the topic on higher number of steps: Another thing that I observed is that if I want to run only one of this "problematic" cases for a longer profile steps (while keeping the old results untouched), the following error occurs:

Subscripted assignment dimension mismatch.

Error in arPredictionProfile (line 177)
        ar.model(m).(data_cond)(c).ppl.xtrial_profile(ts_tmp, jx,:) = xtrial_tmp;

Error in arPPL (line 129)
        arPredictionProfile(t, ppl_general_struct, true, 0, []);

That somehow indicates the initial choice of number of steps (on top of the xstd values) is something to consider as well.

I hope this information is valuable :smile:

litwiti commented 4 years ago

Hey @elbaraim,

first of all, thanks for the feedback! Knowing which bugs occur makes troubleshooting much easier. It's good to know that to have some first-hand experience that setting xstd actually is a valid option to get to better prediction profiles. I've uploaded a new version of some functions now, but these are only small fixes and mostly just a load of comments in the code.

About the issue that re-running arPPL leads to premature termination of the algorithm: I guess this is intended this way, since you do not want to overwrite already existing results without getting confirmation from the user prior to deleting the results. But the issue you mentioned is relevant anyway, since the occuring error message was obsolete: The field ar.model.data/condition.ppl.ub_fit(_vpl) does not exist anywhere in the current implementation. The work-around you found is exactly what the user is intended to do in this situation: By setting the specific entries in ar.model.condition.ppl.ppl_ub_threshold_profile corresponding to the time points and states that you want to re-run to NaN, the function is forced to re-evaluate them if you use arPPL again. Changing xstd prior to the second run should then not be an issue.
However changing the number of steps, i.e. n_steps_profile, for a second run is not possible at this point. This comes down to exactly this dimension mismatch that you stumbled upon, which occurs because the containers which store the profile results are not increased in size when re-running arPPL, producing this mismatch (although I would imagine that this is something which could be fixed in a reasonable amount of time).
Thus, as a user you need to specify the number of profile steps correctly at the beginning, meaning that you can not use the results you've already obtained. Since choosing too many steps is not particularly troublesome (except for producing a lot of NaNs), this would be the way to go when running arPPL for the first time.

I hope that the presented information is enough to completely fix your prediction bands (that is if you are still interested to actually obtain them). If I knew whether this approach worked in your case, I could summarize the ideas in the wiki for other users who run into similar problems. So if you retry this and it works (or doesn't), feedback would be very welcome.

elbaraim commented 4 years ago

Hi @litwiti I will keep you updated. Currently I am testing the pipeline is a rather small model (<10 parameters). I was concern about the fact of setting to large number of integration steps (for all states even if not needed for some) since my goal in the long run is to compute prediction profiles in larger models.

How can I more or less estimate what would be a good choice of number of steps ? I would assume this would be a "blind" choice by try-and-error.

Tuning then xstd value and re-run for a subset of the states should be fine :+1:

If you like, we can keep this issue open and I will get back here once I obtain my final results to provide further feedback.

litwiti commented 4 years ago

Hey @elbaraim,

About the number of steps for each profile you should specify: Setting n_steps_profile to large numbers should usually not be problematic from a technical point of view, since this has nothing to do with integration but specifies how many steps are used for one prediction profile in each direction of the optimum. However, using a large number of steps is inefficient if xstd is chosen poorly, because then the algorithm will try all steps although there is no real chance of getting to the threshold. Thus, I would suggest setting n_steps_profile = 200. If the prediction profile can not be sampled in 200 steps, it is most likely to a bad specification of xstd anyway.

I am currently trying to find an efficient workflow for the PPL-functions to make an analysis of prediction bands in large models more accurate and feasible. Right now, I would suggest the following:

Identify the model trajectories of interest and a set of times where the prediction profiles should be calculated. Caution: As an empirical observation, using t = 0 can often be problematic if initial conditions are fixed. Thus, I would recomend using another time point close by (but not too close).
For each trajectory, estimate a xstd by looking at the trajectories. In this setting, estimation means that you should get a feeling for the scale of the data and choose a small xstd accordingly. For example, if your trajectory spans a range of 0-100 on a data scale, sigma = 1 would about be a reasonable first choice.

Looping over all trajectories of interest, i.e. looping over different models, conditions and observables, calculate the prediction band for a vector of times which must in principle be uniquely specified for each trajectory:

loop_list = {{m1,d1,ix1,ts1,sigma1},{m2,d2,ix2,ts2,sigma2},...};
for ii = 1:length(loop_list)    
PPL_options('Integrate',false,'doPPL',true,...
    'n_steps_profile',200,'xstd',loop_list{ii,5});
arPPL(loop_list{ii,1},loop_list{ii,2},loop_list{ii,3},...
    loop_list{ii,4},1);
end

Check the results and re-run problematic experimental conditions with smaller values of xstd (which most likely is the issue for poor sampling). This is done by setting the corresponding ar.model.condition.ppl.ppl_ub_threshold_profile to NaNs and re-running the code above with newly specified values for sigma, since existing profiles are simply skipped.
If the bands should be sampled more densely at some time points, these time can also be added after the first calculation.

Note that in the third step the given code usually simplifies to several loops which can be specified more easily. For example, I was testing the prediction band calculation on the model ...\Examples\Raia_CancerResearch2011, using the following code:
```
Setup;
```

sigmas = [0.01,0.2,0.1,0.02,10,0.02,2,1]; % determine sigmas from trajectories ts = [1,3,10:10:130]; % avoid t = 0

for ii = 1:3 % loop over first 3 observables for example for jj = 2:4 % loop over conditions % condition 1 are mostly trajectories fixed to zero, % for which PPL can not be sampled
PPL_options('Integrate',false,'doPPL',true,... 'n_steps_profile',200,'xstd',sigmas(ii)); arPPL(1,jj,ii,ts,1); end end


![Raia_Example](https://user-images.githubusercontent.com/60781115/76238077-9457fa00-622f-11ea-95c8-70b7f0d1953a.png)

* Since calculation of prediction profiles can nonetheless be problematic in the current implementation, a more efficient and reliable calculation of prediction profiles is currently under construction and will soon replace the existing algorithm. This will hopefully lead to faster and more reliable prediction bands. I would suggest waiting until this change has been made before trying to tackle large models.

* And sure, if you are trying this workflow an update on if this worked reasonably well would be much appreciated.

elbaraim commented 4 years ago

Hi @litwiti Thank you very much for all the detailed steps and hints. Very much appreciated :+1:

Of course, I will keep you updated. I am happy to provide feedback!

litwiti commented 4 years ago

Hey @elbaraim,

I have replaced the currently implemented prediction/validation profile calculation (which was just a basic subfunction in the PPL algorithm) with the independent function VPL, which is documented in the article Estimating Prediction Uncertainty. After some testing on how well this worked when inserted into the PPL algorithm, I am pretty certain that the prediction profile calculation in arPPL should work a lot better now.
However, algorithm performance and robustness still depends on a good choice of xstd. Hence, don't hesitate to use arPPL with different xstd-values even along the same trajectory, e.g. if the prediction uncertainty is expected to change drastically along the time course. In cases of misspecification prediction profiles may still reach their boundaries, but may then be inaccurate (which mostly can be fixed by re-running with appropriately specified xstd). This inaccuracy will be clearly visible if checked with arPlotPPL.

For now I do not intend to further look into the prediction band code. Thus, if you do not encounter any bugs in a larger model you may close this issue. Thanks for the comments so far!

elbaraim commented 4 years ago

Hi @litwiti I have an additional question: I calculate the PPL for a given state, time point and condition at 99 confidence level.

Is it possible to automatically extract the PPL upper and lower bounds for different confidence level, e.g. 90, 95 and 99%. I am not aware of this.
Or do I need to do it manually?

litwiti commented 4 years ago

Hey @elbaraim,

if I understood correctly: You used PPL_options('alpha_level, 0.01') (and other options) to calculate profiles and intervals at the 99% confidence-level.

If additionally you used the option 'Integrate', false then this should not have worked properly until now because I did not consider other confidence levels other than 95% when I reworked the PPL-functions. But I have fixed this just now, so this is not a problem anymore.

But I think your main question was:

Assuming you have calculated prediction profiles up to a 99% level, can you also extract the 95/90% levels in the existing framework?

I don't think that such a function is implemented in this framework, so you would need to do this manually. But this shouldn't be too much of a hassle, you just need to find the intersections of the prediction profiles with the desired confidence threshold.

Hope this helps.

elbaraim commented 4 years ago

Hi @litwiti Yes, I used the option listed above and also Integrate as False. Good to know that is fixed now! :) Thank you for looking into this!

Yes, about extracting the confidence intervals I thought about what you just suggested, but I was wondering whether I missed about a function that automatically does that.

elbaraim commented 4 years ago

Hi @litwiti I have what I hope may be one of my last requests :see_no_evil: The large model I am trying to calculate the prediction profiles is computationally slow and I am running this on a server which has as upper bound of computation time 48h. I am currently using 200 steps (as discussed above) and even tried to reduce to 100 steps. However, the routine gets interrupted due to reached computation time limit.

I was wondering whether it would be possible to, e.g., split the process such that one job would run the "profile to the right" and another one the "profile to the left". Then the final result would be just merging these.

Would be this easy to implement? or is there already the possibility of doing this?

litwiti commented 4 years ago

Hey @elbaraim,

Since the prediction bands are calculated by obtaining many prediction profiles for each specified condition, have you tried to parallelize your computations by starting separate jobs for e.g. each experimental condition? This would certainly split the workload, although I don't see a trivial way to merge the results (however it is certainly possible).
Calculating profiles to the Left and to the Right separately is currently not supported for prediction profiles, but I would favor the solution that I just proposed anyway. I also think that it should be possible to implement this, but I would rather not further meddle with the original code before I break any other features/compatibilities.

I hope this answers your questions.

elbaraim commented 4 years ago

Hi @litwiti Yes, I have already tried by parallelizing by experimental condition, model state and time point. Each job runs for only one of these. For one of the models I am using, this is sufficient and it works. Then I just merge the results extracting them for each individual run (with the respective mat file).

However, for the other model I am working with (actually this one https://bmcsystbiol.biomedcentral.com/articles/10.1186/1752-0509-7-41) the parallelization by exp. cond., state and time point is not sufficient. Therefore I thought about that possibility of a further layer of parallelizing "left" and "right" profiling.

I completely understand the point on the compatibility.

litwiti commented 4 years ago

Hey @elbaraim,

even if the model is large I, it seems troublesome that a single profile takes up to 48 hours and may be a hint that something does not work as intended. Since you are already working on single profile level, I can recommed use of the function VPL which was exactly built for the task of sampling single profiles and which should already be called in the PPL code. However, use of VPL allows you to easily manipulate the parameters determining how profiles are calculated. Additionally, you can probably get a feeling for why the calculation takes so long. See the article: Estimating Prediction Uncertainty.

However, it seems strange that at most 200 local fits take up to 48 hours, even if the model is large. In order to possibly enhance optimization and integration performance, you might want to refer to the article: How to check and improve optimization.