Open-Systems-Pharmacology / PK-Sim

PK-Sim® is a comprehensive software tool for whole-body physiologically based pharmacokinetic modeling

Parameter Identification: Warning missing when CI of Covariance matrix is bigger than point estimate #1536

Open tobiasK2001 opened 4 years ago

tobiasK2001 commented 4 years ago

In a case like this (image), the point estimate has a very small value and the CI is quite large. The CI appears to include zero and negative values, which would not be appropriate here. The point estimate seems to be at the lower end of the estimation range and is extremely small. I understand that the calculation of the CI might be perturbed if the point estimate is in a local minimum. I think it would be helpful if PK-Sim made this more visible with a warning, similar to the "Estimated parameter near boundary" warning on the results tab.

Best, Tobias

Yuri05 commented 4 years ago

The point estimate seems to be at the lower end of the estimation range

This is the problem. In such a case the PI result might not be in a local minimum of the cost function, and thus the CI calculation (and all related results) might be inaccurate.

tobiasK2001 commented 4 years ago

Yes, but this is just the mathematical side of the problem. There is an underlying usability issue, too. If the estimation boundary is set to an even lower value, the warning in the result section disappears, going from this (image) to this (image). The user might think that the problem is solved by moving the boundary. But it isn't. That's why we need a warning here as well, e.g. like this: (image)

Yuri05 commented 4 years ago

If the estimation boundary is set to an even lower value, the warning in the result section disappears like from here:

Cannot reproduce this behaviour. If I perform a PI where one of the boundaries was hit and then change the boundary (without performing the PI again), the warning is still shown in the results.

That's why we need a warning here as well like e.g. here:

Agree.

Yuri05 commented 4 years ago

OK, could reproduce. When you close and reopen the PI, the results section indeed shows the new boundaries and not those used in the PI run. @tobiasK2001 Can you please create another issue for that? This is a bug.

tobiasK2001 commented 4 years ago

OMG, I am finding bugs I was not looking for :-0 . Ok I will open a new one for what you described Juri.

I was probably a bit sloppy with my description of the usability problem above. Here is another try: The user conducts PI-A and gets a point estimate close to the boundary, with a warning like this: (image). The user now thinks (maybe naively): OK, let's repeat the PI with a wider range and set the boundary 10 times lower. He clones PI-A to PI-B, changing only the lower bound, and gets a result without a warning, like this: (image)

The (still maybe a bit naive) user might think that the identification problem is solved by moving the boundary. But it isn't. He just has a quick look at the covariance matrix and the CI there, and he overlooks that the point estimate has a very small value and the CI is quite large. He also does not worry that it includes zero and negative values, which is not appropriate in that case. That's why we need a warning here as well, e.g. like this: (image) (image). Probably with a mouseover hint such as: "CI is larger than estimated value and includes zero. PI result might not be in a local minimum; CI calculation (and all related results) might be inaccurate."

Please feel free to adjust the mouseover hint where needed.

StephanSchaller commented 4 years ago

"CI is larger than estimated value and includes zero. PI result might be not in a local minimum, CI calculation (and all related results) might be innacurate."

I don't believe this is the right conclusion from a "wider-than-point-estimate" CI. It is an indication that variations within the CI around the minimum that was reached don't have a large impact on the total error. This can make sense in your case if the PI is trying to max out the inhibition with a very low Ki (which means that maximum inhibition even at very low concentrations was the best fit). But even if such a low value for Ki is the best fit, a significantly larger value produces almost the same results (as max perpetrator exposure is likely >> 3e-4 nM).

And whether a Ki value should be allowed to become negative is nothing the PI can judge.

Maybe increasing contribution of CYP1A2 will allow the PI to obtain a result with higher confidence.
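
As a rough numerical illustration of the flat-cost-function argument above (a sketch under assumptions, not PK-Sim code): assuming simple competitive inhibition, where the inhibited fraction is I / (I + Ki), and a hypothetical perpetrator concentration of 100 nM, even a hundredfold change in Ki around 3e-4 nM hardly changes the predicted inhibition. The inhibition model, the function name and the concentration are assumptions; only the 3e-4 nM order of magnitude comes from this thread.

```python
def inhibited_fraction(inhibitor_conc_nM: float, ki_nM: float) -> float:
    """Fraction of enzyme activity blocked under simple competitive
    inhibition (assumed model, substrate competition neglected)."""
    return inhibitor_conc_nM / (inhibitor_conc_nM + ki_nM)


PERPETRATOR_CONC_NM = 100.0  # hypothetical exposure, far above Ki
KI_FIT_NM = 3e-4             # order of magnitude mentioned in the thread

# Compare the fitted Ki with values 10x and 100x larger.
for ki in (KI_FIT_NM, 10 * KI_FIT_NM, 100 * KI_FIT_NM):
    frac = inhibited_fraction(PERPETRATOR_CONC_NM, ki)
    print(f"Ki = {ki:.0e} nM -> inhibited fraction = {frac:.6f}")
```

Under these assumptions all three inhibited fractions stay above 0.999, so the simulated curves and the total error barely move, which is what a CI wider than the point estimate reflects.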

Yuri05 commented 4 years ago

@tobiasK2001

The (still maybe a bit naive) user might think that the identification problem is solved by moving the boundary. But it isn't.

I disagree. I don't see any reason why the PI problem cannot be considered as solved in this case.

He just had a quick look at the covariance matrix and the CI there and he overlooks that the point estimate has a very small value and the CI is quite large. He also does not worry that it includes zero and negative values, which is not appropriate in that case.

In this case the CI can be considered as [0 .. p+θ] instead of [p-θ .. p+θ]. This is indeed not quite accurate, and I agree that in such a case we could show a warning as well (e.g. something like you proposed: "CI is larger than estimated value and includes zero. PI result might not be in a local minimum; CI calculation (and all related results) might be inaccurate.").

Alternatively, we could consider CI estimation for bounded parameters. I am not good enough at statistics to say whether the error we make by using CI estimation for unbounded parameters is large enough to justify the effort, though.
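
One common way to get an interval that respects a lower bound of zero is to build the CI on the log scale and transform it back. The sketch below uses the delta method, SE(log p) ≈ SE(p) / p. It is only an illustration of that general technique, not what PK-Sim does and not necessarily the approach meant above. The function name and the example numbers are hypothetical.

```python
import math
from statistics import NormalDist


def log_scale_ci(estimate: float, std_error: float, level: float = 0.95):
    """Approximate CI for a strictly positive parameter.

    Uses the delta method: SE(log p) ~= SE(p) / p, builds a symmetric
    interval for log p, and transforms it back with exp().
    """
    if estimate <= 0:
        raise ValueError("estimate must be strictly positive")
    z = NormalDist().inv_cdf(0.5 + level / 2)  # e.g. about 1.96 for 95 %
    half_width_log = z * std_error / estimate
    return (estimate * math.exp(-half_width_log),
            estimate * math.exp(half_width_log))


# Hypothetical numbers: tiny estimate with a comparatively large standard error.
lower, upper = log_scale_ci(estimate=3e-4, std_error=5e-3)
print(f"log-scale CI: [{lower:.3g}, {upper:.3g}]")
```

The resulting interval never includes zero or negative values, but for a poorly identified parameter it spans many orders of magnitude, so the underlying identifiability problem stays visible.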

tobiasK2001 commented 4 years ago

Yes I would favour a message like this: "CI is larger than estimated value and includes zero. CI calculation (and all related results) might be inaccurate."
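
A minimal sketch of the check behind such a message, assuming a symmetric CI of the form [p-θ .. p+θ] around a positive point estimate p. The class and function names are hypothetical and do not correspond to PK-Sim code; the half-width values in the example are made up.

```python
from dataclasses import dataclass
from typing import Optional

CI_WARNING = ("CI is larger than estimated value and includes zero. "
              "CI calculation (and all related results) might be inaccurate.")


@dataclass
class Estimate:
    name: str
    value: float       # point estimate p
    half_width: float  # theta, so that CI = [p - theta, p + theta]


def ci_warning(est: Estimate) -> Optional[str]:
    """Return the warning text if the symmetric CI of a positive
    estimate reaches zero or below, otherwise None."""
    lower = est.value - est.half_width
    if est.value > 0 and lower <= 0:
        return CI_WARNING
    return None


# Hypothetical results: one poorly identified, one well identified parameter.
for est in (Estimate("Ki", 3e-4, 1e-2), Estimate("Lipophilicity", 2.5, 0.3)):
    print(est.name, "->", ci_warning(est) or "no warning")
```

Whether zero or negative values are actually implausible depends on the parameter, so a real implementation would likely need to restrict this check to parameters that are known to be strictly positive.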

Regarding the CI estimation for bounded parameters: I have to confess that my statistics knowledge is not deep enough to judge that. I talked with a statistician about the problem, and his thinking was that the available data do not really support the estimation of that parameter, and that its uncertainty might impact the estimation of other parameters as well. So a probably better solution would be to leave that parameter out (that is in any case easier than trying to deal with CI estimation for asymmetric parameter distributions). Or one might think of implementing structural alternatives to circumvent these identifiability problems. It would be good if PK-Sim pointed the user in this direction with a small warning like the one above.

Thank you all for your input and ideas!