markovmodel / msmtools

Tools for estimating and analyzing Markov state models
GNU Lesser General Public License v3.0

[util.statistics.confidence_interval]: handling constant data #67

Closed: franknoe closed 8 years ago

franknoe commented 8 years ago

Algorithmic fix for #66. However, I am not sure what the most meaningful behavior is here. Credible intervals are defined as follows: if (l, r) is a credible interval with probability p, then the probability that a sample x satisfies l <= x <= r is p. So if our data are constant, only 0% and 100% credible intervals exist.
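As a formula, the definition and its consequence for constant data can be written roughly as follows. The symbol c for the constant value is introduced here for illustration and does not appear in the code.

$$
P(l \le x \le r) = p,
\qquad
x \equiv c \;\Longrightarrow\;
P(l \le x \le r) =
\begin{cases}
1 & \text{if } l \le c \le r,\\
0 & \text{otherwise.}
\end{cases}
$$

So for constant data no choice of (l, r) gives a probability strictly between 0 and 1.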

If we want to be hardcore, we should raise a ValueError when this function is called with a value of p other than 0 or 1. That's not very practical though, because with multidimensional data some dimensions are often constant, and we would still like the function to be applicable. For now I chose to return the constant value as both l and r irrespective of p, and to issue a warning. It's debatable whether that is the best solution. Perhaps even the warning is too annoying?
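A minimal sketch of the behavior described above, for illustration only. This is not the actual msmtools implementation: the function names `credible_interval` and `credible_intervals_by_column`, the order-statistic rule used for the bounds, and the choice of `RuntimeWarning` are all assumptions.

```python
import math
import warnings


def credible_interval(samples, p=0.95):
    """Hypothetical helper (not the msmtools API): central interval of a 1-D sample."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    xs = sorted(samples)
    if not xs:
        raise ValueError("samples must not be empty")
    # Constant data: only 0% and 100% intervals exist, so return the
    # constant as both bounds irrespective of p, and warn the caller.
    if xs[0] == xs[-1]:
        warnings.warn(
            "constant data: returning the constant value as both bounds",
            RuntimeWarning,
        )
        return xs[0], xs[0]
    # Assumed rule: cut (1 - p) / 2 of the sorted samples from each tail,
    # rounding outward so the interval covers at least a fraction p.
    n = len(xs)
    tail = (1.0 - p) / 2.0
    lower = xs[int(math.floor(tail * (n - 1)))]
    upper = xs[int(math.ceil((1.0 - tail) * (n - 1)))]
    return lower, upper


def credible_intervals_by_column(rows, p=0.95):
    """Apply credible_interval to each column (dimension) of a list of rows."""
    return [credible_interval(column, p) for column in zip(*rows)]


# First dimension is constant, second varies.
rows = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]
print(credible_intervals_by_column(rows, p=0.5))
```

With this approach a single constant dimension does not abort the call for the other dimensions. The caller can silence the warning with the standard `warnings` filters if it turns out to be too noisy.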

@trendelkampschroer, can you comment, or merge if you are happy with this solution? If it is acceptable, please mirror this change in pyemma.utils.statistics in your recent PR.

Fixes #66