Closed jinyuan-jia closed 2 years ago
no idea yet exact power functions for proportions are not monotonic, but I guess (!) that shouldn't be the case here.
"beta" method just calls stats.beta.ppf and isf. So if there is a bug, then it is most likely some numerical problems in scipy special for some corner/extreme cases.
adjusted case, non-monotonicity starts at alpha=1e-5
count_list=np.arange(1, 20) #[0,10,100,1000]
alpha=1e-11
alpha=1e-5
res = []
for count in count_list:
print(min(max(count, 1e-15), 100000-1e-15))
low, upp = proportion_confint(min(max(count, 1e-15), 100000-1e-15), 100000, alpha=alpha, method="beta")
print((low, upp))
res.append([low, upp])
res = np.asarray(res)
print("\n mindiff", np.diff(res, axis=0).min(0))
mindiff [ 3.16063160e-08 -3.50507594e-02]
upp is shifted upwards for count is 3, 4 or 5
Thanks for your time. I checked the source code and I guess the issue comes from scipy.stats.beta.isf . Again, thanks for your response.
I opened an issue for scipy.special
Thanks josef-pkt.
looks like scipy 1.7 should have this problem fixed underlying functions for beta are now taken from Boost.
needs checking, then we can close this issue
Thank you josef-ptk. Let me close the issue.
Describe the bug
Hello, I met a bug when I use function statsmodels.stats.proportion.proportion_confint to estimate confidence interval. I have attached the code to reproduce the issue.
Code Sample, a copy-pastable example if possible
from statsmodels.stats.proportion import proportion_confint count_list=[0,10,100,1000] for count in count_list: print(proportion_confint(min(max(count, 1e-15), 100000-1e-15), 100000, alpha=1e-11, method="beta"))
Expected Output
The output is as follows (lower bound, upper bound): (0.0, 0.00026020034420181726) (3.4637672992749764e-06, 0.059396009405412224) (0.00046224918454042924, 0.0018509801594462916) (0.008004250120330274, 0.012302551885285284)
As the result shows, when count=10, the upper bound is 0.059. However, when count=100, the upper bound is 0.00185 (smaller than 0.059) which is undesired. In particular, the upper bound should increase with the increasing of count.
Output of
import statsmodels.api as sm; sm.show_versions()