javadba closed 3 years ago
Thanks for reporting! I will look into it.
I cannot reproduce the error. Can you tell me which version you are using?
import benfordslaw
print(benfordslaw.__version__)
The version should be >= 1.0.3.
I also included a boolean output in the latest release (1.0.3) under the key P_significant. You can do the "sorry, nope." now.
I get the following results when using your code:
import numpy as np
from benfordslaw import benfordslaw
bl = benfordslaw(alpha=0.05)
x = np.linspace(0,1000,1001)
x = np.append(x,[1,1,1,1,1,1,])
isben2 = bl.fit(x)
print(f"isben2 {isben2}")
print(f"P_significant: {isben2['P_significant']}")
if not isben2['P_significant']:
    print("sorry, nope.")
[benfordslaw] >Analyzing digit position: [1]
[benfordslaw] >[chi2] Anomaly detected! P=5.47798e-80, Tstat=393.145
isben2 {'P': 5.4779835775992096e-80, 't': 393.14541596537117, 'percentage_emp': array([[ 1. , 11.72962227],
[ 2. , 11.03379722],
[ 3. , 11.03379722],
[ 4. , 11.03379722],
[ 5. , 11.03379722],
[ 6. , 11.03379722],
[ 7. , 11.03379722],
[ 8. , 11.03379722],
[ 9. , 11.03379722]])}
P_significant: True
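The numbers in this output can be checked by hand. The following is a minimal sketch, not the library's code: it assumes the value 0 is dropped (the reported percentages imply 1006 observations, not 1007) and that expected counts were rounded to whole numbers, which is what makes the statistic come out at the reported Tstat of about 393.145.

```python
import math
from collections import Counter

# Same data as above: 0..1000 plus six extra 1s, with 0 left out
# (assumption: 0 has no leading digit in 1..9, so it is not counted).
values = list(range(1, 1001)) + [1] * 6

# Observed counts of the leading digit 1..9.
observed = Counter(int(str(v)[0]) for v in values)
n = sum(observed.values())

# Expected Benford counts, rounded to whole numbers (assumption, see lead-in).
expected = {d: round(n * math.log10(1 + 1 / d)) for d in range(1, 10)}

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
tstat = sum((observed[d] - expected[d]) ** 2 / expected[d] for d in range(1, 10))

# Empirical percentages, comparable to the 'percentage_emp' column above.
percentages = {d: 100 * observed[d] / n for d in range(1, 10)}

print(f"n={n}, Tstat={tstat:.3f}")
print(f"digit 1: {percentages[1]:.4f}%, digit 2: {percentages[2]:.4f}%")
```

Digit 1 appears 118 times and every other digit 111 times, which is far too flat for Benford's law and explains the very small P value.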
import benfordslaw
...: print(benfordslaw.__version__)
1.0.2
I pip3 install'ed from PyPI 2 days ago. So can you update PyPI?
update with:
pip install -U benfordslaw
Did the update:
In [7]: import benfordslaw
...: print(benfordslaw.__version__)
1.0.3
But I still get the same original error:
ValueError: For each axis slice, the sum of the observed frequencies must agree with the sum of the expected frequencies to a relative tolerance of 1e-08, but the percent differences are: 0.0009950248756218905
Well, this one is tricky, apparently. More information about it can be found here. It is a feature, not a bug.
You can either change your input slightly (remove one of the 1s):
x = np.linspace(0,1000,1001)
x = np.append(x,[1,1,1,1,1])
Or you can use another method:
bl = benfordslaw(alpha=0.05, method='ks')
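The second option can also be automated: try the default test first and only fall back to 'ks' when the fit raises a ValueError. This is a rough sketch; `fit_with_fallback` is a hypothetical helper, not part of benfordslaw, and the two stub functions only stand in for real `fit` methods so the example runs on its own.

```python
def fit_with_fallback(fitters, data):
    """Try each (name, fit) pair in order and return (name, result)
    for the first fit that does not raise ValueError."""
    last_error = None
    for name, fit in fitters:
        try:
            return name, fit(data)
        except ValueError as err:
            last_error = err  # remember why this method failed
    raise ValueError("all fitters failed") from last_error


# Stand-ins for real fit methods (hypothetical, for demonstration only).
def failing_fit(data):
    raise ValueError("sum of observed and expected frequencies differ")


def working_fit(data):
    return {"P": 0.5, "P_significant": False}


used, result = fit_with_fallback(
    [("default", failing_fit), ("ks", working_fit)], [1, 2, 3]
)
print(used, result)

# With the library, the list would be built from the calls shown above:
#   from benfordslaw import benfordslaw
#   fitters = [("default", benfordslaw(alpha=0.05).fit),
#              ("ks", benfordslaw(alpha=0.05, method='ks').fit)]
```

Catching ValueError this broadly can hide unrelated data problems, so it is worth logging the original exception message before falling back.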
I created a new update that better explains what to do in such a case.
pip install -U benfordslaw
I made a small modification: the expected counts are no longer rounded, so the values stay exact. Therefore, it does not throw this error anymore!
update with:
pip install -U benfordslaw
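The effect of rounding the expected counts can be illustrated without the library. This is a minimal sketch under one assumption: that the expected counts were computed as round(n * log10(1 + 1/d)) for each leading digit d, which the thread does not show directly. For the 1006 observations in the example, the rounded counts add up to 1005, which gives exactly the relative difference quoted in the error message, while the unrounded counts add up to n.

```python
import math


def benford_expected(n, rounded):
    """Expected Benford counts for leading digits 1..9 given n observations."""
    counts = [n * math.log10(1 + 1 / d) for d in range(1, 10)]
    return [round(c) for c in counts] if rounded else counts


n = 1006  # 1..1000 plus six extra 1s; 0 is assumed to be dropped

rounded_total = sum(benford_expected(n, rounded=True))
exact_total = sum(benford_expected(n, rounded=False))

# Relative difference between observed and expected totals,
# measured against the smaller of the two sums.
rel_diff = abs(n - rounded_total) / min(n, rounded_total)

print(f"rounded total: {rounded_total}, exact total: {exact_total:.6f}")
print(f"relative difference with rounding: {rel_diff}")

# Removing one of the 1s (n = 1005) happens to make the rounded counts add up.
print(sum(benford_expected(1005, rounded=True)))
```

This also explains why dropping a single 1 made the error go away: whether the rounded counts happen to sum to n depends on n itself, not on how Benford-like the data is.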
I am trying to use this library more or less as either a binary indicator of "Benford or not" or a probability indicator of the same. So it should be possible to send any distribution into it. If the distribution is weird, then say "sorry, nope."
Instead consider:
Instead of a "nope" we get:
Note that even just using x without the extra np.append() results in the same error. So what does this mean? Should I add my own code to catch that exception and then say "nope"? The problem with that is that we don't get any probability, and it is also unclear whether the exception was due to some other unexplained data problem.
FYI, the entire stacktrace is