cjph8914 / 2020_benfords


Analyze same data with different base values #6

Open zigster64 opened 3 years ago

zigster64 commented 3 years ago

Given that Benford's law is usually stated in base 10, with the expected share of leading digit d being log10(1 + 1/d):

It would be interesting to cross-reference the same input data sets using bases other than 10,

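something like log_X(1 + 1/digit) * N (in NumPy terms, np.log(1 + 1/digit) / np.log(X) * N),

where X ranges over the bases, say 2 to 16, digit runs from 1 to X - 1, and N is the number of data points. The leading digit of each value would also have to be taken from its base-X representation.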
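To make the idea concrete, here is a minimal sketch in plain Python (standard library only). The function names `leading_digit`, `expected_counts`, and `observed_counts` are made up for illustration and are not part of this repo; the sketch assumes the data are positive integers such as vote counts.

```python
import math

def leading_digit(n, base):
    """Most significant digit of a positive integer n written in `base`."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    while n >= base:
        n //= base  # drop the least significant digit
    return n

def expected_counts(base, total):
    """Benford expected counts for leading digits 1..base-1 in `base`."""
    return {d: total * math.log(1 + 1 / d, base) for d in range(1, base)}

def observed_counts(values, base):
    """Count leading digits (in `base`) over the positive values."""
    counts = {d: 0 for d in range(1, base)}
    for v in values:
        if v > 0:
            counts[leading_digit(v, base)] += 1
    return counts

if __name__ == "__main__":
    sample = [12, 140, 1900, 23, 310, 47, 5, 88, 1024, 760]  # placeholder data
    for base in range(3, 17):  # base 2 is skipped: its leading digit is always 1
        obs = observed_counts(sample, base)
        exp = expected_counts(base, len(sample))
        print(base, obs, {d: round(e, 2) for d, e in exp.items()})
```

Note that base 2 carries no information here, since every positive number starts with 1 in binary, so the useful range is probably 3 to 16.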

So if you generated a set of graphs across the same data with different bases and eyeballed the results, that might give more confidence that the data is abnormal, if it's also abnormal in the majority of bases.

Alternatively, it may show that an apparent anomaly is not as bad as it looks. Maybe?

This is super interesting, you got me reading up on stats all over again :) Thanks.

dogweather commented 3 years ago

Just use the second digit instead of the first.
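For reference, a rough sketch of what a second-digit check could look like, using the standard second-digit form of Benford's law in base 10. The helper names `second_digit_probability` and `second_digit` are hypothetical and not from this repo.

```python
import math

def second_digit_probability(d2):
    """Benford probability that the second decimal digit equals d2 (0..9).

    Sums over every possible first digit d1 in 1..9.
    """
    return sum(math.log10(1 + 1 / (10 * d1 + d2)) for d1 in range(1, 10))

def second_digit(n):
    """Second most significant decimal digit of n, or None for one-digit values."""
    s = str(abs(int(n)))
    return int(s[1]) if len(s) > 1 else None

if __name__ == "__main__":
    for d in range(10):
        print(d, round(second_digit_probability(d), 4))
```

The second-digit distribution is much flatter than the first-digit one, so deviations are harder to spot by eye and small samples will be noisy.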

chavenor commented 3 years ago

https://www.fec.gov/introduction-campaign-finance/how-to-research-public-records/combined-federalstate-disclosure-and-election-directory/.