cjph8914 / 2020_benfords


Add a Way to Simulate an Election With Given Parameters #39

Open snex opened 3 years ago

snex commented 3 years ago

It would be nice to be able to run simulated elections and then check whether the results of those elections (which we know are fair) conform to Benford's Law, for either the 1st digit or the 2nd digit. I have a quick and dirty start here: https://github.com/snex/election_results_benford/blob/master/sim.rb

charlesmartin14 commented 3 years ago

@snex See the work by Mebane; this is exactly what he does: http://www-personal.umich.edu/~wmebane/pm19.pdf

https://www.youtube.com/watch?v=zkx_eO0PvXU

Please note: data generated from random normal / random choice draws does not follow Benford's Law, so this has to be done with some care.
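A rough illustration of that caveat, as a minimal sketch rather than anything taken from Mebane's work or from sim.rb: draw counts from a normal distribution and compare their first-digit frequencies with the Benford expectation log10(1 + 1/d). The mean of 800, standard deviation of 100, and sample size are made-up values chosen only for the demonstration.

```python
import math
import random
from collections import Counter


def benford_first_digit_probs():
    """Expected first-digit probabilities under Benford's Law: log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit_freqs(values):
    """Observed first-digit frequencies of the positive integers in values."""
    digits = [int(str(v)[0]) for v in values if v > 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical "vote counts": normal around 800 with sd 100 (assumed values).
normal_counts = [max(1, round(rng.gauss(800, 100))) for _ in range(10_000)]

observed = first_digit_freqs(normal_counts)
expected = benford_first_digit_probs()

print("digit  observed  benford")
for d in range(1, 10):
    print(f"{d:>5}  {observed[d]:8.3f}  {expected[d]:7.3f}")
```

With these assumed parameters most of the mass sits on leading digits 7 and 8, which is nowhere near the Benford curve, so a naive normal simulation would fail a first-digit test even though nothing was tampered with.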

snex commented 3 years ago

Where can we find his code? Feel free to issue a pull request on my repo correcting what you believe needs to be corrected.

edit:

So I found his code here: https://github.com/UMeforensics/eforensics_public

I am playing around with it, and it seems like it does not support variable-size precincts ("election units" in his terminology). Even so, I keep getting 2BL (second-digit Benford's Law) chi-squared values between 20 and 40. Biden's numbers for 2 counties in particular, Allegheny PA and DuPage IL, are significantly higher than this.
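For anyone who wants to reproduce that kind of number, here is a minimal sketch of a second-digit (2BL) chi-squared statistic. It uses the standard second-digit Benford probabilities and a plain Pearson chi-squared sum; the function names are my own, and this is not the eforensics_public implementation. The example counts at the bottom are invented.

```python
import math
from collections import Counter


def benford_second_digit_probs():
    """Second-digit Benford probabilities:
    P(d) = sum over first digits k = 1..9 of log10(1 + 1/(10k + d))."""
    return [
        sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))
        for d in range(10)
    ]


def second_digit_chi2(counts):
    """Pearson chi-squared of observed second digits against 2BL.

    Counts below 10 have no second digit and are skipped.
    """
    digits = [int(str(c)[1]) for c in counts if c >= 10]
    n = len(digits)
    if n == 0:
        raise ValueError("no counts with at least two digits")
    observed = Counter(digits)
    probs = benford_second_digit_probs()
    return sum(
        (observed.get(d, 0) - n * probs[d]) ** 2 / (n * probs[d])
        for d in range(10)
    )


# Invented precinct vote counts, only to show the call.
example_counts = [412, 97, 1530, 268, 744, 35, 901, 586, 123, 677]
print(round(second_digit_chi2(example_counts), 2))
```

The statistic has 9 degrees of freedom (ten digit bins minus one), and the 5% critical value for 9 degrees of freedom is about 16.92, which gives some context for values in the 20 to 40 range.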

Correction: From reading the code, it seems that election unit sizes are hard-coded to be a random number between 500 and 1000. This does not seem very realistic to me: the raw datasets show that sizes can vary from as small as 4 to well over 2000. It also seems entirely plausible that precinct sizes are not uniformly random between the min and max, but are instead crafted by politicians to be at least close to a normal distribution around some mean, as my (admittedly primitive) simulation supports.
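To make the two size models concrete, here is a small sketch that generates precinct sizes both ways and runs the same fair election on each. It is not a port of sim.rb or of the eforensics code. The 500 to 1000 range and the minimum of 4 come from the discussion above; the mean of 800, standard deviation of 400, 55% support level, and number of precincts are assumptions picked for illustration.

```python
import random


def uniform_sizes(rng, n, low=500, high=1000):
    """Precinct sizes drawn uniformly between low and high (inclusive)."""
    return [rng.randint(low, high) for _ in range(n)]


def normal_sizes(rng, n, mean=800, sd=400, minimum=4):
    """Precinct sizes drawn from a normal distribution, floored at minimum.

    mean and sd are assumed values, not estimates from real data.
    """
    return [max(minimum, round(rng.gauss(mean, sd))) for _ in range(n)]


def simulate_votes(rng, sizes, support=0.55):
    """Fair election: every voter independently backs candidate A
    with probability support. Returns A's vote count per precinct."""
    return [sum(rng.random() < support for _ in range(size)) for size in sizes]


rng = random.Random(42)  # fixed seed so the run is repeatable
n_precincts = 500

results = {}
for name, make_sizes in (("uniform", uniform_sizes), ("normal", normal_sizes)):
    sizes = make_sizes(rng, n_precincts)
    votes = simulate_votes(rng, sizes)
    results[name] = (sizes, votes)
    share = sum(votes) / sum(sizes)
    print(f"{name:>7}: sizes {min(sizes)}..{max(sizes)}, A share {share:.3f}")
```

The per-precinct vote lists in `results` can then be fed to whatever first-digit or second-digit test you are using, to see how much the choice of size distribution alone moves the statistic.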