cjph8914 / 2020_benfords

Breakdown by vote type #32

Open snex opened 3 years ago

snex commented 3 years ago

I am doing a similar analysis of my own, and I've noticed that if you break the results down by vote type, by far the most non-conforming dataset is the Joe Biden ELECTION DAY results, rather than the absentee or mail-in results. Please add a breakdown by vote type in addition to the one by total. I have uploaded my datasets and code here: https://github.com/snex/election_results_benford

ghost commented 3 years ago

Any diagrams? Are you referring to first-digit or second-digit Benford?

snex commented 3 years ago

All of my analysis has been text-based. I looked at both first digit and second digit.

charlesmartin14 commented 3 years ago

Interesting

Please see my comments in issue #31 on this topic.

ghost commented 3 years ago

I looked into it. At first glance, election day results in Allegheny do not seem to deviate from what one would expect to see:

[two images attached]

ghost commented 3 years ago

Absentee votes look slightly more off in the second digit (someone looked into it here: https://github.com/cjph8914/2020_benfords/issues/27; I get a lower number of zeros than the other guy), whereas the last digit seems OK:

[two images attached]

snex commented 3 years ago

You won't be able to spot problems in the 2nd digit from visualizations alone, because the 2nd-digit Benford distribution is much closer to uniform than the 1st-digit one. You need to run a conformity test. I get a chi-squared value of 72.354 for Biden's Election Day votes. That's massive. And no, it's not due to abysmal turnout: he received 1322 Election Day votes, compared to Trump's 1237. The chi-squared values for the absentees and the totals come in at 18.143 and 16.555, respectively.
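A minimal sketch of such a conformity test in Python, for readers who want to reproduce the idea. This is not the Ruby code from the linked repository: the function names and the sample counts are hypothetical, and the input is assumed to be a plain list of per-precinct vote counts. The expected second-digit frequencies come from the standard Benford formula, summing log10(1 + 1/(10k + d)) over leading digits k = 1..9. With ten second-digit classes the statistic has 9 degrees of freedom, for which the 5% critical value is roughly 16.92.

```python
import math
from collections import Counter


def second_digit_probs():
    """Benford probability of each second digit d = 0..9."""
    return [
        sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))
        for d in range(10)
    ]


def second_digit(n):
    """Second significant digit of a positive integer, or None if n < 10."""
    s = str(int(n))
    return int(s[1]) if len(s) >= 2 else None


def chi_squared_second_digit(counts):
    """Pearson chi-squared statistic of second digits against Benford."""
    digits = [second_digit(c) for c in counts]
    digits = [d for d in digits if d is not None]  # counts below 10 have no 2nd digit
    n = len(digits)
    if n == 0:
        raise ValueError("no counts with at least two digits")
    observed = Counter(digits)
    return sum(
        (observed.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in enumerate(second_digit_probs())
    )


# Hypothetical per-precinct counts, for illustration only (not real election data).
sample_counts = [132, 245, 87, 410, 19, 305, 7, 221, 160, 98]
print(round(chi_squared_second_digit(sample_counts), 3))
```

The statistic is only meaningful when there are enough precincts for the expected count in each digit class to be reasonably large; a handful of rows like the sample above is far too few.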

ghost commented 3 years ago

Could you post some charts? I don't know how to handle the files on your site.

snex commented 3 years ago

.rb files are Ruby script files. If you have Ruby installed, all you do is run them and they'll spit out a data structure showing each candidate, vote type, and digit counts for them all, along with the chi-squared test stats. You can get Ruby at https://www.ruby-lang.org/en/

Making charts in Ruby isn't as nice as in Python or R, but once you have the data you can shove it anywhere.
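For readers without Ruby, a rough Python sketch of the kind of per-candidate, per-vote-type digit tally described above might look like the following. It is not a port of the scripts in the linked repository: the row layout `(candidate, vote_type, votes)`, the function name, and the sample rows are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def digit_counts_by_group(rows):
    """Tally first and second digits per (candidate, vote_type).

    rows: iterable of (candidate, vote_type, votes), one row per precinct.
    """
    result = defaultdict(lambda: {"first": Counter(), "second": Counter()})
    for candidate, vote_type, votes in rows:
        votes = int(votes)
        if votes <= 0:
            continue  # zero counts have no leading digit
        s = str(votes)
        group = result[(candidate, vote_type)]
        group["first"][int(s[0])] += 1
        if len(s) >= 2:
            group["second"][int(s[1])] += 1
    return dict(result)


# Hypothetical rows, for illustration only (not real election data).
SAMPLE_ROWS = [
    ("A", "election_day", 132),
    ("A", "election_day", 45),
    ("A", "absentee", 7),
    ("B", "election_day", 210),
    ("B", "absentee", 0),
]

tallies = digit_counts_by_group(SAMPLE_ROWS)
for key, counts in sorted(tallies.items()):
    print(key, dict(counts["first"]), dict(counts["second"]))
```

Each group's digit counts can then be fed to a chi-squared test or to any plotting library.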