cjph8914 / 2020_benfords


Election Fraud Data Check of Dr. Shiva #38

Open chavenor opened 3 years ago

chavenor commented 3 years ago

We have made requests to get the data sets to check the following video. If and when we get the data, I will post it here.

See the video below. https://www.pscp.tv/w/1BdGYYjgkgQGX

MechanicalTim commented 3 years ago

Agree that it makes sense for this to have been a new issue from the start. But I would also suggest that if folks are going to read this issue, they start from the initial posting of this video in this comment from the prior thread.

Something I find implausible (or maybe just don't understand) about their claim: Wouldn't it imply that they would need to manipulate just the right amount of votes in just the right districts, in order to result in this linear pattern that they claim is evidence of fraud? That would be an impressive scheme, indeed.

ghost commented 3 years ago

  1. He spends one hour showing three slides that could just as well be interpreted like this: Suburban Republicans don't like Trump, inner city Democrats like Trump. If these groups don't agree on politics, then why should they agree on Trump? It's a very interesting pattern, but these three slides are no proof of cheating. Note that the selected x-axis range on the Detroit chart differs from that in the other two charts, so that the trend seems different. I won't waste my time on that guy, but if you want to check you could look at other similar counties and see if they share the same pattern or not.

  2. In addition, I looked into this claim from Trump supporters on the Pennsylvania time series:

https://thedonald.win/p/11Q8O0gyAf/huge-evidence-glitches-all-over-/ https://thedonald.win/p/11Q8O0ggvH/election-data-csv-files-from-jso/c/

I created some charts (and didn't save them because I was annoyed to find that it was all baseless claims). Again it's interesting, but probably a nothing burger: The data at any given time is extrapolated from percentage of total votes, and this percentage only uses three digits, so there is a rounding error. Candidates lose a small number of votes in their count at different times, and the claim is that Trump lost more votes than Biden. However, Biden actually lost more votes than Trump, but in the early stages we can see what are probably two typos for each candidate which then are probably corrected back (Biden lost more votes than Trump with these two corrections). If we disregard these probable typos+corrections, then what's left is an interesting "anomaly" towards the end of the time series. Both Biden and Trump get a small number of votes subtracted at several instances. This could, however, simply be due to rounding errors. Trump's vote is adjusted down more times than Biden's in the later stages, but this is probably because Biden overall gets more votes than Trump at that stage (it's probably in the mail vote counting stage). Moreover, the reason why the number of deductions escalates in late counting, is probably that the absolute rounding error becomes larger when more total votes have been counted.
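
The rounding effect described above can be illustrated with a minimal Python sketch. It assumes that the feed publishes a running total plus each candidate's share as a fraction rounded to three decimal places, and that the count is then reconstructed as total times share. The snapshot numbers and the function names are hypothetical and are not taken from the Pennsylvania data.

```python
def estimated_votes(total_votes: int, true_votes: int, digits: int = 3) -> int:
    """Reconstruct a candidate's count from a share rounded to `digits` decimals."""
    share = round(true_votes / total_votes, digits)  # what the feed would publish (assumed)
    return round(total_votes * share)                # what an analyst would back out


def max_rounding_error(total_votes: int, digits: int = 3) -> float:
    """Largest possible gap between the true count and the reconstructed count."""
    return total_votes / (2 * 10 ** digits)


# Hypothetical snapshots: (total votes counted so far, candidate's true count).
snapshots = [(1_000_000, 494_600), (1_002_000, 494_900)]

estimates = [estimated_votes(total, votes) for total, votes in snapshots]
apparent_change = estimates[1] - estimates[0]        # change seen in the reconstructed series
true_change = snapshots[1][1] - snapshots[0][1]      # change in the underlying count

print("reconstructed counts:", estimates)
print("apparent change:", apparent_change, "| true change:", true_change)
print("max rounding error at 1,000,000 votes:", max_rounding_error(1_000_000))
```

In this made-up case the candidate gains 300 votes, yet the reconstructed series drops, because the published share rounds down after an update that mostly went to the other candidate. The possible error scales with the total counted, which fits the observation that the deductions become more frequent late in the count.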

I thus fail to see how any of what I have seen in (1) and (2) is indicative of election fraud. It's possible to look more into both issues, but personally I feel that it's a waste of time. I'm tired of the never-ending flow of fantastic claims that, on scrutiny, turn out to be nothing, every. single. time.

charlesmartin14 commented 3 years ago

@testes-t These claims will never stop unless enough people are willing to demand transparency and check the data

ghost commented 3 years ago

@charlesmartin14 Say that I uploaded the charts I made to the "thedonald.win" thread on the data I spent my precious time analyzing, a thread which is now going through the roof. My sober analysis would probably be downvoted into oblivion because it concludes that it's not evidence of election fraud, which means that only a few people would see my analysis. Social media like this forms hysterical confirmation bubbles in which outrageous claims are perpetually upvoted. These people ended up placing mob behaviour above objective scrutiny, which means that these claims will never stop, full stop.

MechanicalTim commented 3 years ago

One also has to consider the opportunity cost of one's time. These things can be endless diversions from more fruitful work.

charlesmartin14 commented 3 years ago

@testes-t @MechanicalTim I hear you, it is time-consuming. I think the group here has made good progress. This issue isn't going away anytime soon. People may be arguing this for years to come. Eventually, I think we should summarize the findings (in a README, or maybe even Jupyter Book). Good work always speaks for itself.

And hopefully, everyone has learned something that might even be useful.

charlesmartin14 commented 3 years ago

@testes-t @MechanicalTim That said, it might be more fruitful to look for anomalies in places where there are actual allegations of ballot tampering and other fraud, verified in signed affidavits.

MechanicalTim commented 3 years ago

Or, if one is reasonably convinced that we indeed had a fair election, one could ignore these (weak, in my opinion) claims of fraud completely, and instead focus one's attention on any number of other worthy efforts to preserve U.S. democracy, instead of tacitly participating in this doubt-raising exercise.

I realize that not everyone participating in these issues is in that camp, but it is unnecessarily narrow-minded to say, "Instead of fighting against these particular weak claims of voter fraud, you could instead look at these other weak claims of voter fraud."

EDIT: See, for example: this NY Times article about contacting election officials in both parties, and consistent agreement that there have been no irregularities, and Fox News cutting away from such claims by Trump's spokeswoman, and stating that these were baseless claims.

charlesmartin14 commented 3 years ago

@MechanicalTim I meant this in contrast to, say, trying to debunk random claims on the internet.

The current Trump legal claims against the election results are not necessarily based on fraud. The claim in PA, as I currently understand it, is about equal treatment under the law and is based on the precedent(s) set in Bush v. Gore. I think the issue of fairness will most likely be decided by the courts.

Fraud is a different issue.

So is transparency.

I personally don't care what some NYT article says, or some third-rate academic, or what some nut on 4chan or Reddit thinks. I have my own interests in this, as I am sure many others do.

To that end, IMHO, it would be helpful to collect and have access to all the voting data in one place so people can ask and answer these questions themselves, and speed things up considerably.

MechanicalTim commented 3 years ago

@chavenor wrote:

@MechanicalTim I do not believe that is what they are saying. I took "straight ticket" as assuming that all Republicans vote for Trump, so as a precinct gets more Republican you would expect the number to be at 0%, not down at -25%.

Ah, I see what you mean, and agree with your interpretation. But it turns out that this only reinforces the point. Shiva sets the expectation that the straight horizontal line (in his plot) is the "norm", and any deviation from that is suspicious. But I don't think that is the case.

For example, if you find a county where, say, 70% of Republicans vote for Trump, and 10% of Democrats do so, then as the precinct Republican vote share does this:

repub_prec_fraction = [20; 30; 40; 50; 60; 70; 80];

then the Trump vote share looks like this:

trump_share = [22; 28; 34; 40; 46; 52; 58]

and my plot remains nearly unchanged.

This strikes me as quite plausible, and it would be relatively easy to find a single county that behaved like this. Ironically, the less popular Trump is in a county, the more you'll see this purported "deviation".
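
For readers who want to reproduce the numbers above, here is a minimal Python sketch. It assumes every voter in a precinct is either Republican or Democrat and that the 70% and 10% rates hold in every precinct; the function and variable names are illustrative only.

```python
def trump_share(repub_pct: float, rep_rate: float = 0.70, dem_rate: float = 0.10) -> float:
    """Trump vote share (in %) for a precinct with `repub_pct` percent Republicans."""
    return rep_rate * repub_pct + dem_rate * (100 - repub_pct)


repub_prec_pct = [20, 30, 40, 50, 60, 70, 80]

# Rounded only to hide floating-point noise in the printed values.
shares = [round(trump_share(r), 6) for r in repub_prec_pct]

# The quantity on the y-axis of the plot under discussion: Trump % minus Republican %.
deviation = [round(s - r, 6) for s, r in zip(shares, repub_prec_pct)]

for r, s, d in zip(repub_prec_pct, shares, deviation):
    print(f"Republican {r:>3}%  Trump {s:>5.1f}%  Trump - Republican {d:>+6.1f}")
```

Under these assumptions the difference falls steadily as the Republican percentage rises, with no manipulation involved.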

frycast commented 3 years ago

Basically, by putting Trump% - Republican% on the y-axis, and Republican% on the x-axis, the negative linear slope is just telling us that Trump% is on average a constant multiple of the Republican%, where the constant is something less than 1.

It's like, trick number 1 from the book on how to make a positive slope look negative.

chavenor commented 3 years ago

@testes-t I would say that there is an absolute abnormality in that graph. I'll wait for the data to duplicate the results. On your #2 issue: yes, it's frustrating, but that is no reason to stop looking for the truth. If you are burning out, sit back, get a cup of tea and recenter; we will press on. We're happy to have you back anytime. :)

@frycast and @MechanicalTim I read both your posts. I'm not yet past the question of whether the data is real, and I do not like going down hypotheticals, but here we go...

Here is how I view the data.
A segmented population of Republicans.
If they vote Republican the data point should be at 0%.
Above 0% means more people voted outside of the Republicans label in those precincts.
Regardless of the density of % of Republicans the graph should hover around 0% slightly above or slightly below. I'll explain it.

The presenter shows the same straight-line convergence in a race of his own, where he says he saw the same odd happenings when the votes were not hand-counted.

If we simply look at the enthusiasm charts from 2016 and note that in 2020 there was equal or more enthusiasm for Trump (links below), there is no possible way we would see a straight-line reduction, especially when we saw a positive position in the less Republican areas. The last part is key: Republicans outperformed in areas where there was a lower % of Republicans.

A vote swap as described in the video actually explains what we came across earlier with the D turnout and the overwhelming support for B/H. Let's put this into perspective: there were more votes cast for Joe Biden than Barack Obama while there was an overall lower Democratic turnout and massive Republican turnout; this massive turnout was less likely to vote for their candidate in traditional Republican strongholds but still registered off-the-chart enthusiasm for Trump. An election for the ages indeed.

See 2016 here. I couldn't find anything for 2020. https://www.pewresearch.org/politics/2018/08/09/for-most-trump-voters-very-warm-feelings-for-him-endured/2-2-2/ https://www.pewresearch.org/politics/2018/08/09/for-most-trump-voters-very-warm-feelings-for-him-endured/4-21/

MechanicalTim commented 3 years ago

@chavenor wrote:

Regardless of the density of % of Republicans the graph should hover around 0% slightly above or slightly below. I'll explain it.

This statement is demonstrably false. Even if Trump takes 90% of Rep votes, and 10% of Dem votes (on average), the Shiva plot will slope downward (on average). Here is the formula for the Trump share, as a function of the precinct's Rep fraction:

trump_share = 0.9 * repub_precinct_percentage + 0.1 * (100 - repub_precinct_percentage);

Here is the resulting graph:

[Image: "Shiva nonsense" plot]
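
The slope and zero crossing implied by that formula can be worked out in closed form. Subtracting the Republican percentage r from the Trump share gives y = (rep_rate - dem_rate - 1) * r + 100 * dem_rate. The Python sketch below evaluates this for the 90%/10% case; it assumes a two-party precinct with the same rates everywhere, and `shiva_line` is a hypothetical helper name.

```python
def shiva_line(rep_rate: float, dem_rate: float) -> tuple[float, float]:
    """Return (slope, intercept) of y = trump_share - repub_pct, in percentage points."""
    slope = rep_rate - dem_rate - 1.0   # negative unless Republicans and Democrats are unanimous
    intercept = 100.0 * dem_rate        # value of y in a precinct with 0% Republicans
    return slope, intercept


slope, intercept = shiva_line(rep_rate=0.9, dem_rate=0.1)
zero_crossing = -intercept / slope      # Republican % at which y changes sign

print(f"slope = {slope:.2f} points of y per point of Republican share")
print(f"y is positive below {zero_crossing:.0f}% Republican and negative above it")
```

With these rates the line sits above 0% in precincts that are less than half Republican and below 0% in the rest, so a curve that hovers around 0% everywhere would be the exception, not the norm.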

chavenor commented 3 years ago

@chavenor wrote:

Regardless of the density of % of Republicans the graph should hover around 0% slightly above or slightly below. I'll explain it.

This statement is demonstrably false. Even if Trump takes 90% of Rep votes, and 10% of Dem votes (on average), the Shiva plot will slope downward (on average). Here is the formula for the Trump share, as a function of the precinct's Rep fraction:

trump_share = 0.9 * repub_precinct_percentage + 0.1 * (100 - repub_precinct_percentage);

Here is the resulting graph:

[Image: "Shiva nonsense" plot]

@MechanicalTim you are correct with your information regarding a full look at all votes. This is not what is being pointed out in the video. They are only looking at specifically Republican votes, not the whole lot. Below is what I'm taking away from the presentation.

Per the presentation, we do not see the linear dip start until we have precincts with 20% Republican support, and that is where the votes look to be getting flipped. Again, we do not know if this is true; in my opinion this makes no sense. I just noticed that I didn't spell "straight" correctly - sorry about this.

[Image: table]

[Image: graph]

RexRookie commented 3 years ago

To plot the graphs, the calculation at 20:30 of the video is given as follows (with RSP = Republican Straight Party votes, TICV = Trump Individual Candidate votes): X = RSP, Y = TICV - RSP = TICV - X. TICV and RSP should be correlated on average, so TICV = a × RSP + b on average, with some random deviations. Then Y = a × RSP + b - RSP = (a-1) × RSP + b = (a-1) × X + b. Therefore the slope of Y vs X is a-1: a straight line going down any time a < 1, just as shown at 45:47, and just as one would normally expect. Differences arise because of deviations in each precinct, but one should get a plot similar to those shown in the video. This plot arises with absolutely no algorithm or falsification involved, just simple algebra. The correct numbers of votes are given exactly as they should be by AP News at 30:52 as (TICV × n1 + RSP × n2), with n1 and n2 the number of votes in each category in the precinct. Here is an artificial example where RSP = 60% on average, and the correlation between RSP and TICV is 75% over all precincts.

[Image: Rplot]
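
A comparable artificial example can be generated with the Python sketch below. It is not the script behind the plot above. It assumes that RSP and TICV are jointly normal with the same mean (60%) and the same spread (10 points, a made-up value), with correlation 0.75, and that values are not clipped to the 0-100 range. Under those assumptions a = 0.75, so the fitted slope of Y vs X should come out near a - 1 = -0.25.

```python
import random


def simulate_precincts(n=2000, mean=60.0, sd=10.0, rho=0.75, seed=0):
    """Draw (X, Y) = (RSP, TICV - RSP) for n synthetic precincts."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        rsp = rng.gauss(mean, sd)
        # TICV shares correlation rho with RSP and has the same mean and spread.
        noise = sd * (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        ticv = mean + rho * (rsp - mean) + noise
        xs.append(rsp)
        ys.append(ticv - rsp)
    return xs, ys


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


xs, ys = simulate_precincts()
slope = ols_slope(xs, ys)
print(f"fitted slope of Y vs X: {slope:.3f} (algebra predicts about -0.25)")
```

The downward line appears in purely random, unmanipulated data; only the scatter around it depends on the noise.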

MechanicalTim commented 3 years ago

@chavenor wrote

They are only looking at specifically Republican votes

At 21:05 in the video, Shiva states that they are plotting all Trump votes: Rep, Dem, and Independent.

chavenor commented 3 years ago

@MechanicalTim yes that is correct but they are comparing that to the expected party-line vote based on the % of republicans in that district. That is why in the precincts with less than 20% it is odd to see Trump doing better than 0% meaning he was more favorable there with non-R voters.

MechanicalTim commented 3 years ago

Why is that odd? Suppose there is a precinct with 1000 voters, and very few Republicans, say 100 of them. (That's 10% on the x-axis.) Using my assumption above that 90% of those voters cast their ballot for Trump, that's 90 Trump votes from Republicans.

Out of the 900 remaining (mostly Dem, but of course some independent) voters, if more than 10 vote for Trump, his total passes 100 votes (10% of the precinct) and the Shiva plot tips to "better than 0%". Every 10 additional voters add another percentage point, so 50 more puts you at +5%. Again, this seems perfectly plausible to me.
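
The arithmetic for this hypothetical precinct can be checked with a short Python sketch. It assumes the x-axis is the Republican share of voters and the y-axis is Trump's share of all votes minus that Republican share; `shiva_y` is an illustrative name, not something from the video.

```python
def shiva_y(total_voters: int, republicans: int,
            rep_trump_votes: int, other_trump_votes: int) -> float:
    """Trump % of all votes minus Republican % of voters, in percentage points."""
    rep_pct = 100 * republicans / total_voters
    trump_pct = 100 * (rep_trump_votes + other_trump_votes) / total_voters
    return trump_pct - rep_pct


# 1000 voters, 100 Republicans, 90 of whom vote Trump (the 90% assumption above).
for other in (0, 10, 60, 90):
    y = shiva_y(1000, 100, 90, other)
    print(f"{other:>3} Trump votes from the other 900 voters -> y = {y:+.1f}")
```

Under these assumptions the break-even point is 10 Trump votes from the 900 non-Republicans, and 90 such votes (10% of them) already puts the precinct at +8.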

The "perfect storm" for this would be counties/precincts with a high percentage of independent voters and low percentage of Republicans. If roughly half the independents vote Trump, then their Trump votes could easily outnumber Republican Trump votes.

When you can cherry-pick just one or two counties to make your plot, this all seems to be a very straightforward explanation.

chavenor commented 3 years ago

@MechanicalTim anything above 0% is interesting because it means Trump is carrying more votes outside of his party, and that advantage completely dries up as you move into larger Republican strongholds.

We need more data and I've made a request for all the vote data for all precincts for the USA. I also reached out to the GOP and DNC to get voter registrations in the same format.

I will return with the data and the crunching. I will be stepping back from the discussion as I do not think anything more can come from it.

Back soon.

homage-admin commented 3 years ago

@testes-t I agree this Benford stuff is bunk. Check this out: https://github.com/stunnashades/ga-discrepancies/blob/main/report.md . Not a smoking gun, but interesting

chavenor commented 3 years ago

@testes-t I agree this Benford stuff is bunk. Check this out: https://github.com/stunnashades/ga-discrepancies/blob/main/report.md . Not a smoking gun, but interesting

Dammit, you guys have to pull me back in. @stunnashades do I have this right?

Everything below can't be true. I'm super curious to see how Biden found his path to victory.

Ok I'm going to sip tea back in time.

ghost commented 3 years ago

@testes-t I agree this Benford stuff is bunk. Check this out: https://github.com/stunnashades/ga-discrepancies/blob/main/report.md . Not a smoking gun, but interesting

In the charts at the bottom we can see what I have previously pointed out for Allegheny and Chicago: There are two clusters. Questions:

  1. Why are there two clusters?
  2. One of the clusters is strangely consistent at about 93-95% Biden vote regardless of turnout. Why?
  3. Why isn't there a single data point at 97-98% for 2020 when the cluster at 93-95% is just next to it?

snex commented 3 years ago

Can people please stop abusing the Issues section of a github project to try to discuss things that are not related to that project? If people are posting false information on a website, go debate with them on that website or open issues on the github hosting that code. This project has nothing to do with that.

homage-admin commented 3 years ago

@stunnashades do I have this right?

  • Previous 2016 low Trump energy precincts show favoritism to Trump in early voting.
  • While higher 2016 support for Trump early voting dropped by 10%-30%.

Way I'd say it: Trump 2016 had a lot of counties with both high turnout and favoritism. In 2020, those were gone.

Now there's two possibilities. Maybe they had big-orange energy in 2016, and got let down. Or maybe they painted a big fat target on themselves, and there was some f***ky-f***ky this year.

We still don't know - but we might know where to look.

chavenor commented 3 years ago

@frycast and @MechanicalTim I got clarity on the inputs. Your observation that the x-axis is "Straight Vote Republican", not "% of Republicans" in the district as I had thought, was correct. The alleged anomaly isn't one. You are correct.