TheEconomist / us-potus-model

Code for a dynamic multilevel Bayesian model to predict US presidential elections. Written in R and Stan.
https://projects.economist.com/us-2020-forecast/president
MIT License

The website appears to calculate senate ties incorrectly - and therefore also the total win probability #20

Closed niclasmattsson closed 4 years ago

niclasmattsson commented 4 years ago

At the time of this writing, the website reports that Biden has a 91% chance to win the presidency and Democrats a 76% chance to win the Senate. Scrolling down and hovering over the histogram of Senate simulations shows the Democrats with a 74% chance of winning at least 51 seats. When hovering over the gray bar just to the right, the text is "There is a 10% chance the senate will be split evenly, with an 18% chance a Democrat breaks the tie."

I assume that last phrase refers to the vice president breaking ties in the Senate. But then, given Biden's lead, surely it should be the Democrats who have the advantage in Senate ties? My guess is that someone inverted that probability and the correct number should be 82% in favor of the Democrats, and therefore the Democrats should have a 74% + 10%*82% ≈ 82% chance to win the Senate (instead of the reported 76%).

Or am I misunderstanding something?
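
The arithmetic in the comment above can be checked with a short sketch. The function name `senate_win_probability` is hypothetical and only for illustration; the inputs are the percentages quoted from the website (74% outright, 10% tie, 18% tie-break) and the 82% figure the commenter suspects. It assumes the total is simply the outright chance plus the tie chance times the tie-break chance.

```python
def senate_win_probability(p_outright, p_tie, p_tiebreak):
    """Chance of controlling the Senate: win outright, or tie and hold the tie-break."""
    return p_outright + p_tie * p_tiebreak

# Figures quoted from the website in the comment above.
reported = senate_win_probability(0.74, 0.10, 0.18)
# Same calculation with the tie-break probability the commenter suspects (1 - 0.18).
suspected = senate_win_probability(0.74, 0.10, 0.82)

print(f"with 18% tie-break: {reported:.1%}")   # 75.8%, consistent with the reported 76%
print(f"with 82% tie-break: {suspected:.1%}")  # 82.2%, the commenter's ≈ 82%
```

Under that assumption, the 76% headline figure is consistent with the site's own 18% tie-break number, so the question reduces to whether 18% is itself right.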

niclasmattsson commented 4 years ago

I'd really like at least a comment on this since the suspected error is still live on economist.com, so paging @elliottmorris

elliottmorris commented 4 years ago

Hi @niclasmattsson. My colleague @cooberp is responsible for the Senate model so I am tagging him here.

The issue is not with a typo or inversion, but with how we have the model set up to handle ties. The reason for the low p(Dems break tie) is two-fold.

First, we find that the scenarios in which Democrats win only a tie in the Senate represent a systematic underperformance relative to their current forecast. That means that the chance of Republicans winning the presidency is also higher under those circumstances. Since people tend to vote for the Senate and presidency together, losing ground in one is similar to losing ground in the other.
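
A toy Monte Carlo simulation can illustrate this first point. It is only a sketch under made-up assumptions and is not The Economist's model: every number below (the shared national `swing`, the average margin of 6 points, the average of 51.5 seats, the noise levels) is hypothetical, chosen so that the Democrat wins the presidency in roughly nine of ten simulations.

```python
import random

def simulate(n_sims=100_000, seed=2020):
    """Return (P(Dem wins presidency), P(Dem wins presidency | Senate tied 50-50))."""
    rng = random.Random(seed)
    pres_wins = 0
    ties = 0
    pres_wins_given_tie = 0
    for _ in range(n_sims):
        # One national swing moves both races in the same direction (assumption).
        swing = rng.gauss(0.0, 1.0)
        # Hypothetical presidential margin in points: shared swing plus race-specific noise.
        pres_margin = 6.0 + 4.0 * swing + rng.gauss(0.0, 2.0)
        # Hypothetical Democratic seat count, driven by the same swing.
        dem_seats = round(51.5 + 1.5 * swing + rng.gauss(0.0, 1.0))
        dem_wins_presidency = pres_margin > 0
        pres_wins += dem_wins_presidency
        if dem_seats == 50:
            ties += 1
            pres_wins_given_tie += dem_wins_presidency
    return pres_wins / n_sims, pres_wins_given_tie / ties

p_unconditional, p_given_tie = simulate()
print(f"P(Dem presidency)              = {p_unconditional:.3f}")
print(f"P(Dem presidency | 50-50 tie)  = {p_given_tie:.3f}")
```

Because a 50-50 Senate sits below the assumed average seat count, the simulations that produce a tie tend to be those with an unfavorable swing, so the conditional probability comes out lower than the unconditional one. How much lower depends entirely on the assumed correlation.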

Second, there is a bug in how we've set up the actual calculations. Our model uses the implied House popular vote in certain Senate outcomes to calculate a hypothetical presidential vote. The problem is that the historical relationship between House and presidential vote is weak, so the model we've trained for this punishes Democrats in the presidential race by hedging toward 50-50.
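
The hedging effect described in this second point is what a least-squares fit does when the relationship between two variables is weak. The sketch below uses synthetic data, not the actual House or presidential results, and the true slope of 0.2, the noise levels, and the names `house` and `pres` are all assumptions made for illustration.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

rng = random.Random(0)
# Synthetic Democratic House vote shares (percent), centered on 50.
house = [rng.gauss(50.0, 3.0) for _ in range(2000)]
# Synthetic presidential shares that track the House share only weakly (slope 0.2 by assumption).
pres = [50.0 + 0.2 * (h - 50.0) + rng.gauss(0.0, 3.0) for h in house]

slope, intercept = fit_line(house, pres)
house_share = 54.0
predicted = intercept + slope * house_share
print(f"fitted slope: {slope:.2f}")
print(f"House share {house_share:.0f}% -> predicted presidential share {predicted:.1f}%")
```

With a fitted slope well below 1, a strong House result maps to a presidential share much closer to 50%, which is the kind of pull toward 50-50 the comment describes.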

@cooberp should be fixing this issue in the coming days. I am closing the issue either way as this is our repository for the presidential election model, not the Senate or House models.

Elliott

niclasmattsson commented 4 years ago

Thanks @elliottmorris for a more than satisfactory explanation. Please consider open-sourcing the Senate and House models as well. This in itself would make them much more believable than the 538 models - and if they're as mathematically sound as your presidential model then they're likely technically superior as well. (Granted this is more a suspicion than a fact since the 538 team won't release their model code - but by the descriptions their models do seem more ad hoc than yours.)

cooberp commented 4 years ago

I’m the author of The Economist’s Congressional models. Honestly, I just haven’t fully thought through whether or how to make them open source. They’re more than 11,000 lines of R code (with a bit of C++), mostly but not fully commented, and with terrible indenting and 100-character variable names. And they make use of some precious, tediously hand-compiled datasets. I guess I’m just nervous about competitors free-riding on my hard work—but of course, I never could have built these models without the hard work of others (including competitors) who have made their data public. As soon as I fix the Senate tiebreak issue you mention and a few others and expand our plain-English methodology writeup, I’ll turn to working on the repository.