j3camero / canada-election-forecast

Code that forecasts the results for each electoral district (riding) in the Canadian 2015 federal election
Apache License 2.0

Add per-riding polling data #5

Closed by j3camero 8 years ago

j3camero commented 8 years ago

The per-riding forecasts are most useful when there's no recent polling data available for a specific riding. Closer to the election a lot of this data starts to become available. This can be added to the model to get more accurate forecasts.

j3camero commented 8 years ago

Here's a decent source of riding polls:

https://en.wikipedia.org/wiki/Opinion_polling_in_the_Canadian_federal_election,_2015_by_constituency

j3camero commented 8 years ago

Follow up with Sheila once this is done.

skosch commented 8 years ago

THIS is where you should get your data from, at least for the most crucial ridings. The polls are just a few days old. Much has changed since 2011.

j3camero commented 8 years ago

Thanks for the source! What a coincidence: I was just sitting down with a case of Pepsi to get this one done. Hopefully I'll have an update in a few hours.

j3camero commented 8 years ago

Progress update. https://github.com/j3camero/canada-election-forecast/commit/2c93a7b5c2b91bde440f81953a8003b543852182

Added the ability for the model to load a series of historical regional poll averages, instead of just the current poll averages. This will let the model make sense of riding-specific polls, which have a variety of different dates.
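As a rough illustration of why a dated series helps: to interpret a riding poll, the model needs the regional average that was in effect when that poll was taken. The sketch below is only an assumption about how such a lookup could work, not the code in the linked commit. The function name `regional_average_on` and the `(date, shares)` layout are hypothetical.

```python
import bisect
from datetime import date


def regional_average_on(history, when):
    """Return the latest regional average dated on or before `when`.

    `history` is a list of (date, {party: share}) tuples sorted by date.
    This layout is a hypothetical one chosen for the example.
    """
    dates = [d for d, _ in history]
    # bisect_right gives the index just past the last entry <= `when`.
    i = bisect.bisect_right(dates, when)
    if i == 0:
        raise ValueError("no regional average on or before %s" % when)
    return history[i - 1][1]
```

A riding poll dated between two regional averages would then be compared against the earlier of the two.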

The model doesn't yet deal with riding-specific polls, but this is a first step.

skosch commented 8 years ago

Cool! I probably don't have to tell you this, but you should hardcode these numbers until you have your model ready. Your site is being shared widely on social media right now (that's how I came across it), so it's pretty critical that your numbers are as fresh as possible so you don't end up accidentally making things worse!

(Your recommendation for Sault Ste. Marie, for example, is to vote NDP. Apparently things have turned around and you should now be recommending the Liberal candidate. Cambridge and Saint John/Rothesay are the same story.)

j3camero commented 8 years ago

Holy traffic, Batman, you're right. Any idea why the traffic is about 5x the usual today? I'll do what I can to finish cranking this out today. Agreed that it's shameful to be giving bad advice in even a single riding.

j3camero commented 8 years ago

Update John once change is live.

skosch commented 8 years ago

Not sure what caused your traffic spike, but I'm happy to hear the site is gaining popularity :)

(Unless it's actually the harperbots slowly ramping up a DDoS ... haha)

j3camero commented 8 years ago

More progress https://github.com/j3camero/canada-election-forecast/commit/188a87cb0aa4599cf24fff7e3a2af92f806287ff?diff=unified

This change eliminates the conceptual difference between the 2011 election results and any other kind of poll. They're treated the same now. It's another prerequisite for having a model that can deal with riding-specific polls.

j3camero commented 8 years ago

Progress update https://github.com/j3camero/canada-election-forecast/commit/46729ad536bd3fd81ee70b91d6e6ef9a62d1c95c

The beginning of a model that can handle riding-specific poll data.

j3camero commented 8 years ago

Progress update https://github.com/j3camero/canada-election-forecast/commit/1bd8c7ef616b902859087412d29b61432de03eeb

Sometimes there are multiple sources of polling data for one riding. I've arrived at a reasonable formula for weighting them. The weight of each poll is:

weight = sample_size * (0.25 ^ age_in_years)

Where does 0.25 come from? Not really sure, it just seems like a reasonable constant. I chose it to give a roughly 1:5 ratio between a recent poll with a typical sample size and the 2011 election results. Happily enough, this formula also gives roughly equal weight to recent polls with typical sample sizes and the 2012 by-election results. Certainly this could be tuned further in the future, but this will do for now.
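A minimal Python sketch of the weighting above, for illustration only. `poll_weight` implements the formula as written. `combine_polls` and its dictionary keys are hypothetical, and the weighted average is an assumed way of merging several sources for one riding. The actual code lives in the linked commit.

```python
def poll_weight(sample_size, age_in_years, decay=0.25):
    """weight = sample_size * (decay ^ age_in_years), with decay = 0.25."""
    return sample_size * (decay ** age_in_years)


def combine_polls(polls):
    """Weighted average of party shares across polls for a single riding.

    Each poll is a dict with hypothetical keys 'sample_size',
    'age_in_years' and 'shares' (party -> fraction of the vote).
    A party missing from a poll counts as 0 in that poll.
    """
    totals = {}
    total_weight = 0.0
    for poll in polls:
        w = poll_weight(poll["sample_size"], poll["age_in_years"])
        total_weight += w
        for party, share in poll["shares"].items():
            totals[party] = totals.get(party, 0.0) + w * share
    if total_weight == 0:
        raise ValueError("no polls with non-zero weight")
    return {party: value / total_weight for party, value in totals.items()}
```

Because election results are treated like any other poll, a past election would simply be one more entry in `polls`, with the number of votes cast as its sample size and a large `age_in_years`.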

j3camero commented 8 years ago

Progress update https://github.com/j3camero/canada-election-forecast/commit/5f0d6d0ceb69c53a307e2890773f59ea42336c5f

The riding poll model is now combined with the existing proportional swing model. We're really close now. Just gotta do a bit of validation to make sure the new projections make sense before pushing them live to the site. It's gonna happen today.
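The thread doesn't spell out how the two models are combined, so the following is only one plausible reading, sketched under that assumption: take the riding-level shares (from riding polls or past results) as a baseline, then scale each party by how its regional support has changed since that baseline was measured. The name `project_riding` and all three arguments are hypothetical.

```python
def project_riding(riding_shares, regional_then, regional_now):
    """Proportional swing sketch (assumed, not the repo's actual code).

    Scales each party's riding share by the ratio of its regional support
    now to its regional support when the riding data was collected, then
    renormalizes so the projected shares sum to 1.
    """
    raw = {}
    for party, share in riding_shares.items():
        then = regional_then.get(party, 0.0)
        # With no regional baseline for a party, leave its share unscaled.
        ratio = regional_now.get(party, 0.0) / then if then > 0 else 1.0
        raw[party] = share * ratio
    total = sum(raw.values())
    if total == 0:
        return dict(riding_shares)
    return {party: value / total for party, value in raw.items()}
```

Under this reading, a fresh riding poll barely moves (the regional numbers have hardly changed since it was taken), while older data such as the 2011 results gets swung by the full regional shift.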

j3camero commented 8 years ago

Done. The new model that takes into account riding-specific polling data is live on the website.

I'm sure that this model update will come with new issues. Those can be filed as separate specific bugs.

skosch commented 8 years ago

Wonderful job. Thanks for your efforts! :)

j3camero commented 8 years ago

Thanks for your suggestions and moral support, Sebastian. It's what keeps me hacking at 3 AM when shit needs to get done :-)