Closed lennybronner closed 1 year ago
@lennybronner thank you so much for all of this! 🎉 🙌🏻
Can you say a bit about what the turnout factor is? 😅
Or actually, sorry, I think I see what you're doing. Even though the end result we're sharing with the world is the predicted margin, we still need to predict turnout in order to predict margin. But not all of our data sets include results_turnout, so you're checking to make sure it exists and if not, sum across the (applicable) results_ columns we do have. Do I have that right-ish? 😅
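For readers following along, the fallback described above might look roughly like the sketch below. This is not the code from the PR: the function name, the dictionary layout, and the column names other than results_turnout (results_dem, results_gop) are made up for illustration, and deciding which results_ columns are "applicable" is left out.

```python
def unit_turnout(row, prefix="results_", turnout_key="results_turnout"):
    """Return turnout for one unit, falling back to a sum of results_ columns.

    Assumption: `row` is a plain dict of column name -> value. The real
    project may well use data frames and different column names.
    """
    if row.get(turnout_key) is not None:
        # The data set already reports total turnout, so use it directly.
        return row[turnout_key]
    # Otherwise add up every column that starts with the results_ prefix.
    return sum(
        value
        for key, value in row.items()
        if key.startswith(prefix) and key != turnout_key
    )


# Hypothetical units: one with a reported total, one without.
with_total = {"unit": "A", "results_dem": 40, "results_gop": 55, "results_turnout": 100}
without_total = {"unit": "B", "results_dem": 30, "results_gop": 20}

print(unit_turnout(with_total))
print(unit_turnout(without_total))
```

When a total is reported it wins, even if it differs from the sum of the per-candidate columns (for example because of write-ins).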
Yeah, we need to predict turnout in order to get the normalization constant for normalized margin, since we need to go back and forth between unnormalized and normalized margin to move from county predictions to state predictions.
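A minimal sketch of that back-and-forth, assuming normalized margin means the vote margin divided by turnout (the thread does not spell out the exact definition, and the function name is hypothetical): each county's normalized margin is multiplied by its turnout to get an unnormalized margin in votes, the votes are summed, and the total is divided by state turnout.

```python
def state_normalized_margin(county_margins, county_turnouts):
    """Aggregate county normalized margins into a state normalized margin.

    Assumption: normalized margin = (votes for A - votes for B) / turnout,
    so turnout is the normalization constant for each county.
    """
    # Normalized -> unnormalized: convert each county margin back to votes.
    unnormalized = [m * t for m, t in zip(county_margins, county_turnouts)]
    # Unnormalized -> normalized at the state level.
    return sum(unnormalized) / sum(county_turnouts)


# Two hypothetical counties: +50 points on 1000 votes, -25 points on 400 votes.
print(state_normalized_margin([0.5, -0.25], [1000, 400]))
```

This is why predicted turnout is needed even though only margin is published: without it, county margins cannot be weighted correctly.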
Turnout factor is basically just the ratio of turnout in this election to turnout in the last election. In the margin model it's part of what we're estimating. But we also drop units whose turnout factor falls outside a fixed range. We're basically assuming that if turnout in some county is less than 20% of its last election's turnout (or greater than 200% of it), then our results provider made a mistake (or we accidentally mismatched precincts), so we drop that county from our model. We can adjust the constants (20%/200%) through parameters in the model, so that in a super low/high turnout election we don't accidentally drop too many units.
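As an illustration only, the filter could be sketched as below. The function and parameter names are hypothetical, the 0.2 and 2.0 defaults come from the 20%/200% figures mentioned in this comment, and treating the bounds as inclusive is an assumption.

```python
def filter_by_turnout_factor(current, previous, lower=0.2, upper=2.0):
    """Keep units whose turnout factor lies within [lower, upper].

    `current` and `previous` map unit id -> turnout. The turnout factor is
    current turnout divided by turnout in the last election.
    """
    kept = {}
    for unit, turnout in current.items():
        baseline = previous.get(unit, 0)
        if baseline <= 0:
            # No usable baseline, so the factor is undefined; drop the unit.
            continue
        factor = turnout / baseline
        if lower <= factor <= upper:
            kept[unit] = factor
    return kept


# Hypothetical counties: A looks normal, B is far too low, C is far too high.
current = {"A": 900, "B": 100, "C": 2500}
previous = {"A": 1000, "B": 1000, "C": 1000}
print(filter_by_turnout_factor(current, previous))
```

Widening `lower` and `upper` is the knob for unusually low or high turnout elections.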
Got it!! That's awesome 🎉
What about dropping units whose turnout factors are outliers against the other units? That way, on the off chance the entire state doesn't vote (or does vote), there's no risk of dropping almost every unit in the state. If you've done some evaluation to come up with these constants, that's fine, and I know for now we're primarily interested in big (top-of-the-) ticket races anyway where this is less likely to occur. Just a thought 🤷🏻♀️ 😄
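One way this suggestion could be sketched is with a median absolute deviation (MAD) rule. This is only one possible outlier test, it is not something the thread settles on, and the function name and cutoff multiplier `k` are arbitrary assumptions.

```python
from statistics import median


def drop_relative_outliers(factors, k=5.0):
    """Keep units whose turnout factor is within k MADs of the median factor.

    `factors` maps unit id -> turnout factor. Units are compared with each
    other rather than with fixed constants.
    """
    center = median(factors.values())
    deviations = {unit: abs(f - center) for unit, f in factors.items()}
    mad = median(deviations.values())
    if mad == 0:
        # At least half the units share the same factor, so the spread
        # cannot be estimated; keep everything rather than guess.
        return dict(factors)
    return {unit: f for unit, f in factors.items() if deviations[unit] <= k * mad}


# Hypothetical statewide low-turnout election: most units sit near 0.15,
# which a fixed 0.2 lower bound would drop, while E stands apart.
factors = {"A": 0.15, "B": 0.16, "C": 0.14, "D": 0.15, "E": 0.90}
print(sorted(drop_relative_outliers(factors)))
```

Here only E is dropped, whereas the fixed 20% threshold would have dropped A through D and kept E.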
That's a really good idea! Though I guess it would necessitate a bit more computation? Do you mind adding a future ticket to implement it?
Sure! Thanks! 😄 🎉 The ticket is here: https://arcpublishing.atlassian.net/browse/ELEX-3298
Description
This PR moves over parts of the changes we are making for the bootstrap election model PR in order to make reviewing that one easier. It makes the changes necessary for the old ConformalElectionModel to work with the updates made to elex-solver in this PR, and it makes small tweaks to the estimandizer to prepare it for multiple estimands being generated at once. It also updates unit tests accordingly.
Jira Ticket
https://arcpublishing.atlassian.net/browse/ELEX-2771
Test Steps
tox
also