kuriwaki / cvr_harvard-mit_scripts

Missing candidates in precinct returns #328

Closed kuriwaki closed 2 months ago

kuriwaki commented 2 months ago

From our improved classification in #319, I found 14 counties where the issue is that there is no corresponding R/D/L entry in the Baltz et al. data. I took a quick look, and all of them look like counties that would be releasable if we just fixed that error in the precinct returns data.

Here are the candidates and the code to reproduce this:

| state | county_name | color2_h | office | district | party_detailed | candidate_c | votes_c |
|---|---|---|---|---|---|---|---|
| FLORIDA | DUVAL | no entry in Baltz | STATE HOUSE | 014 | DEMOCRAT | ANGIE NIXON | 62151 |
| GEORGIA | DOUGLAS | 0 difference | US HOUSE | 013 | LIBERTARIAN | MARTIN COWEN | 21 |
| IDAHO | BONNER | not collected | STATE HOUSE | 01A | DEMOCRAT | GAIL BOLIN | 7860 |
| IDAHO | BONNER | not collected | STATE HOUSE | 01A | REPUBLICAN | HEATHER SCOTT | 14911 |
| IDAHO | BONNER | not collected | STATE HOUSE | 01B | DEMOCRAT | STEPHEN F HOWLETT | 6859 |
| IDAHO | BONNER | not collected | STATE HOUSE | 01B | REPUBLICAN | SAGE G DIXON | 15422 |
| IDAHO | BONNER | not collected | STATE HOUSE | 07A | REPUBLICAN | PRICILLA GIDDINGS | 2876 |
| IDAHO | BONNER | not collected | STATE HOUSE | 07B | REPUBLICAN | CHARLIE SHEPHERD | 2801 |
| NEVADA | LINCOLN | no entry in Baltz | STATE HOUSE | 036 | REPUBLICAN | GREGORY T HAFEN II | 701 |
| NEVADA | NYE | no entry in Baltz | STATE HOUSE | 036 | REPUBLICAN | GREGORY T HAFEN II | 19852 |
| NEW JERSEY | CAMDEN | red | US HOUSE | 001 | DEMOCRAT | PROGRESSIVE | 1 |
| NEW JERSEY | ESSEX | any < 1% mismatch | US PRESIDENT | FEDERAL | REPUBLICAN | KAREN BASS | 1 |
| NEW JERSEY | ESSEX | any < 1% mismatch | US HOUSE | 010 | DEMOCRAT | DONALD PAYNE JR | 1 |
| NEW JERSEY | HUDSON | any < 1% mismatch | US HOUSE | 009 | LIBERTARIAN | JOHN MIRRIONE | 1 |
| OHIO | HANCOCK | red | STATE HOUSE | 083 | DEMOCRAT | MARY E HARSHFIELD | 1317 |
| TENNESSEE | LOUDON | not collected | US PRESIDENT | FEDERAL | LIBERTARIAN | JO JORGENSEN | 270 |
| TENNESSEE | PICKETT | not collected | US PRESIDENT | FEDERAL | LIBERTARIAN | JO JORGENSEN | 11 |
| TENNESSEE | SEVIER | not collected | US PRESIDENT | FEDERAL | LIBERTARIAN | JO JORGENSEN | 402 |
| TENNESSEE | WILLIAMSON | any < 1% mismatch | US PRESIDENT | FEDERAL | LIBERTARIAN | JO JORGENSEN | 1494 |

library(tidyverse)
library(readxl)

# counties to check
county_check <- read_excel("combined/compare.xlsx", sheet = 2) |> 
  filter(color2_c == "no entry in Baltz") |> 
  select(state, county_name, matches("color2"))

# which candidates are the problem
read_excel("combined/compare.xlsx", sheet = "by-cand-coalesced") |> 
  inner_join(county_check, by = c("state", "county_name")) |> 
  filter(party_detailed %in% c("DEMOCRAT", "REPUBLICAN", "LIBERTARIAN"), 
         state != "DISTRICT OF COLUMBIA") |> 
  filter(is.na(votes_v)) |> 
  select(state, county_name, 
         color2_h, 
         office:party_detailed,
         candidate_c, votes_c, votes_v)

If we can release these, we'd get two new Republican states: TN and ID.

mreece13 commented 2 months ago

I resolved the write-in candidates in NJ and OH. Pending the next build.

kuriwaki commented 2 months ago

"I resolved the write-in candidates in NJ and OH. Pending the next build."

These don't seem to be reflected yet in the medsl data on Dropbox, but if it's on its way, no worries.

The NJ case is annoying to deal with. In this example, we basically want to turn the last two rows into WRITEINs. Note that the second row is missing his middle initial.

> open_dataset("medsl") |> filter(state == "NEW JERSEY", county_name == "ESSEX", office == "US HOUSE", district == "010") |> count(office, candidate, party_detailed) |> collect() |> filter(str_detect(candidate, "PAYNE"))
# A tibble: 3 × 4
  office   candidate         party_detailed      n
  <chr>    <chr>             <chr>           <int>
1 US HOUSE DONALD M PAYNE JR DEMOCRAT       146100
2 US HOUSE DONALD PAYNE JR   DEMOCRAT            1
3 US HOUSE DONALD PAYNE JR   WRITEIN             1
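
For reference, here is a minimal sketch of the recode described above, run on a toy copy of the three rows rather than on the real medsl data. The rule itself (treat the name variant without the middle initial as a write-in) and the object names `payne` and `fixed` are assumptions for illustration, not the fix that was actually applied.

```r
library(dplyr)

# Toy copy of the three rows printed above (not read from the medsl dataset)
payne <- tibble::tibble(
  office         = "US HOUSE",
  candidate      = c("DONALD M PAYNE JR", "DONALD PAYNE JR", "DONALD PAYNE JR"),
  party_detailed = c("DEMOCRAT", "DEMOCRAT", "WRITEIN"),
  n              = c(146100L, 1L, 1L)
)

# Assumed rule: the variant without the middle initial is a write-in,
# so relabel its party and then collapse the duplicated rows
fixed <- payne |>
  mutate(party_detailed = if_else(candidate == "DONALD PAYNE JR",
                                  "WRITEIN", party_detailed)) |>
  group_by(office, candidate, party_detailed) |>
  summarise(n = sum(n), .groups = "drop")

fixed
```

Under that rule the stray DEMOCRAT row disappears, so the county would no longer show a Democratic candidate without a match on the other side.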

I linked Kenosha and Maryland to this issue too.

mreece13 commented 2 months ago

Lake and Shasta resolved.

mreece13 commented 2 months ago

Some notes on the missing candidates: