kuriwaki / cvr_harvard-mit_scripts


All-undervote districts in GA state house #327

Closed kuriwaki closed 2 months ago

kuriwaki commented 2 months ago

MEDSL (and by implication the release) has too many undervotes in the Georgia state house. Harvard records 309k undervotes in the state house, but MEDSL reports 659k (below). We were not checking undervotes before, so this was not caught.

Fortunately, this seems to be limited to office == "STATE HOUSE".

One example that shows how this happens is this precinct in Fayette County:

open_dataset("release") |> filter(state == "GEORGIA", precinct == "358-Camp Creek", office == "STATE HOUSE") |> count(district) |> collect()
#   district     n
#   <chr>    <int>
# 1 072       1539
# 2 073       1539

It shows two districts, but there is only one state house district in the precinct, 72. The entries for district 73 are wrong and are all undervotes. Harvard does not have this issue.

Fayette County does contain both districts, just not in that precinct (https://fayettecountyga.gov/elections/archives/2020-11-03-ElectionSummaryReportRPTNov3.pdf). I don't yet know the underlying issue or the full set of affected precincts.
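One possible way to look for the affected precincts, sketched on a made-up toy table rather than the real release: flag every precinct-district pair whose records are all undervotes. The column names `precinct`, `district`, and `candidate` follow the queries in this issue; the toy rows, the "UNDERVOTE" label, and the name `all_undervote` are assumptions for illustration only.

```r
library(dplyr)

# Toy stand-in for the STATE HOUSE records (made-up rows, not real data)
toy <- tibble::tibble(
  precinct  = c("A", "A", "A", "A", "B", "B"),
  district  = c("072", "072", "073", "073", "073", "073"),
  candidate = c("SMITH", "UNDERVOTE", "UNDERVOTE", "UNDERVOTE", "JONES", "UNDERVOTE")
)

# Keep only precinct-district pairs in which every record is an undervote
all_undervote <- toy |>
  group_by(precinct, district) |>
  summarize(
    n = n(),
    n_undervote = sum(candidate == "UNDERVOTE"),
    .groups = "drop"
  ) |>
  filter(n_undervote == n)

all_undervote
```

In the toy table only precinct A, district 073 is flagged. The same verbs could presumably be applied to the collected Georgia rows to list candidate precincts for manual review.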

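library(arrow)  # open_dataset()
library(dplyr)  # filter(), mutate(), count(), collect()
library(fs)     # path()

PATH_parq <- "~/Dropbox/CVR_parquet"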

open_dataset(path(PATH_parq, "medsl")) |>
  filter(state == "GEORGIA", office == "STATE HOUSE") |>
  collect() |>
  as_tibble() |>
  mutate(party_detailed = ifelse(candidate == "UNDERVOTE", "UNDERVOTE", party_detailed)) |>
  count(party_detailed, sort = TRUE)

# # A tibble: 6 × 2
#   party_detailed       n
#   <chr>            <int>
# 1 REPUBLICAN     1805320
# 2 DEMOCRAT       1558307
# 3 UNDERVOTE       659668
# 4 OTHER            27690
# 5 INDEPENDENT        413
# 6 WRITEIN              1

open_dataset(path(PATH_parq, "harvard")) |>
  filter(state == "GEORGIA", office == "STATE HOUSE") |>
  collect() |>
  as_tibble() |>
  count(party_detailed, sort = TRUE)

#   party_detailed       n
#   <chr>            <int>
# 1 "REPUBLICAN"   1795853
# 2 "DEMOCRAT"     1556929
# 3 "undervote"     309575
# 4 "W-I"            27509
# 5 "INDEPENDENT"      280
# 6 ""                 133
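To make the gap explicit, here is a minimal sketch that lines up the three largest categories from the two tables above and computes the difference. The counts are copied from the output shown in this issue; the object names (`medsl`, `harvard`, `comparison`) and the choice to upper-case the labels so that "undervote" matches "UNDERVOTE" are assumptions for illustration.

```r
library(dplyr)

# Counts copied from the two tables above (top three categories only)
medsl <- tibble::tibble(
  party_detailed = c("REPUBLICAN", "DEMOCRAT", "UNDERVOTE"),
  n = c(1805320, 1558307, 659668)
)
harvard <- tibble::tibble(
  party_detailed = c("REPUBLICAN", "DEMOCRAT", "undervote"),
  n = c(1795853, 1556929, 309575)
)

# Harmonize label case, join the two sources, and compute the gap
comparison <- full_join(
  medsl |> mutate(party_detailed = toupper(party_detailed)),
  harvard |> mutate(party_detailed = toupper(party_detailed)),
  by = "party_detailed",
  suffix = c("_medsl", "_harvard")
) |>
  mutate(diff = n_medsl - n_harvard)

comparison
```

The party counts differ by a few thousand at most, while the undervote row differs by 350,093, which is the discrepancy this issue is about.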

mreece13 commented 2 months ago

Fixed this. It was related to the manual classification, along with some other misclassification issues in Georgia. I probably did this part of the state late at night and was not as accurate. Pending build.

mreece13 commented 2 months ago

Resolved now.