Closed kuriwaki closed 2 months ago
Once we decide what to do with this, my recommendation is to then change Howie Hawkins' party designation to a write-in (strip the Green designation) in
this is according to my assessment of #298
I think the correction to the classification script was the correct choice; we cannot add anything to the CVR data that is not there. I can change Howie Hawkins to a WRITEIN party in those jurisdictions; I think that makes sense as well. Pending build.
I think we want to change more than Hawkins. Currently the candidate == "WRITEIN" records are listed as party_detailed == "OTHER" in CVR_parquet/medsl. So I think we want to change them all to party_detailed == "WRITEIN", and change the designation of qualified write-ins like Steve Zorn to party == WRITEIN (or, as Baltz et al. do, make a writein column).
library(tidyverse)
library(arrow)
# current data -- should be party = WRITEIN
open_dataset("release") |>
  filter(candidate == "STEVE ZORN") |>
  count(state, candidate, party_detailed) |>
  collect()
#> # A tibble: 2 × 4
#> state candidate party_detailed n
#> <chr> <chr> <chr> <int>
#> 1 COLORADO STEVE ZORN INDEPENDENT 17
#> 2 COLORADO STEVE ZORN DEMOCRAT 16
# what about other writeins?
open_dataset("release") |>
  filter(candidate == "WRITEIN") |>
  count(state, candidate, party_detailed) |>
  collect()
#> # A tibble: 15 × 4
#> state candidate party_detailed n
#> <chr> <chr> <chr> <int>
#> 1 ARIZONA WRITEIN OTHER 29145
#> 2 CALIFORNIA WRITEIN OTHER 5615
#> 3 COLORADO WRITEIN OTHER 1275
#> 4 FLORIDA WRITEIN OTHER 1905
#> 5 FLORIDA WRITEIN NONPARTISAN 698
#> 6 GEORGIA WRITEIN OTHER 73150
#> 7 ILLINOIS WRITEIN OTHER 121
#> 8 MARYLAND WRITEIN OTHER 2474
#> 9 MICHIGAN WRITEIN OTHER 316
#> 10 NEW JERSEY WRITEIN OTHER 3961
#> 11 OHIO WRITEIN OTHER 8322
#> 12 OREGON WRITEIN OTHER 14046
#> 13 TEXAS WRITEIN OTHER 1426
#> 14 WISCONSIN WRITEIN OTHER 8907
#> 15 IOWA WRITEIN OTHER 184
# need to fix Florida
# what about Baltz data
open_dataset("returns/by-county/") |>
  filter(candidate == "WRITEIN") |>
  count(candidate, party_detailed) |>
  collect()
#> # A tibble: 1 × 3
#> candidate party_detailed n
#> <chr> <chr> <int>
#> 1 WRITEIN WRITEIN 20850
# what about Steve Zorn in Baltz data?
open_dataset("returns/by-county/") |>
  filter(candidate == "STEVE ZORN") |>
  count(candidate, writein, party_detailed) |>
  collect()
#> # A tibble: 1 × 4
#> candidate writein party_detailed n
#> <chr> <dbl> <chr> <int>
#> 1 STEVE ZORN 1 DEMOCRAT 2
Created on 2024-07-05 with reprex v2.1.0
One correction to my reprex above. The non-qualified write-ins are actually listed as NA in the Baltz data (except in DC, and state leg offices for KS + SC, for some reason). I was just overwriting them with WRITEIN.
@mreece13 you can check out my commit in https://github.com/kuriwaki/cvr_harvard-mit_scripts/commit/b030c4d6279f0df315da27783c542d0cdb3bf572, which overwrites the party of any record with candidate == "WRITEIN" (let's call them unqualified write-ins) to party = WRITEIN.
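For readers following along, the recode described above can be sketched in a few lines of dplyr. This is a minimal illustration on made-up rows, not the code in the linked commit; the toy table `cvr` and its values are assumptions.

```r
library(dplyr)

# Toy records for illustration only (not real counts)
cvr <- tibble(
  state          = c("ARIZONA", "FLORIDA", "FLORIDA", "COLORADO"),
  candidate      = c("WRITEIN", "WRITEIN", "WRITEIN", "STEVE ZORN"),
  party_detailed = c("OTHER", "OTHER", "NONPARTISAN", "DEMOCRAT")
)

# Overwrite the party only where the candidate is the generic write-in;
# every other record keeps whatever party it already had
recoded <- cvr |>
  mutate(party_detailed = if_else(candidate == "WRITEIN", "WRITEIN", party_detailed))

count(recoded, candidate, party_detailed)
```

Because the condition only looks at `candidate`, this also handles the Florida rows that were NONPARTISAN rather than OTHER.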
That looks good to me. Most of the MEDSL data should also reflect this change now (I am currently syncing a new version of the data to Dropbox). I have only re-built the counties we were considering releasing, so some counties are still missing it.
Ok, so unqualified write-ins seem standardized now. Great! I'll cautiously close this momentarily with the pull request.
Qualified write-ins are hard to mark as party_detailed == "WRITEIN" because we cannot easily detect them, especially if we have accidentally given them parties. I guess that's something to note. By the way, CVRs probably lose write-ins completely if they are in a csv format from ESS DS200 (#123).
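One rough heuristic for surfacing candidates who may have been given parties by accident is to look for names that carry more than one party label within a state, as Steve Zorn does in the reprex above. The sketch below uses toy rows; JANE DOE is a hypothetical candidate, and a real check would probably also group by office and district. It flags suspects for manual review and does not identify qualified write-ins on its own.

```r
library(dplyr)

# Toy records; STEVE ZORN's two labels mirror the reprex, JANE DOE is hypothetical
cvr <- tibble(
  state          = c("COLORADO", "COLORADO", "COLORADO", "COLORADO"),
  candidate      = c("STEVE ZORN", "STEVE ZORN", "JANE DOE", "JANE DOE"),
  party_detailed = c("INDEPENDENT", "DEMOCRAT", "REPUBLICAN", "REPUBLICAN")
)

# Candidates recorded under more than one party label within a state
suspects <- cvr |>
  filter(candidate != "WRITEIN") |>
  distinct(state, candidate, party_detailed) |>
  count(state, candidate, name = "n_parties") |>
  filter(n_parties > 1)

suspects
```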
Remaining issues on this thread like Kenosha and Maryland seem more like things to change on the returns side, not the CVR side (#328). I will transfer those issues there.
I also verified Howie Hawkins
Making an issue so we can resolve it before release. There are a few counties where this seems to be causing unnecessary red flags.
Qualified writeins seem to be writeins whose name and party are not printed on the ballot, but are sometimes recorded as having a party when a voter writes in their name. (see https://github.com/kuriwaki/cvr_harvard-mit_scripts/pull/319#issuecomment-2206828384)
I see four options for these:
1. Keep party_detailed as D/R
2. party_detailed == "OTHER", just like a generic write-in (candidate == "WRITEIN")
3. party_detailed == NA (EDIT 7/6: or party_detailed == "WRITEIN"). Currently almost all unqualified write-ins in medsl (Reece) have party_detailed == "OTHER"
4. Make a writein column like the Baltz et al. MEDSL format

I lean towards option 3, and if we have time, 4. The Baltz et al. dataset seems to do only 4, while retaining the party of the qualified write-in. For example Steve Zorn of CO-07 noted in #196 is listed the following way in the @sbaltzmit precinct sql file. This is rational, but I'm not sure if we can ensure this for all our cvrs at this point.
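To make options 3 and 4 concrete, here is a hedged sketch that combines them: a `writein` flag in the spirit of the Baltz et al. format, plus party_detailed set to "WRITEIN" for flagged rows. It assumes a hand-maintained lookup of known qualified write-ins (`qualified_writeins`, hypothetical), since the CVRs themselves do not reliably mark them; JANE DOE and the helper column `is_qualified` are also made up for illustration.

```r
library(dplyr)

# Hypothetical hand-maintained list of known qualified write-ins
qualified_writeins <- tibble(
  state        = "COLORADO",
  candidate    = "STEVE ZORN",
  is_qualified = TRUE
)

# Toy CVR-style records
cvr <- tibble(
  state          = c("COLORADO", "COLORADO", "COLORADO"),
  candidate      = c("STEVE ZORN", "WRITEIN", "JANE DOE"),
  party_detailed = c("DEMOCRAT", "OTHER", "REPUBLICAN")
)

flagged <- cvr |>
  left_join(qualified_writeins, by = c("state", "candidate")) |>
  mutate(
    # Option 4: flag both generic and known qualified write-ins
    writein = as.integer(candidate == "WRITEIN" | coalesce(is_qualified, FALSE)),
    # Option 3 (7/6 variant): give flagged rows the WRITEIN party
    party_detailed = if_else(writein == 1L, "WRITEIN", party_detailed)
  ) |>
  select(-is_qualified)

flagged
```

Dropping the second `mutate()` argument would give option 4 alone, which keeps the recorded party as the Baltz et al. data appear to do.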
Relevant issues:
#196
#83
https://github.com/kuriwaki/cvr_harvard-mit_scripts/issues/118#issuecomment-2128234642
Maryland contests noted by @jloffredo2 in https://github.com/kuriwaki/cvr_harvard-mit_scripts/pull/319#issuecomment-2206662954, copied here:
Baltimore County (MD): CHARLES U and RAY BLY are now listed as candidates for US HOUSE 007 in the precinct returns. They are listed as write-in candidates with D/R designations according to https://elections.maryland.gov/elections/2020/results/general/gen_results_2020_4_00807.html and precinct results but are not in the CVR. This was not previously listed by me in https://github.com/kuriwaki/cvr_harvard-mit_scripts/issues/222
Baltimore City (MD): CHARLES U and RAY BLY are now listed as candidates for US HOUSE 007. They are listed as write-in candidates with D/R designations according to https://elections.maryland.gov/elections/2020/results/general/gen_results_2020_4_00807.html and precinct results but are not in the CVR. This was not previously listed by me in https://github.com/kuriwaki/cvr_harvard-mit_scripts/issues/223
Carroll (MD): LIH YOUNG is now listed as a candidate for US HOUSE 008. They are listed as a write-in candidate with a D designation according to https://elections.maryland.gov/elections/2020/results/general/gen_results_2020_4_00808.html and precinct results but are not in the CVR. This was not previously listed by me in https://github.com/kuriwaki/cvr_harvard-mit_scripts/issues/224
Frederick (MD): LIH YOUNG is now listed as a candidate for US HOUSE 008. They are listed as a write-in candidate with a D designation according to https://elections.maryland.gov/elections/2020/results/general/gen_results_2020_4_00808.html and precinct results but are not in the CVR. This was not previously listed by me in https://github.com/kuriwaki/cvr_harvard-mit_scripts/issues/226
Howard (MD): CHARLES U and RAY BLY are now listed as candidates for US HOUSE 007. They are listed as write-in candidates with D/R designations according to https://elections.maryland.gov/elections/2020/results/general/gen_results_2020_4_00807.html and precinct results but are not in the CVR. This was not previously listed by me in https://github.com/kuriwaki/cvr_harvard-mit_scripts/issues/229
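Mismatches like the Maryland ones above, where a candidate appears in the precinct returns but not in the CVR, can be listed with an anti-join. The sketch below is an assumption-laden illustration: the column names (`county`, `office`, `district`, `candidate`) and the toy rows are made up, and JANE DOE is a hypothetical on-ballot candidate.

```r
library(dplyr)

# Toy precinct-returns candidates for one contest
returns <- tibble(
  county    = "HOWARD",
  office    = "US HOUSE",
  district  = "007",
  candidate = c("CHARLES U", "RAY BLY", "JANE DOE")
)

# Toy CVR candidates for the same contest
cvr <- tibble(
  county    = "HOWARD",
  office    = "US HOUSE",
  district  = "007",
  candidate = "JANE DOE"
)

keys <- c("county", "office", "district", "candidate")

# Candidates present in the returns with no matching CVR record
missing_from_cvr <- anti_join(
  distinct(returns, across(all_of(keys))),
  distinct(cvr, across(all_of(keys))),
  by = keys
)

missing_from_cvr
```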