In the current release, exactly one entry has a party (short) value of W-I. This is a voter in Camden, NJ who wrote in "PROGRESSIVE" as a candidate for NJ-01 House. I would like to remove this value because it is confusing: it makes it look as if there is only one write-in vote in the whole dataset.
> count(ds, party) |> collect()
# A tibble: 7 × 2
  party        n
  <chr>    <int>
1 DEM   82384509
2 REP   68311630
3 LBT    1009953
4 NA     6753458
5 OTH    2249594
6 GRN     209927
7 W-I          1
What I have tried:
I have already updated the formatting code so that it no longer produces a W-I value: https://github.com/kuriwaki/cvr_harvard-mit_scripts/commit/4354e5315bd8b8144abdda4911d2688b3e001d17. Even though I have run this new code, the Camden parquet partition is not updated, so the W-I persists. I have tried setting the existing_data_behavior argument in open_dataset to both "overwrite" and "delete_matching", but Camden does not seem to cooperate(?)
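For reference, here is a minimal sketch of how a single partition could be rewritten, in case it helps narrow things down. As far as I know, existing_data_behavior is an argument of arrow::write_dataset() rather than open_dataset(), and "overwrite" only replaces files whose names match the newly written ones, so a stale file with a different name could survive. The toy data, the `county` partitioning column, the `root` path, and the recode of W-I to NA are all assumptions made for illustration; they are not the actual release layout or what the linked commit does.

```r
library(arrow)
library(dplyr)

# Hypothetical toy data standing in for the real release (assumption)
toy <- tibble::tibble(
  county = c("CAMDEN", "CAMDEN", "ESSEX"),
  party  = c("DEM", "W-I", "REP")
)

# Hypothetical dataset root, partitioned by county (assumption)
root <- file.path(tempdir(), "cvr_demo")
unlink(root, recursive = TRUE)
write_dataset(toy, root, partitioning = "county")

# Stand-in for the updated formatting code: recode W-I to NA (assumption)
fixed <- toy |>
  mutate(party = if_else(party == "W-I", NA_character_, party))

# Rewrite only the Camden partition. "delete_matching" removes the existing
# files in every partition directory that this call writes to, then writes
# the new files; partitions not present in the input are left alone.
write_dataset(
  fixed |> filter(county == "CAMDEN"),
  root,
  partitioning = "county",
  existing_data_behavior = "delete_matching"
)

# Check the party values again after the rewrite
open_dataset(root) |>
  count(party) |>
  collect()
```

If W-I still shows up after a rewrite like this, it may be worth listing the files under the Camden partition directory to see whether an older parquet file with a different name is still there.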