Closed by nbdavies 5 years ago
I revised parser.py to print a note when columns are missing in these 2000-2010 files. This occurs in 3 data sections in fall primary election files (once in 2006, twice in 2008).
I've added a kludge to set the district to 14 when the office column is missing, as that occurs only in one section of Libertarian_2008_FallElection_StateSenator_WardbyWard.xls (id 431).
These changes are in my branch dp-add-district-to-tests.
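For anyone skimming, here is a rough sketch of what the district-14 kludge amounts to. It is not the code on the branch: `DISTRICT_OVERRIDES` and `resolve_district` are made-up names, and keying the override on the file id is my assumption about how one might scope it.

```python
# Hypothetical sketch; names and structure are assumptions, not parser.py itself.

# File id -> district to use when the office column is missing from a section.
# Only file 431 (Libertarian_2008_FallElection_StateSenator_WardbyWard.xls)
# is known to need this.
DISTRICT_OVERRIDES = {431: "14"}


def resolve_district(file_id, parsed_district, office_column_present):
    """Return the district for a section, falling back to a hard-coded override.

    When the office column is present, the district parsed from it is trusted.
    When it is missing, look up a per-file override; return "" if none exists.
    """
    if office_column_present:
        return parsed_district
    return DISTRICT_OVERRIDES.get(file_id, "")
```

Scoping the override by file id keeps the kludge from silently applying if another file turns up with a missing office column.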
This test is currently failing:
Because the column is in fact blank in the CSV output:
And that's because the table in the source file sits in the middle of a pre-2010 file and has some of its columns cut off. We would normally extract the district from the office name column, but that column isn't present here. We copy the office name over from the previous chunk of the file, which is correct. Carrying over the district number the same way would be incorrect, though: the previous section of the file is for district 12, and the next section is for district 16.
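To make the carry-over behaviour concrete, here is a minimal sketch of the logic described above. The office-name format (`"STATE SENATOR - DISTRICT 12"`), the chunk dictionaries, and the function names are all assumptions for illustration; the real parser may represent these differently.

```python
import re

# Assumed office-name layout: "<office> - DISTRICT <number>". Hypothetical.
OFFICE_RE = re.compile(r"^(.*?)\s*[-,]?\s*DISTRICT\s+(\d+)\s*$", re.IGNORECASE)


def split_office(office_name):
    """Split an office-name cell into (office, district); district is "" if absent."""
    match = OFFICE_RE.match(office_name)
    if match:
        return match.group(1), match.group(2)
    return office_name, ""


def label_chunks(chunks):
    """Assign (office, district) to each chunk of a file.

    A chunk whose office column is missing inherits the office from the
    previous chunk, but its district is left blank rather than inherited.
    """
    labelled = []
    previous_office = ""
    for chunk in chunks:
        office_name = chunk.get("office")
        if office_name:
            office, district = split_office(office_name)
            previous_office = office
        else:
            # Office carries over; district deliberately does not.
            office, district = previous_office, ""
        labelled.append((office, district))
    return labelled
```

With chunks for district 12, a chunk with no office column, and then district 16, the middle chunk comes out as `("STATE SENATOR", "")`, which matches the blank column in the CSV output.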