openelections / openelections-core

Core repo for election results data acquisition, transformation and output.
MIT License

Should precinct-level results have a county field baked in as well? #243

Closed dwillis closed 9 years ago

dwillis commented 9 years ago

I don't want to open a can of worms, but it seems to me that it might be helpful for people using the CSV files if we had a county string field in RawResult in addition to the OCD Division ID. Thoughts?

ghing commented 9 years ago

The need for this makes sense to me, but we'll have to update all the existing loaders.

I wonder if we should call it county or parent_jurisdiction (or other things, but you get the distinction) just to keep our data model flexible.

dwillis commented 9 years ago

Yeah, I figure if we're gonna do this, the sooner the better. I think parent_jurisdiction is probably a better label, given that some states have parishes instead of counties and others have cities that are independent of counties.

zstumgoren commented 9 years ago

@dwillis Is the county name already in the source data, or would it have to be dynamically determined in some way? If the former, the parent_jurisdiction field seems like a good idea; if the latter, it seems like a natural fit for a transform step downstream.

dwillis commented 9 years ago

In each state I've done, the county name is in the source data, but I suppose that might not always be the case.

zstumgoren commented 9 years ago

@dwillis Sounds like a case where we want to promote an expando field to a formal field. That makes sense. Can we make this a non-required field for cases where there is no parent jurisdiction native to the source data? That way the assignment of county name can be deferred to the transform stage, as needed.

dwillis commented 9 years ago

@zstumgoren works for me.

zstumgoren commented 9 years ago

Cool. So barring any final thoughts from @ghing, sounds like we should go forward with adding the parent_jurisdiction field.

ghing commented 9 years ago

My experience has also been that the county name is available, either as a field in the file or in the metadata, because the precinct-level results come as one file per county.

I agree that this field should be listed in the RawResult model definition, but should be an optional field.
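
For illustration only, here is a rough sketch of what an optional parent_jurisdiction could look like. It uses a plain dataclass as a stand-in rather than the project's actual RawResult definition. The class name `RawResultSketch`, the other field names, and the sample values are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RawResultSketch:
    """Hypothetical stand-in for RawResult; not the project's real model."""
    ocd_id: str        # OCD Division ID for the reporting unit
    jurisdiction: str  # precinct name as it appears in the source
    votes: int
    # Optional: stays None when the source has no parent jurisdiction.
    parent_jurisdiction: Optional[str] = None


# Made-up sample values: one result with a county, one without.
with_county = RawResultSketch(
    ocd_id="ocd-division/country:us/state:md/county:allegany/precinct:1-0",
    jurisdiction="1-0",
    votes=120,
    parent_jurisdiction="Allegany",
)
without_county = RawResultSketch(
    ocd_id="ocd-division/country:us/state:md/precinct:2-1",
    jurisdiction="2-1",
    votes=45,
)
```

Because the field defaults to None, loaders that are not updated to set it would keep working under this sketch.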

Finally, I agree that we should set this in loaders only when it's easy, and defer more complex lookups to a transform.
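
A minimal sketch of that split, assuming results are plain dicts and the source rows have `precinct` and (sometimes) `county` columns. The function names `load_row` and `fill_parent_jurisdiction`, the column names, and the lookup table are hypothetical and not part of the existing loader or transform code.

```python
from typing import Dict, Optional

Result = Dict[str, Optional[str]]


def load_row(row: Dict[str, str]) -> Result:
    """Loader step: copy the county only if the source row already has it."""
    county = (row.get("county") or "").strip()
    return {
        "jurisdiction": row["precinct"],
        # Leave the field empty rather than guessing in the loader.
        "parent_jurisdiction": county or None,
    }


def fill_parent_jurisdiction(result: Result, lookup: Dict[str, str]) -> Result:
    """Transform step: fill a missing parent from a precinct-to-county lookup."""
    if result["parent_jurisdiction"] is None:
        parent = lookup.get(result["jurisdiction"])
        # Return a copy so the raw result is left untouched.
        return dict(result, parent_jurisdiction=parent)
    return result


# Made-up rows: the first carries a county, the second does not.
easy = load_row({"precinct": "1-0", "county": "Allegany"})
deferred = load_row({"precinct": "2-1"})
resolved = fill_parent_jurisdiction(deferred, {"2-1": "Garrett"})
```

Here the loader never performs a lookup; anything it cannot read directly from the row is left for the transform to resolve.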