opencivicdata / docs.opencivicdata.org

Open Civic Data project documentation
https://open-civic-data.readthedocs.io

Do Filing coverage start and end need to allow precise times? #97

Open gordonje opened 6 years ago

gordonje commented 6 years ago

On the Filing data type in the campaign finance enhancement proposal, coverage_start_date and coverage_end_date are defined as date and (possibly) time values.

None of the filings we'll be loading from California will have time data associated with these values. Obviously, that's just one state, but do we have a sense of how many jurisdictions will have precise start and end times for each filing's coverage? And even if these precise times are often available, how important is this precision beyond the start and end date?

Maybe I'm getting too far into the implementation details in this space, but I want to be clear what I'm doing. For now, I'll leave these as DateTime fields and plan to set the time portion to midnight UTC when unknown.
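As a rough illustration of that stopgap (not the actual python-opencivicdata code), padding a date-only value with midnight UTC could look like the sketch below. The helper name `coverage_datetime` and the sample date are hypothetical.

```python
from datetime import date, datetime, time, timezone


def coverage_datetime(d: date) -> datetime:
    """Pad a date-only coverage value with midnight UTC (hypothetical helper)."""
    return datetime.combine(d, time.min, tzinfo=timezone.utc)


# Sample value chosen for illustration only.
start = coverage_datetime(date(2017, 1, 1))
print(start.isoformat())  # 2017-01-01T00:00:00+00:00
```

Note that the padded time is a placeholder, so a consumer cannot tell it apart from a filing whose coverage really did begin at midnight UTC.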

If the precision is important, then we might consider, someday, converting these to "fuzzy" DateTime fields that don't require time parts. OCD is already doing something like this for date fields where parts of the date are missing. But it would be better if we converted these to a custom model field with field lookups allowing users to query these fields as if they were regular Date or DateTime fields.

jsfenfen commented 6 years ago

+1 for making these date only fields, definitely don't wanna get into representing midnight in the correct timezone. 13-ish US states span time zones, don't even want to think about that.
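A small sketch of the kind of problem this avoids, assuming a value stored as midnight UTC and a reader in a fixed UTC-8 offset (a simplified stand-in for Pacific time that ignores daylight saving):

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-8 offset used for illustration; real zones also have DST rules.
PST = timezone(timedelta(hours=-8), "PST")

# A coverage start stored as "midnight UTC" on 1 January 2017.
midnight_utc = datetime(2017, 1, 1, tzinfo=timezone.utc)

# Converted to the local offset, the value falls on the previous day.
local = midnight_utc.astimezone(PST)
print(local.date())  # 2016-12-31
```

A plain date field has no offset to convert, so the reported coverage date stays whatever the filing said it was.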

gordonje commented 6 years ago

@jsfenfen Me neither...

aepton commented 6 years ago

I vote for date only. In the limited set of circumstances where we might conceivably care about times and not just dates (like ordering filings to pick the most recent), a) I'm not sure how much we should trust each system in the chain from Committee -> Regulator -> Us to handle times (and time zones, especially) properly; b) it seems good to be clear about what we don't know, and rely on some other signal to determine precedence.

gordonje commented 6 years ago

Sounds like we don't need the time then. I'm going to switch these from datetime to date fields in our implementation in python-opencivicdata.

Do we think we'll almost always have a full date for coverage start and end? Do we want to treat a partial date value for either of these fields as an error? If not, then I can switch these to the fuzzy-date char field found elsewhere in OCD.

aepton commented 6 years ago

Yeah, I think so. I can certainly imagine some situations where we've just got a month and year, or even a year, but I think the number of truly-non-date instances will be small and we should error on those that lack a specific date. I hope that sentence made sense.
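To make the "error on partial dates" rule concrete, here is a minimal sketch. It assumes coverage values arrive as strings and that partial dates look like `YYYY` or `YYYY-MM`; the function name `parse_coverage_date` is hypothetical and not part of python-opencivicdata.

```python
import re
from datetime import date

# Only a complete YYYY-MM-DD string is accepted.
FULL_DATE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def parse_coverage_date(value: str) -> date:
    """Return a date, or raise ValueError for partial or malformed values."""
    if not FULL_DATE.fullmatch(value):
        raise ValueError(f"coverage date must be a full YYYY-MM-DD date, got {value!r}")
    # fromisoformat also rejects impossible dates such as 2017-02-30.
    return date.fromisoformat(value)


print(parse_coverage_date("2017-03-31"))  # 2017-03-31

for partial in ("2017-03", "2017"):
    try:
        parse_coverage_date(partial)
    except ValueError as exc:
        print("rejected:", exc)
```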

aepton commented 6 years ago

Fixed in https://github.com/opencivicdata/docs.opencivicdata.org/pull/104