open-data-standards / permitdata.org

:hammer: A website for the BLDS Data Specification
https://permitdata.org
39 stars 7 forks source link

Duplicate fields for date/time in Inspections dataset #58

Open mmartin78 opened 9 years ago

mmartin78 commented 9 years ago

Why does the inspection dataset has two separate fields for date and time for things like InspectedDate, ScheduledDate and so on? This seems redundant a single date field can normally incorporate both date and time in any chosen standard format.

bettin commented 9 years ago

By using a combined timestamp field, you are required to enter a time (00:00:00 even). I'm just getting up to speed, but going to guess that they are separate fields so you do not need a specific time.

mmartin78 commented 9 years ago

Normally you are not required to enter time for date fields. Time will default to 00:00:00 if you don't enter one.

bettin commented 9 years ago

Correct. 00:00:00 would define the time as midnight, even if it isn't the actual time of the inspection. Once converted, there is no way to determine if the original time was 00:00:00 or NULL.

mmartin78 commented 9 years ago

Are jurisdictions really doing inspections at midnight? It does not seem like this would be a problem...

bettin commented 9 years ago

It depends on how the timestamps are stored. If they are stored in UTC, to remove all timezone issues, this would be a very valid time for inspections.

Essentially the main issue of adding a timestamp if there is no defined time is the same as adding erroneous data. If "time" was a required field instead of optional, then a timestamp format would make sense.

mmartin78 commented 9 years ago

@bettin you have a good point, but the standard, as it is, does not specify UTC or any time zone, which in itself would seem to be a problem as well. Also specifying time as a free text field does not seem to me like a good solution for a standard since it is open for people to put anything in there.

Seems like a better solution is needed, wouldn't you agree? Will look in more detail at timestamp formats to see if there is an option that can address your concern.

bettin commented 9 years ago

Yes, a text field is not the ideal solution, so if you have a type that would be great. I've heard some people use INTs as seconds to count from 00:00:00, but that seems overly complex.

It would be nice to know how many rows do not include a time value. Also, it would be great to have a standard timezone (UTC always gets my vote).

mmartin78 commented 9 years ago

Looking a bit more at the possibilities it seems that a common way to represent dates / times on the internet that meets our requirements (i.e. all in one field, but still be able to specify optional time) is to use ISO 8601 format or it's simpler profile described in RFC 3339.

I believe this takes care of all the concerns, most software tools can export data in this formats and it is already a standard.

bettin commented 9 years ago

Yes, that could be interesting.