swtools / WOFpy

DEPRECATED! WOFpy development has moved to ODM2/WOFpy
11 stars 14 forks source link

Who formats the ISO times #6

Open twhiteaker opened 13 years ago

twhiteaker commented 13 years ago

Shall we require DAO to return ISO date strings, or shall WOF handle that? Performance hit if both do it, so we should specify that it should be one or the other.

wilsaj commented 13 years ago

I agree that parsing date strings twice is no good, but I'd really like to have the option of using python datetime objects because:

  1. parsing date strings is annoying
  2. it's not uncommon for data models to hand you datetime objects anyway (e.g. via SQLAlchemy's datetime column type)

So I propose that if a date is given as a string, we do nothing, if it is a datetime object then we parse it in wofpy. Performance-wise, this just adds two type checks if you are using strings, which is rather trivial.

I should have grouped these as a pull request, but see commits: https://github.com/swtools/WOFpy/commit/a7097f0d32b7cf47b62fc473e35727e520ef156e https://github.com/swtools/WOFpy/commit/7d81236dfe2bae530647768604be65964accbbcd

twhiteaker commented 13 years ago

sounds good. i like raising the value error if datetimes are passed without tzinfo. can the default be to report local time rather than utc time?

wilsaj commented 13 years ago

The only time the default comes up is if both UTC and localtime are defined. I'd like to hear thoughts in favor of local time. I lean towards UTC for a few (minor) reasons:

I often encounter situations where I need to compare timeseries data where multiple offsets are used. This happens when data come from different data sources, locations or from the same location with different daylight savings time offsets. Converting between just a few timezones and DST settings gets hairy.

On the storage end there are a few datastores (e.g. sqlite and mysql) that don't support arbitrary timezone offsets in their datetime datatypes. They can only store either UTC or a single server-wide offset that is applied to datetimes. For datastores that do have proper support for timezone offsets in their datetime datatypes, the timezone'd versions are often more expensive (2 extra bytes per datetime on MS SQL Server).

It's possible to work around such issues and limitations but in general it's just simpler to store and compare datetimes that are represented in UTC.

twhiteaker commented 13 years ago

Using local times is convenient for me. I like being able to download a time series and look at a chart exhibiting diurnal variations without converting to local datetimes either in the software or in my head. Example: Is the pattern for this watershed in Arizona from sunrise to noon the same as the pattern in a watershed in Maine from sunrise to noon?

I suppose software should ask you what time zone you want to display data in and do the conversion for you. Our current HIS client software doesn't do that. Converting between time zones shouldn't get hairy if UTC offset is specified.

As an example on the storage side, the ODM uses three non-timezone-aware fields related to datetimes: LocalDateTime (date), DateTimeUTC (date), and UTCOffset (float, in hours).

If we go with UTC time, then it would be convenient if the time zone information was always included in the site info part of the WaterML response.

wilsaj commented 13 years ago

My frustration is with existing services that don't provide offsets. If you are looking at general trends this doesn't matter as much but specific times could matter if, for example, you are looking at rain gauge data for a particular storm event and comparing those with other meteorological data.

I'd like to discourage, as much as possible, ambiguous datetimes. But if a service is providing ambiguous datetimes, that's an issue with the service itself and there are probably better ways to enforce that then giving UTC precedence here.

twhiteaker commented 13 years ago

For that reason, I liked how you raised an error if the datetime object that wof received was not time zone aware.