ropensci / weathercan

R package for downloading weather data from Environment and Climate Change Canada
https://docs.ropensci.org/weathercan
GNU General Public License v3.0
102 stars 29 forks

Default to UTC for multiple timezones #37

Closed steffilazerte closed 6 years ago

steffilazerte commented 7 years ago

Allow the option of converting times to 'relative' time, i.e. if there are multiple timezones, force each timezone to UTC (relabel the times as UTC without converting them). This means that 9am in TZ1 corresponds to 9am UTC, and 9am in TZ2 also corresponds to 9am UTC. This way daily patterns are directly comparable, but the timezone attribute is a dummy variable.
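To make the distinction between converting and forcing concrete, here is a minimal sketch in Python (not weathercan's R code). The two fixed-offset zones and the helper names `convert_to_utc` and `relabel_as_utc` are hypothetical and exist only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed-offset zones standing in for two stations'
# local standard times (no daylight saving).
TZ1 = timezone(timedelta(hours=-6))
TZ2 = timezone(timedelta(hours=-8))


def convert_to_utc(dt):
    """True conversion: same instant, different clock reading."""
    return dt.astimezone(timezone.utc)


def relabel_as_utc(dt):
    """'Relative' time: keep the clock reading, swap the label to UTC."""
    return dt.replace(tzinfo=timezone.utc)


nine_tz1 = datetime(2017, 6, 1, 9, 0, tzinfo=TZ1)
nine_tz2 = datetime(2017, 6, 1, 9, 0, tzinfo=TZ2)

# Converted times differ, because 9am happens at different instants.
converted = [convert_to_utc(nine_tz1), convert_to_utc(nine_tz2)]

# Relabelled times are identical, so daily patterns line up,
# but the UTC label no longer describes a real instant.
relative = [relabel_as_utc(nine_tz1), relabel_as_utc(nine_tz2)]
```

With relabelling, both 9am readings compare as equal, which is what makes time-of-day patterns comparable; with conversion they stay two hours apart.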

boshek commented 7 years ago

I'd actually argue that all the data should be in UTC and leave it up to the user to convert.

steffilazerte commented 7 years ago

Right now it follows what ECCC has: local timezone but no daylight saving time (e.g. Etc/GMT+6, which is UTC-6). If the user wants it to be their timezone, they can specify that with the tz_disp argument, including UTC. If they download data from multiple timezones, they're all converted to the first timezone, so for multiple timezones, it would make sense to instead make the default timezone UTC.

In all other cases, however, I think removing the option to specify timezone and converting everything to UTC would be a step backwards in user-friendliness.

The idea I'm referring to here would be to add an argument such as 'relative_tz', which would mean that the times are converted so that they're comparable. With weather data, this is probably not something that will see a lot of use, but in other biological fields, we often don't really care what the instant in time is, but rather, what the relative time of day is.

I'm definitely not going to worry about this option too much. I think it'd be a good idea to make UTC the default for multiple timezones, but timezones are annoying enough for people to deal with that I do not want to remove any functionality that makes it easier on the user.

boshek commented 7 years ago

Interesting. Obviously your final call, but I think the simplest option is to leave all dates in the same timezone (UTC) and leave it up to the user. I think the argument relative_tz introduces confusion because the user has to think about the timezone. If everything is always in UTC then the user can essentially forget about it. It is interesting that other ECCC data sources (hydrometric data, I think) are in UTC. I think that having this in UTC is actually easier on the user because weather patterns move across timezones. The idea of relative_tz, I think, makes this more obscure for the user and therefore less user-friendly. Setting everything to UTC removes any potential for misanalyzed data and addresses looking at weather patterns across multiple timezones without the user having to think about it. My 2 cents.

steffilazerte commented 7 years ago

I'll leave out the relative_tz argument and make it so downloads with multiple timezones default to UTC. I'll leave the default for single timezones as the local tz, though.
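The rule settled on here can be sketched as a small function. This is an illustration in Python under stated assumptions, not the package's actual implementation: `default_display_tz` is a hypothetical name, and only `tz_disp` comes from the discussion above.

```python
def default_display_tz(station_tzs, tz_disp=None):
    """Pick the timezone used to display downloaded times.

    station_tzs: timezone names of the stations being downloaded.
    tz_disp: an explicit user choice, which always wins if given.
    """
    if tz_disp is not None:
        return tz_disp
    unique = set(station_tzs)
    if len(unique) == 1:
        # Single timezone: keep the station's local timezone.
        return next(iter(unique))
    # Multiple timezones (or none known): fall back to UTC.
    return "UTC"


single = default_display_tz(["Etc/GMT+6", "Etc/GMT+6"])
mixed = default_display_tz(["Etc/GMT+6", "Etc/GMT+8"])
explicit = default_display_tz(["Etc/GMT+6", "Etc/GMT+8"],
                              tz_disp="America/Vancouver")
```

Keeping the explicit argument first preserves the existing behaviour for users who already set a display timezone; only the unspecified, multi-timezone case changes.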