Matt-Brigida / EIAdata

R Wrapper for the Energy Information Administration (EIA) API

Timezones confusion #11

Closed (jlprol closed this issue 4 years ago)

jlprol commented 4 years ago

When running getEIA(key = key, ID = "EBA.CAL-ALL.D.HL") (California demand, local time), I get:

Warning messages:
1: In strptime(xx, f, tz = tz) : unknown timezone '%Y%m%d %H:%M:%S'
2: In as.POSIXct.POSIXlt(x) : unknown timezone '%Y%m%d %H:%M:%S'
3: In strptime(xx, f, tz = tz) : unknown timezone '%Y%m%d %H:%M:%S'
4: In as.POSIXct.POSIXlt(x) : unknown timezone '%Y%m%d %H:%M:%S'
5: In strptime(xx, f, tz = tz) : unknown timezone '%Y%m%d %H:%M:%S'
6: In as.POSIXct.POSIXlt(x) : unknown timezone '%Y%m%d %H:%M:%S'
7: In strptime(xx, f, tz = tz) : unknown timezone '%Y%m%d %H:%M:%S'
8: In as.POSIXct.POSIXlt(x) : unknown timezone '%Y%m%d %H:%M:%S'
9: In strptime(xx, f, tz = tz) : unknown timezone '%Y%m%d %H:%M:%S'
10: In as.POSIXct.POSIXlt(x) : unknown timezone '%Y%m%d %H:%M:%S'
11: In strptime(x, f, tz = tz) : unknown timezone '%Y%m%d %H:%M:%S'
12: In as.POSIXct.POSIXlt(as.POSIXlt(x, tz, ...), tz, ...) :
  unknown timezone '%Y%m%d %H:%M:%S'
13: 'xts::`indexTZ<-`' is deprecated.
Use 'tzone<-' instead.
See help("Deprecated") and help("xts-deprecated").

And all the datetime indexes of the resulting xts object are the same:

                                EBA.CAL.ALL.D.HL
2015-07-01 02:00:00            38210
2015-07-01 02:00:00            35171
2015-07-01 02:00:00            33243
2015-07-01 02:00:00            31955
2015-07-01 02:00:00            31199
2015-07-01 02:00:00            31540

Similar behaviour occurs with the UTC version, getEIA(key = key, ID = "EBA.CAL-ALL.D.H"):

Warning messages:
1: In strptime(xx, f, tz = tz) : unknown timezone '%Y%m%d %H:%M:%S'
2: In as.POSIXct.POSIXlt(x) : unknown timezone '%Y%m%d %H:%M:%S'
3: In strptime(x, f, tz = tz) : unknown timezone '%Y%m%d %H:%M:%S'
4: In as.POSIXct.POSIXlt(as.POSIXlt(x, tz, ...), tz, ...) :
  unknown timezone '%Y%m%d %H:%M:%S'
5: 'xts::`indexTZ<-`' is deprecated.
Use 'tzone<-' instead.
See help("Deprecated") and help("xts-deprecated").

In this case the datetime indexes of the xts object are not all the same, but they differ from the actual dates. The datetime of the first observation should be 20150701T05Z (see here: https://www.eia.gov/opendata/qb.php?category=3389936&sdid=EBA.CAL-ALL.D.H), but instead I get:

                       EBA.CAL.ALL.D.H
2015-07-01 10:00:00           38210
2015-07-01 11:00:00           35171
2015-07-01 12:00:00           33243
2015-07-01 13:00:00           31955
2015-07-01 14:00:00           31199
2015-07-01 15:00:00           31540

When I check the timezone() of the xts object, I only get "".

Thank you.

jlprol commented 4 years ago

Actually it can be easily fixed by just getting the UTC version and defining the timezone in the call itself (or later), e.g.

as.xts(getEIA(key = key, ID = "EBA.CAL-ALL.D.H"), tz = "UTC") for UTC or as.xts(getEIA(key = key, ID = "EBA.CAL-ALL.D.H"), tz = "America/Los_Angeles") for the corresponding local time.
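For readers without an API key, the idea behind this workaround can be sketched on a small hand-built series. This is only an illustration: the object `demand` is a hypothetical stand-in for the output of getEIA(), its timestamps are assumed to start at 20150701T05Z as described above, and it uses `tzone<-`, the replacement that the deprecation warning recommends.

```r
library(xts)

# Hypothetical stand-in for the hourly series returned by getEIA():
# six consecutive hours starting at 2015-07-01 05:00 UTC (assumed),
# with the values quoted in the issue used purely for illustration.
utc_times <- as.POSIXct("2015-07-01 05:00:00", tz = "UTC") + 3600 * 0:5
demand <- xts(c(38210, 35171, 33243, 31955, 31199, 31540),
              order.by = utc_times, tzone = "UTC")

# Changing the index time zone only changes how the index is displayed;
# the underlying instants stay the same.
demand_local <- demand
tzone(demand_local) <- "America/Los_Angeles"

tzone(demand)        # time zone of the UTC series
tzone(demand_local)  # time zone of the re-labelled copy
format(index(demand_local)[1], "%Y-%m-%d %H:%M %Z")
```

Because only the display time zone changes, the two objects hold identical observations, so either one can be used for merging with other UTC-indexed series.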

Matt-Brigida commented 4 years ago

Thank you for posting the issue and the workaround. I can replicate them, and will start a new branch to work on a fix.

Matt-Brigida commented 4 years ago

I think I have it (mostly) fixed. Now it should detect whether the hourly series is in UTC or local time and behave accordingly. If it is local time, it tries to establish the time zone via the UTC offset. If it can't determine the time zone, it converts to GMT and returns a message. I have tested everything except the case when it can't find a time zone.
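A rough sketch of the detection step described in this comment, for illustration only. It is not the code in the issue_11_tz_fix branch: the helper name `parse_eia_hourly` is hypothetical, and the local-time stamp layout (a trailing signed two-digit hour offset such as "20150630T22-07") is an assumption. Unlike the fix described here, the sketch does not try to name a time zone from the offset, since one offset can match several zones (for example, -07 is both Pacific Daylight Time and Mountain Standard Time); it simply converts everything to UTC instants.

```r
# Hypothetical helper, not part of EIAdata: parse hourly stamps that are
# either UTC ("20150701T05Z") or local with an hour offset ("20150630T22-07").
parse_eia_hourly <- function(stamps) {
  is_utc <- grepl("Z$", stamps)
  if (all(is_utc)) {
    return(as.POSIXct(stamps, format = "%Y%m%dT%HZ", tz = "UTC"))
  }
  # Mixed UTC/local input is not handled in this sketch.
  stopifnot(!any(is_utc))

  # Assumed layout: the stamp ends in a signed two-digit hour offset.
  offset_hours <- as.integer(sub("^.*T[0-9]{2}([+-][0-9]{2})$", "\\1", stamps))

  # Read the wall-clock part as if it were UTC, then subtract the offset
  # to recover the true UTC instant.
  wall_clock <- as.POSIXct(substr(stamps, 1, 11), format = "%Y%m%dT%H", tz = "UTC")
  wall_clock - offset_hours * 3600
}

utc_stamp   <- parse_eia_hourly("20150701T05Z")
local_stamp <- parse_eia_hourly("20150630T22-07")
identical(as.numeric(utc_stamp), as.numeric(local_stamp))
```

Under these assumptions both stamps resolve to the same instant, so a series parsed this way could then be displayed in any zone by setting the time zone on the resulting xts index.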

If you get a chance, you can test the changes with:

library(githubinstall)
gh_install_packages("EIAdata", ref = "issue_11_tz_fix")

jlprol commented 4 years ago

Thanks again!

Matt-Brigida commented 4 years ago

I merged the issue_11_tz_fix branch into master, and it is included in release v0.1.0.

Thanks for identifying the issue! Have a great day.