systemcatch / eiapy

A simple wrapper for the U.S. Energy Information Administration API
https://pypi.org/project/eiapy/
MIT License

Handle API ignoring badly formatted time limits #16

Open systemcatch opened 4 years ago

systemcatch commented 4 years ago

As shown in the example below, the EIA API expects timestamps in ISO 8601 format (YYYYMMDDTHHZ), with Z meaning UTC.

However, tmrowco/electricitymap-contrib/pull/2188 showed a problem: if a wrong or even complete garbage time limit is passed, the API silently ignores it and just returns the last n results, which is bad design.

(eiapy) chris@ThinkPad:~/eiapy$ python
Python 3.7.4 (default, Sep 19 2019, 11:01:37) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from eiapy import Series
>>> cal_to_mex = Series('EBA.CISO-CFE.ID.H')
>>> cal_to_mex.last_from(5, "20180401T07Z")
{'request': {'command': 'series', 'series_id': 'EBA.CISO-CFE.ID.H'}, 'series': [{'series_id': 'EBA.CISO-CFE.ID.H', 'name': 'Actual Net Interchange for California Independent System Operator (CISO) to Comision Federal de Electricidad (CFE), hourly - UTC time', 'units': 'megawatthours', 'f': 'H', 'description': 'Timestamps follow the ISO8601 standard (https://en.wikipedia.org/wiki/ISO_8601). Hourly representations are provided in Universal Time.', 'start': '20150701T08Z', 'end': '20200128T08Z', 'updated': '2020-01-29T09:06:34-0500', 'data': [['20180401T07Z', -11], ['20180401T06Z', -16], ['20180401T05Z', -11], ['20180401T04Z', -7], ['20180401T03Z', -5]]}]}
>>> cal_to_mex.last_from(5, "sefwesfewf")
{'request': {'command': 'series', 'series_id': 'EBA.CISO-CFE.ID.H'}, 'series': [{'series_id': 'EBA.CISO-CFE.ID.H', 'name': 'Actual Net Interchange for California Independent System Operator (CISO) to Comision Federal de Electricidad (CFE), hourly - UTC time', 'units': 'megawatthours', 'f': 'H', 'description': 'Timestamps follow the ISO8601 standard (https://en.wikipedia.org/wiki/ISO_8601). Hourly representations are provided in Universal Time.', 'start': '20150701T08Z', 'end': '20200128T08Z', 'updated': '2020-01-29T09:06:34-0500', 'data': [['20200128T08Z', 12], ['20200128T07Z', 20], ['20200128T06Z', 26], ['20200128T05Z', 36], ['20200128T04Z', 174]]}]}

I'm trying to figure out a way to avoid the above, as it's surprising behaviour imo. At the bare minimum eiapy needs better documentation.

systemcatch commented 4 years ago

A simple idea would be to have a timestamp regex to enforce some kind of check on what is passed. I have to be sure that all data sets use that format though.
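
A rough sketch of what that check could look like, covering only the hourly UTC format shown above (YYYYMMDDTHHZ). The names `HOURLY_UTC` and `check_hourly_timestamp` are hypothetical and not part of eiapy; series with other frequencies would need their own patterns, which is the open question mentioned above.

```python
import re
from datetime import datetime

# Hypothetical pattern for the hourly UTC format only, e.g. "20180401T07Z".
# Other EIA frequencies are NOT handled here.
HOURLY_UTC = re.compile(r"[0-9]{8}T[0-9]{2}Z")


def check_hourly_timestamp(value):
    """Return value unchanged if it is a valid hourly UTC timestamp.

    Raises ValueError otherwise, so a bad limit fails loudly instead of
    being sent to the API and silently ignored.
    """
    if not isinstance(value, str) or not HOURLY_UTC.fullmatch(value):
        raise ValueError(
            f"expected a timestamp like '20180401T07Z', got {value!r}"
        )
    # The regex only checks the shape; strptime also rejects impossible
    # values such as month 13 or hour 25 (it raises ValueError itself).
    datetime.strptime(value, "%Y%m%dT%HZ")
    return value
```

Calling something like this at the top of `last_from` would turn `cal_to_mex.last_from(5, "sefwesfewf")` into a `ValueError` rather than a response containing the most recent data.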