hydrosquall / tiingo-python

Python client for interacting with the Tiingo Financial Data API (stock ticker and news data)
https://pypi.org/project/tiingo/
MIT License

Intraday code gets confused by daylight savings timezone cutover #693

Open thirtythreeforty opened 3 years ago

thirtythreeforty commented 3 years ago

Description

I was trying to download IEX intraday data by walking backward through dates.

What I Did

In [111]: q = c.get_dataframe('AAPL', endDate='2021-04-09T12:16:00.000000000', startDate='2021-02-08 12:16:00', fmt='csv', frequency='1Min')
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-111-384784fbb5a0> in <module>
----> 1 q = c.get_dataframe('AAPL', endDate='2021-04-09T12:16:00.000000000', startDate='2021-02-08 12:16:00', fmt='csv', frequency='1Min')

~/Downloads/ib-historical-data/.venv/lib/python3.9/site-packages/tiingo/api.py in get_dataframe(self, tickers, startDate, endDate, metric_name, frequency, fmt)
    292         if pandas_is_installed:
    293             if type(tickers) is str:
--> 294                 prices = self._request_pandas(
    295                     ticker=tickers, params=params, metric_name=metric_name)
    296             else:

~/Downloads/ib-historical-data/.venv/lib/python3.9/site-packages/tiingo/api.py in _request_pandas(self, ticker, metric_name, params)
    203         # Localize to UTC to ensure equivalence between data returned in json format and
    204         # csv format. Tiingo daily data requested in csv format does not include a timezone.
--> 205         if prices.index.tz is None:
    206             prices.index = prices.index.tz_localize('UTC')
    207 

AttributeError: 'Index' object has no attribute 'tz'

This happens because when the code tries to convert the index with to_datetime, pandas returns a plain Index instead of a DatetimeIndex:

https://github.com/hydrosquall/tiingo-python/blob/6b032a0219290e1c37878687f2316c6c4c6cdb10/tiingo/api.py#L212-L217

...presumably because it can't assign a single UTC offset to the data, which spans the DST cutover. Sample:

2021-03-12 15:55:00-05:00,120.78,121.085,120.78,120.94,22801.0
2021-03-12 16:00:00-05:00,120.94,120.94,120.94,120.94,0.0
2021-03-15 09:30:00-04:00,121.4,121.49,120.425,120.9,39581.0
2021-03-15 09:35:00-04:00,120.885,121.33,120.83,120.96,23986.0
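
A minimal sketch of one way around the mixed offsets, assuming only that pandas is installed: passing `utc=True` to `pd.to_datetime` normalizes every timestamp to UTC, so the result is a timezone-aware DatetimeIndex with a `tz` attribute. The timestamps are the ones from the sample above; the `America/New_York` conversion is an assumption about the exchange's local time, and this is not necessarily how the library should fix the issue.

```python
import pandas as pd

# Timestamps copied from the sample rows; the UTC offset changes
# from -05:00 to -04:00 across the DST cutover.
stamps = [
    "2021-03-12 15:55:00-05:00",
    "2021-03-12 16:00:00-05:00",
    "2021-03-15 09:30:00-04:00",
    "2021-03-15 09:35:00-04:00",
]

# utc=True converts each value to UTC, so the mixed offsets no longer
# prevent pandas from building a single tz-aware DatetimeIndex.
idx = pd.to_datetime(stamps, utc=True)
print(idx.tz)

# Optionally convert back to exchange-local time (assumed here to be
# America/New_York); the DST change is handled by the zone rules.
local = idx.tz_convert("America/New_York")
print(local)
```

With a tz-aware index, the `prices.index.tz is None` check in the traceback would no longer raise an AttributeError.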

hydrosquall commented 3 years ago

Hi @thirtythreeforty, thanks for the detailed report.

In the short term, I would advise using the non-dataframe API to download the raw data and parsing the timestamps into datetimes with your own preferred logic, since that API is simpler and doesn't attempt any type-related data coercion.

client.get_ticker_price('AAPL', ...)
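
As a rough sketch of that approach, the raw rows could be turned into a dataframe by hand. This assumes the raw result is a list of dicts with a `date` key holding an ISO 8601 timestamp; `records_to_frame` is a hypothetical helper, not part of tiingo-python, and the sample rows and `close` values are illustrative only.

```python
import pandas as pd


def records_to_frame(records, date_key="date"):
    """Build a UTC-indexed DataFrame from a list of price dicts.

    Hypothetical helper: assumes each dict has a `date_key` entry
    containing an ISO 8601 timestamp string.
    """
    frame = pd.DataFrame.from_records(records)
    # utc=True normalizes mixed UTC offsets (e.g. across a DST cutover).
    frame[date_key] = pd.to_datetime(frame[date_key], utc=True)
    return frame.set_index(date_key).sort_index()


# Illustrative rows standing in for the raw API response.
sample = [
    {"date": "2021-03-12T15:55:00-05:00", "close": 120.94},
    {"date": "2021-03-15T09:30:00-04:00", "close": 120.9},
]
frame = records_to_frame(sample)
print(frame)
```

In practice the `sample` list would be replaced by the rows returned from `client.get_ticker_price(...)`, provided they have the shape assumed here.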

cavart28 commented 1 year ago

Possibly related is the following behavior, which was unexpected to me. When querying a 2-minute interval, I would expect 2 (or 3) rows, but for some reason I get a dataframe with 10 rows, starting at the correct date/time but ending at 15:59.


start = '2023-03-15 15:50:00'
end = '2023-03-15 15:52:00'

df = tiingo_client.get_dataframe(
    tickers='QQQ', frequency='1min', startDate=start, endDate=end
)

df
[image: screenshot of the resulting dataframe]
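
Until the cause is understood, one possible client-side workaround is to trim the returned dataframe to the requested window. The sketch below uses a synthetic 10-row frame in place of the real response. The UTC timezone of the index, the `close` column, and its values are all assumptions made for illustration; the real data may use a different timezone.

```python
import pandas as pd

# Synthetic stand-in for the 10 rows described above (15:50 to 15:59).
# The UTC timezone is an assumption, not taken from the actual response.
index = pd.date_range("2023-03-15 15:50:00", periods=10, freq="min", tz="UTC")
df = pd.DataFrame({"close": range(10)}, index=index)

# Use tz-aware bounds that match the index timezone; label-based
# slicing with .loc includes both endpoints.
start = pd.Timestamp("2023-03-15 15:50:00", tz="UTC")
end = pd.Timestamp("2023-03-15 15:52:00", tz="UTC")
trimmed = df.loc[start:end]
print(trimmed)
```

Because both endpoints are included, a 2-minute window at 1-minute frequency yields 3 rows here.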