dpguthrie / yahooquery

Python wrapper for an unofficial Yahoo Finance API
https://yahooquery.dpguthrie.com
MIT License
754 stars 133 forks

Unable to retrieve historical data due to unexpected keyword argument 'tz' #169

Open rinkef opened 1 year ago

rinkef commented 1 year ago

Hi, when trying to get historical data using Ticker I get the following error:

fromtimestamp() got an unexpected keyword argument 'tz'

Something seems to be going wrong with the timestamp in the pandas module. Upgrading the pandas module and the yahooquery module does not help.

Hence I can't extract historical data.

I am using Python 3.9

My very simple script

from yahooquery import Ticker
aapl = Ticker('AAPL')
data = aapl.history()
print(data.head())

What am I missing here?

Full error:

Traceback (most recent call last):

  File ~\Miniconda3\lib\site-packages\spyder_kernels\py3compat.py:356 in compat_exec
    exec(code, globals, locals)

  File c:\data\python\git\personal-finance\yahooquery_test.py:14
    data = aapl.history()

  File ~\Miniconda3\lib\site-packages\yahooquery\ticker.py:1291 in history
    for i in range(len(dates) - 1):

  File ~\Miniconda3\lib\site-packages\yahooquery\ticker.py:1320 in _historical_data_to_dataframe
    df["splits"].fillna(0, inplace=True)

  File ~\Miniconda3\lib\site-packages\yahooquery\utils\__init__.py:198 in _history_dataframe
    tz = data["meta"]["exchangeTimezoneName"]

  File ~\Miniconda3\lib\site-packages\yahooquery\utils\__init__.py:125 in _get_daily_index
    last_trade = pd.Timestamp.fromtimestamp(timestamp)

  File pandas\_libs\tslibs\timestamps.pyx:1129 in pandas._libs.tslibs.timestamps.Timestamp.fromtimestamp

TypeError: fromtimestamp() got an unexpected keyword argument 'tz'

maread99 commented 1 year ago

Hi @rinkef. I'm unable to reproduce the error. Could you execute the following and post the print. Thanks.

import yahooquery
import pandas
print(yahooquery.__version__)
print(pandas.__version__)
rinkef commented 1 year ago

Hi @maread99, sure:

yahooquery: 2.3.0
pandas: 1.5.3

maread99 commented 1 year ago

All seems a bit odd. The traceback seems to be out of sync with the actual code, and 'tz' is a keyword argument of Timestamp.fromtimestamp.

Could you install yahooquery 2.3.1 to a new environment via pip and let me know if it's still raising?

RudyNL commented 1 year ago

Same problem here.

yahooquery: 2.3.1
pandas: 1.3.5

>>> import yahooquery
>>> aapl = yahooquery.Ticker('AAPL')
>>> data = aapl.history()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/rudy/.local/lib/python3.10/site-packages/yahooquery/ticker.py", line 1291, in history
    df = self._historical_data_to_dataframe(data, params, adj_timezone)
  File "/home/rudy/.local/lib/python3.10/site-packages/yahooquery/ticker.py", line 1320, in _historical_data_to_dataframe
    d[symbol] = _history_dataframe(data[symbol], daily, adj_timezone)
  File "/home/rudy/.local/lib/python3.10/site-packages/yahooquery/utils/__init__.py", line 198, in _history_dataframe
    index = _get_daily_index(data, index, adj_timezone)
  File "/home/rudy/.local/lib/python3.10/site-packages/yahooquery/utils/__init__.py", line 125, in _get_daily_index
    last_trade = pd.Timestamp.fromtimestamp(timestamp, tz="UTC")
  File "pandas/_libs/tslibs/timestamps.pyx", line 1129, in pandas._libs.tslibs.timestamps.Timestamp.fromtimestamp
TypeError: fromtimestamp() got an unexpected keyword argument 'tz'

Going back to yahooquery 2.3.0 solves the problem. Upgrading pandas to 1.5.3 in combination with yahooquery 2.3.1 doesn't solve the problem.

maread99 commented 1 year ago

Hi @RudyNL, that's not great. I still can't reproduce this error (it's working ok for me). I'm struggling to see how it could be raised with pandas 1.5.3 and yahooquery 2.3.1.

This PR, included in pandas 1.4, changed the implementation of fromtimestamp to accept the 'tz' option, while at the same time deprecating the utcfromtimestamp method, which did something similar.

It's working with pandas 1.3.5 / yahooquery 2.3.0 because passing the 'tz' option was only introduced to yahooquery in 2.3.1 (the 2.3.1 release addressed some bugs in 2.3.0). The error will be raised with pandas 1.3.5 / yahooquery 2.3.1, although there shouldn't be a problem with pandas 1.5.3 / yahooquery 2.3.1.
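The version dependency described above can be checked up front. The following is only a sketch: `supports_tz_keyword` is a hypothetical helper (not part of yahooquery or pandas), and it assumes, as stated in this comment, that the 'tz' keyword of Timestamp.fromtimestamp is available from pandas 1.4 onwards.

```python
import re


def supports_tz_keyword(pandas_version: str) -> bool:
    """Return True if the given pandas version is 1.4 or later.

    Assumption (taken from the discussion above): Timestamp.fromtimestamp
    accepts the 'tz' keyword from pandas 1.4 onwards.
    """
    # Keep only the leading "major.minor" part, so suffixes such as
    # "rc0" or ".dev0" on later components do not break the parse.
    match = re.match(r"(\d+)\.(\d+)", pandas_version)
    if match is None:
        raise ValueError(f"unrecognised version string: {pandas_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (1, 4)


# Versions reported in this thread.
for version in ("1.3.0", "1.3.5", "1.5.3", "2.0.0"):
    print(version, supports_tz_keyword(version))
```

With pandas imported as pd, `supports_tz_keyword(pd.__version__)` would give the answer for the active environment.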

Could anyone with this problem please run the following and post the first 7 lines of the print...

import yahooquery as yq
import pandas as pd
print(yq.__version__)
print(pd.__version__)
help(pd.Timestamp.fromtimestamp)

and also the output from:

pd.Timestamp.fromtimestamp(1584199972, tz="UTC")

Thanks
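Because the first traceback looked out of sync with the installed code, it may also help to confirm which interpreter is running and which distributions it can see. A minimal sketch using only the standard library (Python 3.8+); `installed_version` is a hypothetical helper name, not something provided by yahooquery:

```python
import sys
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Show which interpreter is running and which versions it can see.
print("interpreter:", sys.executable)
for name in ("yahooquery", "pandas"):
    print(name, installed_version(name))
```

This reads the installed package metadata rather than importing the packages, so it reports what is on disk for this interpreter. A module that was imported before an upgrade in a long-running session (for example an IDE console) could still differ from it.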

RudyNL commented 1 year ago

I cannot reproduce the error anymore with yahooquery 2.3.1 and pandas 1.5.3.

>>> import yahooquery as yq
>>> import pandas as pd
>>> print(yq.__version__)
2.3.1
>>> print(pd.__version__)
1.5.3
>>> pd.Timestamp.fromtimestamp(1584199972, tz="UTC")
Timestamp('2020-03-14 15:32:52+0000', tz='UTC')

This could mean that I made an error, but also that the Yahoo input changed. I am running a larger program which still produces errors:

FutureWarning: Comparison of Timestamp with datetime.date is deprecated in order to match the standard library behavior. In a future version these will be considered non-comparable. Use 'ts == pd.Timestamp(date)' or 'ts.date() == date' instead.

and

ValueError: Tz-aware datetime.datetime cannot be converted to datetime64 unless utc=True

I will have a look at it tomorrow.

RudyNL commented 1 year ago

I am trying to reproduce yesterday's errors. An unchanged program with the same library versions is now NOT producing errors. It looks like the Yahoo input data is influencing the errors. I am now doing large-scale testing (retrieving data for 900 stocks) to reproduce the tz error.

RudyNL commented 1 year ago

I found a problem in the Yahoo data. Just run:

from yahooquery import Ticker
aapl = Ticker('AAPL')
data = aapl.history()

The last line of the data has timezone information

       2023-03-20                 155.070007  157.820007  154.149994  157.399994   73641400  157.399994       0.00
       2023-03-21                 157.320007  159.399994  156.539993  159.279999   73938300  159.279999       0.00
       2023-03-22                 159.300003  162.139999  157.809998  157.830002   75701800  157.830002       0.00
       2023-03-23 16:00:04-04:00  158.830002  161.550003  157.679993  158.929993   65800270  158.929993       0.00

It's obvious that some time after the stock exchange closes, Yahoo will remove the timezone information. This will cause all kinds of problems in further processing.

maread99 commented 1 year ago

This is the intended behavior, as described by the 'Returns' section of the method doc.

During a trading session the 'live indice' indicates the time of the last trade. Indexing in this manner prevents ambiguities as to whether this last indice represents the price of a closed or open session. If you look back through the issues you'll see that how to treat this indice has been a matter of past debate.

At some point after the close the last indice will revert to a date (exactly when depends on Yahoo).

If you're after post-processing, you might be interested in market_prices. It sits on top of yahooquery and provides for enhanced querying and post-processing of price data.

rinkef commented 1 year ago

@maread99 @RudyNL I tried once more (without changing anything) and now it does work, no errors anymore. Only thing that changed was that I rebooted and I changed my internet connection from using VPN to not using VPN anymore. Not sure if that could have had an influence in solving the problem.

RudyNL commented 1 year ago

@rinkef

without changing anything

Are you sure? Rebooting is often combined with upgrading. So maybe your system has been upgraded. My problems are time dependent. The time-zone information provided by Yahoo is time-dependent. By post-processing I am removing timezone information.
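One way to do that kind of post-processing is to reduce every index value to a plain date. This is only a sketch of the idea, not the code I actually run: `to_plain_date` is a hypothetical helper, and the list below stands in for the tail of the index shown earlier, using a fixed UTC-4 offset in place of the exchange timezone.

```python
from datetime import date, datetime, timedelta, timezone


def to_plain_date(value):
    """Reduce a date or datetime (naive or tz-aware) to a plain date.

    pandas.Timestamp subclasses datetime.datetime, so values taken from
    a pandas index should also take the first branch.
    """
    # datetime must be tested first: every datetime is also a date.
    if isinstance(value, datetime):
        return value.date()
    if isinstance(value, date):
        return value
    raise TypeError(f"unsupported index value: {value!r}")


# Two plain dates followed by a tz-aware "last trade" timestamp.
utc_minus_4 = timezone(timedelta(hours=-4))
mixed = [
    date(2023, 3, 21),
    date(2023, 3, 22),
    datetime(2023, 3, 23, 16, 0, 4, tzinfo=utc_minus_4),
]
print([to_plain_date(v) for v in mixed])
```

Note that this throws away the time of the last trade, which is exactly the information the live index entry is meant to carry, so it only makes sense when the distinction between an open and a closed session does not matter.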

rdstock-er commented 1 year ago

For what it's worth, I tried using yahooquery for the first time today (fresh install of version 2.3.1) and had the exact same error: fromtimestamp() got an unexpected keyword argument 'tz'. I was on pandas 1.3.0, so I upgraded to pandas 2.0.0 and the problem disappeared.

maread99 commented 1 year ago

To summarize,

Installing to a new virtual environment (created with venv) will ensure the latest versions of dependencies are installed.