ranaroussi / yfinance

Download market data from Yahoo! Finance's API
https://aroussi.com/post/python-yahoo-finance
Apache License 2.0
13.27k stars 2.34k forks

'Adj Close' missing in both 0.1.96 and 0.2.4 #1333

Open RudyNL opened 1 year ago

RudyNL commented 1 year ago

If you look at "Historical Data" on the Yahoo site, you see the following columns:

Date | Open | High | Low | Close* | Adj Close** | Volume

In my code it was always possible to refer to the 'Adj Close' column. Somehow this column got lost.

>>> import yfinance as yf
>>> yf.Ticker("MSFT").history().keys()
Index(['Open', 'High', 'Low', 'Close', 'Volume', 'Dividends', 'Stock Splits'], dtype='object')

This problem can be avoided by using

>>> from pandas_datareader import data as pdr
>>> yf.pdr_override()
>>> pdr.get_data_yahoo(["MSFT"]).keys()
[*********************100%***********************]  1 of 1 completed
Index(['Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume'], dtype='object')

RudyNL commented 1 year ago

After inspecting the code, I understand what is going on:

>>> import yfinance as yf
>>> yf.Ticker("TMO").history(period='max').tz_localize(None).T['2022-12-13']
Open            5.765372e+02
High            5.844030e+02
Low             5.664524e+02
Close           5.710800e+02
Volume          1.867300e+06
Dividends       0.000000e+00
Stock Splits    0.000000e+00
Name: 2022-12-13 00:00:00, dtype: float64

And compare this with:

>>> from pandas_datareader import data as pdr
>>> yf.pdr_override()
>>> pdr.get_data_yahoo(["TMO"],period='max').tz_localize(None).T['2022-12-13']
[*********************100%***********************]  1 of 1 completed
Open         5.768400e+02
High         5.847100e+02
Low          5.667500e+02
Close        5.713800e+02
Adj Close    5.710800e+02
Volume       1.867300e+06
Name: 2022-12-13 00:00:00, dtype: float64

Notice that the value of Close in the first example is equal to the value of Adj Close in the second example. This is an undocumented feature. The result of "history" is in fact Adj Open, Adj High, Adj Low and Adj Close, but the columns are called Open, High, Low and Close. It's quite confusing that yfinance uses misleading names and that the results do not match those on the Yahoo site.

It would be preferable to have two functions, "history" and "adj_history", instead. "history" would deliver the correct Open, High, Low, Close, Dividends and Stock Splits, and "adj_history" would deliver Adj Open, Adj High, Adj Low, Adj Close, Dividends and Stock Splits.

ValueRaider commented 1 year ago

Documentation is the weak point of yfinance compared to the big modules, but I should point out that the docstring in the source code explains this.

I also agree that, when adjusted, yfinance should add "Adj " prefixes. The problem is that doing this today would break a lot of users' code; it would have been easier 5 years ago at the start. I propose that the best solution is better documentation.

RudyNL commented 1 year ago

I am just realizing that the dividend should also be adjusted. If not, the dividend in the history is "increasing" and can even become larger than the stock price. So I would prefer "history" giving the result Dividend and "adj_history" giving the result Adj Dividend. In both cases the dividend percentage, Dividend/Close or Adj Dividend/Adj Close, remains equal.
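The claim that the dividend percentage stays the same can be checked with a little arithmetic. The sketch below assumes that the dividend is scaled by the same factor as the close (Adj Close / Close). The numbers and the helper name `adjustment_factor` are hypothetical and chosen only for illustration; they are not taken from yfinance or from Yahoo.

```python
import math


def adjustment_factor(close: float, adj_close: float) -> float:
    """Return the ratio that maps a non adjusted value to its adjusted value."""
    return adj_close / close


# Hypothetical values for one historical row (not real market data).
close = 100.0      # non adjusted close
adj_close = 80.0   # adjusted close for the same day
dividend = 2.0     # dividend as actually paid

factor = adjustment_factor(close, adj_close)
adj_dividend = dividend * factor  # assumption: same factor as for the close

raw_yield = dividend / close
adj_yield = adj_dividend / adj_close

print(f"factor={factor:.4f} adj_dividend={adj_dividend:.4f}")
print(f"raw_yield={raw_yield:.4%} adj_yield={adj_yield:.4%}")
print("yields match:", math.isclose(raw_yield, adj_yield))
```

Because numerator and denominator are multiplied by the same factor, the factor cancels in the ratio, which is why the two percentages agree.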

RudyNL commented 1 year ago

ValueRaider, you are wrong. Correcting this by better documentation won't work.

  1. People do not read the documentation, and if they do, they read it sloppily and inaccurately.
  2. Intuitive software is always to be preferred over software that violates your intuition.
  3. People check the results against the Yahoo site. Deviations from the Yahoo site will result in issues being reported.
  4. Depending on the application, you need either the corrected or the non-corrected set of data. So preferably both should be provided.
  5. The deviations are so small that until now nobody noticed this problem. So I doubt that it would break any users' code.

I stick to my opinion. "history" should produce a table [ Open, Low, High, Close, Dividend, Stock Splits ] with the correct values and "adj_history" should produce a table [ Adj Open, Adj Low, Adj High, Adj Close, Adj Dividend, Stock Splits ]

ValueRaider commented 1 year ago

Can't blame users for not reading documentation that doesn't exist. Other than the docstring in the source code, how history handles adjustment was not documented. This isn't the first time that has caused an issue to be raised, e.g. #798.

I thought code would break because history()["Close"] would raise a KeyError, but I misunderstood your proposal.

Maybe implement your idea and ask users to test it and give feedback.

RudyNL commented 1 year ago

With the current software all needed values can be retrieved or computed. The default call is:

>>> import yfinance as yf
>>> msft = yf.Ticker("MSFT")
>>> msft.history(auto_adjust=True)
                                 Open        High         Low       Close    Volume  Dividends  Stock Splits
Date                                                                                                        
2022-12-28 00:00:00-05:00  236.889999  239.720001  234.169998  234.529999  17457100        0.0           0.0
2022-12-29 00:00:00-05:00  235.649994  241.919998  235.649994  241.009995  19770700        0.0           0.0
2022-12-30 00:00:00-05:00  238.210007  239.960007  236.660004  239.820007  21930800        0.0           0.0

It produces:

  Open: the adjusted opening of the stock
  High: the adjusted highest value of the stock
  Low: the adjusted lowest value of the stock
  Close: the adjusted closing of the stock
  Dividends: the non-adjusted or paid dividends of the stock

Notice the highly misleading headers of the columns.

The alternative call is:

>>> import yfinance as yf
>>> msft = yf.Ticker("MSFT")
>>> msft.history(auto_adjust=False)
                                 Open        High         Low       Close   Adj Close    Volume  Dividends  Stock Splits
Date                                                                                                                    
2022-12-28 00:00:00-05:00  236.889999  239.720001  234.169998  234.529999  234.529999  17457100        0.0           0.0
2022-12-29 00:00:00-05:00  235.649994  241.919998  235.649994  241.009995  241.009995  19770700        0.0           0.0
2022-12-30 00:00:00-05:00  238.210007  239.960007  236.660004  239.820007  239.820007  21930800        0.0           0.0

It produces:

  Open: the non-adjusted opening of the stock
  High: the non-adjusted highest value of the stock
  Low: the non-adjusted lowest value of the stock
  Close: the non-adjusted closing of the stock
  Adj Close: the adjusted closing of the stock
  Dividends: the non-adjusted or paid dividends of the stock

These are the values which are shown at the Yahoo site. Any adjusted value can be computed by:

Adj Open = Open * Adj Close / Close
Adj Dividends = Dividends * Adj Close / Close
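As a minimal sketch, the formulas above can be applied to the TMO row for 2022-12-13 quoted earlier in this thread. The function name `adjusted_columns` is hypothetical and is not part of yfinance. It is written against a plain dict so that it runs without network access. Since it only uses indexing, multiplication and division, it should in principle also accept the DataFrame returned by `history(auto_adjust=False)`, but that is an assumption and has not been verified here.

```python
DEFAULT_COLUMNS = ("Open", "High", "Low", "Close", "Dividends")


def adjusted_columns(data, columns=DEFAULT_COLUMNS):
    """Apply 'Adj X = X * Adj Close / Close' to each requested column.

    `data` is any mapping that has the keys 'Close' and 'Adj Close'
    plus every name listed in `columns`.
    """
    factor = data["Adj Close"] / data["Close"]
    return {"Adj " + name: data[name] * factor for name in columns}


# Non adjusted TMO values for 2022-12-13, copied from the output
# shown earlier in this thread (rounded as displayed there).
tmo_row = {
    "Open": 576.84,
    "High": 584.71,
    "Low": 566.75,
    "Close": 571.38,
    "Adj Close": 571.08,
    "Dividends": 0.0,
}

adjusted = adjusted_columns(tmo_row)
for name, value in adjusted.items():
    print(f"{name}: {value:.4f}")
```

Up to the rounding of the displayed inputs, the computed Adj Open, Adj High and Adj Low should agree with the Open, High and Low values that the first TMO example in this thread returned from `history(period='max')`.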