Closed H-Ali13381 closed 8 hours ago
Simple fix
A quick fix I found is to rename the columns:
data.index.name = 'Date'
data.columns = ['Adj Close', 'Close', 'High', 'Low', 'Open', 'Volume']
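A slightly more defensive variant of the rename above, as a minimal sketch only. It drops the ticker level instead of hardcoding the column names, and only when a single ticker is present. The helper name flatten_single_ticker is hypothetical (not part of yfinance), and the frame below is a synthetic stand-in built from the values posted in this thread, so no network call is made.

```python
import pandas as pd


def flatten_single_ticker(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a copy with plain column names.

    Leaves the frame alone if the columns are not a two-level MultiIndex
    or if more than one ticker is present (names would collide).
    """
    out = df.copy()
    if isinstance(out.columns, pd.MultiIndex) and out.columns.nlevels == 2:
        if out.columns.get_level_values(1).nunique() == 1:
            # Keep only the first level ('Adj Close', 'Close', ...).
            out.columns = out.columns.get_level_values(0).tolist()
    out.index.name = "Date"
    return out


# Synthetic stand-in for the two-level header reported in this issue.
columns = pd.MultiIndex.from_product(
    [["Adj Close", "Close", "High", "Low", "Open", "Volume"], ["SPY"]],
    names=["Price", "Ticker"],
)
index = pd.to_datetime(["1993-01-29", "1993-02-01"], utc=True)
new_style = pd.DataFrame(
    [
        [24.608625, 43.9375, 43.96875, 43.75, 43.96875, 1003200],
        [24.783661, 44.25, 44.25, 43.96875, 43.96875, 480500],
    ],
    index=index,
    columns=columns,
)

flat = flatten_single_ticker(new_style)
print(list(flat.columns))
```

Because the helper checks for a MultiIndex first, it should be a no-op on the older single-level output, apart from setting the index name.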
I can't reproduce integer index - is that the problem?
>>> yf.download('SPY', session=session)
[*********************100%***********************] 1 of 1 completed
Price Adj Close Close High Low Open Volume
Ticker SPY SPY SPY SPY SPY SPY
Date
1993-01-29 00:00:00+00:00 24.608625 43.937500 43.968750 43.750000 43.968750 1003200
...
Integer index wasn't the problem for me; it was the ticker index. His PR solves the issue.
If you are only fetching one symbol, why not use yf.Ticker('SPY').history()? @antoniouaa @cyrom0
The output changed between 0.2.47 and 0.2.48. I have downgraded to 0.2.47 and the output looks good. This is the output of 0.2.47:
data = yf.download('SPY')
[*********************100%***********************] 1 of 1 completed
data.head()
Adj Close Close High Low Open Volume
Date
1993-01-29 00:00:00+00:00 24.608624 43.93750 43.96875 43.75000 43.96875 1003200
1993-02-01 00:00:00+00:00 24.783661 44.25000 44.25000 43.96875 43.96875 480500
1993-02-02 00:00:00+00:00 24.836142 44.34375 44.37500 44.12500 44.21875 201300
1993-02-03 00:00:00+00:00 25.098690 44.81250 44.84375 44.37500 44.40625 529400
1993-02-04 00:00:00+00:00 25.203718 45.00000 45.09375 44.46875 44.96875 531500
As you can see, the header/index in 0.2.48 is different:
Price Adj Close Close High Low Open Volume
Ticker SPY SPY SPY SPY SPY SPY
Date
If you are only fetching one symbol, why not use yf.Ticker('SPY').history()? @antoniouaa @cyrom0
I wasn't aware that I could do the same using yf.Ticker('SPY').history(). I upgraded to 0.2.49 and see that data = yf.Ticker('SPY').history(period="max", interval='1d', actions=False, auto_adjust=False) followed by data.index = pd.to_datetime(data.index, utc=True) gives output equivalent to yf.download('SPY') for my purposes.
Tracing the code a bit, and based on my limited understanding, yf.download("SPY") is a wrapper that handles multiple tickers, so yf.Ticker('SPY').history() looks more efficient for the single-ticker case. I will modify my code to use yf.Ticker('SPY').history(). Basically, yf.download('SPY') does some additional processing compared to yf.Ticker('SPY').history(), such as pd.to_datetime(data.index, utc=True). As far as I can tell, the index returned by yf.Ticker('SPY').history() is not a UTC datetime index unless I call pd.to_datetime(data.index, utc=True) on it.
Thanks for your help.
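For anyone comparing the two indexes, here is a small sketch of what the conversion does. It assumes the history() index is tz-aware in exchange-local time, which is what the debug log in this issue suggests ("1993-01-29 00:00:00-05:00"). The frame is synthetic and uses only values posted in this thread. Note that pd.to_datetime(..., utc=True) keeps the same instant, so midnight New York time becomes 05:00 UTC rather than the 00:00:00+00:00 stamps shown in the download() output above. The second variant, which drops the offset before labelling the stamps as UTC, is only one possible way to get midnight UTC stamps, not necessarily what yf.download() does internally.

```python
import pandas as pd

# Synthetic stand-in for a history() index: midnight stamps in exchange-local time.
local_index = pd.DatetimeIndex(["1993-01-29", "1993-02-01"], name="Date").tz_localize(
    "America/New_York"
)
data = pd.DataFrame({"Close": [43.9375, 44.25]}, index=local_index)

# The conversion quoted in the comment above: same instants, expressed in UTC.
as_utc = data.copy()
as_utc.index = pd.to_datetime(as_utc.index, utc=True)

# Alternative (an assumption, for illustration): keep the calendar date,
# drop the offset, then label the naive stamps as UTC.
as_utc_midnight = data.copy()
as_utc_midnight.index = data.index.tz_localize(None).tz_localize("UTC")

print(as_utc.index[0])           # 1993-01-29 05:00:00+00:00
print(as_utc_midnight.index[0])  # 1993-01-29 00:00:00+00:00
```

Which variant is appropriate depends on whether downstream code joins on exact instants or on calendar dates.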
Describe bug
yf.download() returning incorrect index
Simple code that reproduces your problem
import yfinance as yf
data = yf.download('SPY')
data.head()
Debug log
DEBUG Entering download()
DEBUG Disabling multithreading because DEBUG logging enabled
DEBUG Entering history()
DEBUG Entering history()
DEBUG SPY: Yahoo GET parameters: {'period1': '1925-11-20 12:49:42-05:00', 'period2': '2024-10-26 13:49:42-04:00', 'interval': '1d', 'includePrePost': False, 'events': 'div,splits,capitalGains'}
DEBUG Entering get()
DEBUG Entering _make_request()
DEBUG url=https://query2.finance.yahoo.com/v8/finance/chart/SPY
DEBUG params={'period1': -1392099018, 'period2': 1729964982, 'interval': '1d', 'includePrePost': False, 'events': 'div,splits,capitalGains'}
DEBUG Entering _get_cookie_and_crumb()
DEBUG cookie_mode = 'basic'
DEBUG Entering _get_cookie_and_crumb_basic()
DEBUG reusing cookie
DEBUG reusing crumb
DEBUG Exiting _get_cookie_and_crumb_basic()
DEBUG Exiting _get_cookie_and_crumb()
DEBUG response code=200
DEBUG Exiting _make_request()
DEBUG Exiting get()
DEBUG SPY: yfinance received OHLC data: 1993-01-29 14:30:00 -> 2024-10-25 13:30:00
DEBUG SPY: OHLC after cleaning: 1993-01-29 09:30:00-05:00 -> 2024-10-25 09:30:00-04:00
DEBUG SPY: OHLC after combining events: 1993-01-29 00:00:00-05:00 -> 2024-10-25 00:00:00-04:00
DEBUG SPY: yfinance returning OHLC: 1993-01-29 00:00:00-05:00 -> 2024-10-25 00:00:00-04:00
DEBUG Exiting history()
DEBUG Exiting history()
DEBUG Exiting download()
Bad data proof
No response
yfinance version
0.2.48
Python version
No response
Operating system
Windows