atreadw1492 / yahoo_fin

Scrape stock price history from new (Spring 2017) Yahoo Finance layout
MIT License

IndexError: list index out of range #66

Open bobdabuilder1 opened 2 years ago

bobdabuilder1 commented 2 years ago

On version 0.8.9.1 I keep randomly getting a list index out of range error. It will happen on a ticker and then the next time it won't.

Traceback (most recent call last):
  File "yahoofintest.py", line 153, in <module>
    Earnings_History = si.get_earnings_history(i)
  File "/home//.local/lib/python3.8/site-packages/yahoo_fin/stock_info.py", line 836, in get_earnings_history
    result = _parse_earnings_json(url)
  File "/home//.local/lib/python3.8/site-packages/yahoo_fin/stock_info.py", line 809, in _parse_earnings_json
    page_data = [row for row in content.split(
IndexError: list index out of range

josiahswanson commented 2 years ago

I am experiencing this problem, too. In a for loop, it will actually start happening with one ticker and then keep erroring on the remaining tickers in the loop.

daanvonkHOB commented 2 years ago

Experiencing similar error.

Sometimes get_stats_valuation loads successfully for a single ticker, but within the for loop all tickers get this error.

Edit:

I've been trying to reproduce the problem. The problem is not in the code but on Yahoo Finance's side. I'm seeing it with get_stats_valuation: in the for loop, the error occurred on the EBAY stock. When I opened the EBAY statistics page on the Yahoo Finance website, the Valuation Measures failed to load (see image). This is the exact page that the function scrapes, so if the table isn't there, it can't be loaded as a DataFrame.

image

The function:

def get_stats_valuation(ticker, headers = {'User-agent': 'Mozilla/5.0'}):

    '''Scrapes Valuation Measures table from the statistics tab on Yahoo Finance 
       for an input ticker 

       @param: ticker
    '''

    stats_site = "https://finance.yahoo.com/quote/" + ticker + \
                 "/key-statistics?p=" + ticker

    tables = pd.read_html(requests.get(stats_site, headers=headers).text)

    tables = [table for table in tables if "Trailing P/E" in table.iloc[:,0].tolist()]

    table = tables[0].reset_index(drop = True)

    return table

When Valuation Measures aren't loaded on Yahoo Finance, there is no table containing "Trailing P/E". Therefore, tables is an empty list, and tables[0] raises an IndexError because there is no element at index 0. Sometimes the Valuation Measures table is loaded and sometimes it isn't.
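
One way to make this failure mode explicit is to check for an empty match list before indexing. The sketch below is not part of yahoo_fin: `pick_valuation_table` is a hypothetical helper that takes the list of DataFrames returned by `pd.read_html` and returns `None` when no table contains the marker row, so the caller can decide whether to retry or skip the ticker.

```python
def pick_valuation_table(tables, marker="Trailing P/E"):
    """Return the first table whose first column contains `marker`, else None.

    `tables` is expected to be a list of pandas DataFrames, such as the
    result of pd.read_html(...). Hypothetical helper, not part of yahoo_fin.
    """
    # Same filter as in get_stats_valuation above
    matches = [t for t in tables if marker in t.iloc[:, 0].tolist()]
    if not matches:
        # The page was served without the Valuation Measures table
        return None
    return matches[0].reset_index(drop=True)
```

A caller could then test `if table is None:` instead of catching an IndexError.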

The only fix I can think of to make sure every ticker in the list loads is a while loop with a sleep: wait x seconds and try again until the table loads. This costs extra time, so it's not a good fix for thousands of stocks, but it's better than nothing :) Remove the attempt counter and print statements if you don't care about progress, and adjust the sleep time as needed.

import time

from yahoo_fin import stock_info as si

# ticker_list is assumed to be an existing list of ticker symbols
for ticker in ticker_list:
    attempt = 1
    time.sleep(3)  # pause between tickers
    try:
        x = si.get_stats_valuation(ticker)
        print(f"{ticker} loaded on first attempt")
    except Exception:
        # First attempt failed: retry with a growing delay until it loads
        error = True
        attempt += 1
        while error:
            print(f"Sleeping {attempt*10}s")
            time.sleep(attempt*10)
            try:
                x = si.get_stats_valuation(ticker)
                error = False
                print(f"{ticker}, succesful on attempt {attempt}")
            except Exception:
                attempt += 1
                print(f"{ticker}, failed. Attempt {attempt}")

Output:

EA loaded on first attempt
EMR loaded on first attempt
ENPH loaded on first attempt
Sleeping 20s
EOG, failed. Attempt 3
Sleeping 30s
EOG, succesful on attempt 3
EFX loaded on first attempt
Sleeping 20s
EL, failed. Attempt 3
Sleeping 30s
EL, succesful on attempt 3
Sleeping 20s
ETSY, succesful on attempt 2
EXPE loaded on first attempt
EXPD loaded on first attempt
XOM loaded on first attempt
Sleeping 20s
FFIV, succesful on attempt 2
FB loaded on first attempt
FAST loaded on first attempt
Sleeping 20s
FDX, succesful on attempt 2
FIS loaded on first attempt
FISV loaded on first attempt
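
The loop above retries forever if a ticker never loads. A bounded variant might look like the following sketch. `fetch_with_retry` is a hypothetical helper, not part of yahoo_fin, and the choice of exceptions is an assumption: IndexError is what this thread reports, and `pd.read_html` raises ValueError when a page contains no tables at all.

```python
import time


def fetch_with_retry(fetch, ticker, max_attempts=5, base_delay=10, sleep=time.sleep):
    """Call fetch(ticker), sleeping attempt * base_delay seconds between tries.

    Returns the result of the first successful call. If every attempt
    fails, the last exception is re-raised so the caller can skip the ticker.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(ticker)
        except (IndexError, ValueError):
            if attempt == max_attempts:
                raise  # give up instead of looping forever
            sleep(attempt * base_delay)


# Example usage (requires yahoo_fin and network access):
# from yahoo_fin import stock_info as si
# table = fetch_with_retry(si.get_stats_valuation, "EBAY")
```

The `sleep` parameter exists only so the delay can be replaced in tests. In normal use the default `time.sleep` applies.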

mpainenz commented 2 years ago

I ran into this issue with the earnings calendar, and now when I try to view the earnings calendar in my browser, I'm redirected to the Yahoo homepage.

I wanted to post here in case others experience this: presumably I was blocked because I made too many API calls. If you have this issue, check in your browser that you can still reach the page of the service you are accessing via Python.

I managed to get around this issue by using a proxy. The Requests package that yahoo_fin uses lets you specify a proxy from the Windows command line (which is handy).
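
As a rough illustration of the proxy approach: Requests reads the `HTTP_PROXY` and `HTTPS_PROXY` environment variables by default, so setting them before any request is made should route yahoo_fin's traffic through the proxy. The proxy address below is a placeholder and `use_proxy` is a hypothetical helper.

```python
import os

# Placeholder address (assumption); replace with a proxy you actually control
PROXY_URL = "http://127.0.0.1:8080"


def use_proxy(proxy_url):
    """Set the proxy environment variables for the current process."""
    os.environ["HTTP_PROXY"] = proxy_url
    os.environ["HTTPS_PROXY"] = proxy_url


use_proxy(PROXY_URL)

# Equivalent in the Windows command prompt, before starting Python:
#   set HTTP_PROXY=http://127.0.0.1:8080
#   set HTTPS_PROXY=http://127.0.0.1:8080
```

Setting the variables inside the script affects only that process, while setting them on the command line affects everything started from that shell session.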

The real issue is that I shouldn't hammer the API in my code, but as a workaround until you are unblocked, a proxy is one option.
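
To avoid hammering the site in the first place, calls can be spaced out with a small throttle. This is a minimal sketch: the `Throttle` class and the 3-second interval are illustrative assumptions, not something yahoo_fin provides or a limit Yahoo documents.

```python
import time


class Throttle:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous wait(), None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                # Called too soon: sleep for the rest of the interval
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Example usage (requires yahoo_fin and network access):
# throttle = Throttle(3.0)
# for ticker in ticker_list:
#     throttle.wait()
#     table = si.get_stats_valuation(ticker)
```

Unlike a fixed `time.sleep(3)` in the loop, this only sleeps for whatever part of the interval has not already elapsed.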

arup1221 commented 3 months ago

I am getting this error at data = tables[0].append(tables[1]): IndexError: list index out of range

Does anyone have a solution, or have I been blocked?

chinmayagrawal775 commented 3 months ago

I am also getting this error.

Actually, tables[1] does not exist.

It looks like the API response has changed, but the yahoo_fin package has not been updated.