JECSand / yahoofinancials

A powerful financial data module used for pulling data from Yahoo Finance. This module can pull fundamental and technical data for stocks, indexes, currencies, cryptos, ETFs, Mutual Funds, U.S. Treasuries, and commodity futures.
https://pypi.python.org/pypi/yahoofinancials
MIT License
896 stars, 214 forks

Is it me, or is yahoofinancials on the slow side? #148

Closed RobinDeeCee closed 12 months ago

RobinDeeCee commented 1 year ago

Hey

I just started using this in my project (which pulls 7000 from Yahoo on a weekly basis), and I was wondering what causes the slowness and whether it is on my side or in this GitHub project.

I wrote a small script here showing roughly how I do it in my project. For me it sometimes takes 2 to 3 minutes before I get the details of a stock.

Am I missing something?

SourceYahoofinancialsObject = SourceYahoofinancials.getYahoofinancialsObject(company_description)

logger.info('-' * 65)
logger.info('Yahoo Financials info ...')
logger.info('-' * 65)

SummaryYearly = SourceYahoofinancialsObject.get_financial_stmts('annual', ['income', 'cash', 'balance'])
SummaryQuarterly = SourceYahoofinancialsObject.get_financial_stmts('quarterly', ['income', 'cash', 'balance'])

Ydata["StockInfoData"] = SourceYahoofinancialsObject.get_stock_quote_type_data()
Ydata["StockSummary"] = SourceYahoofinancialsObject.get_summary_data()
Ydata["StockKeyStatistics"] = SourceYahoofinancialsObject.get_key_statistics_data()
Ydata["StockProfileData"] = SourceYahoofinancialsObject.get_stock_profile_data()
Ydata['finDataYearly'] = SummaryYearly
Ydata['finDataQuarterly'] = SummaryQuarterly

# Check that every section came back non-empty
if (Ydata["StockInfoData"] and
    Ydata["StockSummary"] and
    Ydata["StockKeyStatistics"] and
    Ydata["StockProfileData"] and
    Ydata['finDataYearly'] and
    Ydata['finDataQuarterly']
):
    return Ydata
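
One way to narrow down where the time goes is to time each getter separately. The following is only a minimal sketch: `time_calls` is a hypothetical helper, not part of yahoofinancials, and the lambdas in the demo are stand-ins for the real calls from the script above (for example `SourceYahoofinancialsObject.get_summary_data`).

```python
import time
from typing import Any, Callable, Dict, Tuple


def time_calls(
    calls: Dict[str, Callable[[], Any]]
) -> Tuple[Dict[str, Any], Dict[str, float]]:
    """Run each zero-argument callable and record its duration in seconds."""
    results: Dict[str, Any] = {}
    timings: Dict[str, float] = {}
    for name, call in calls.items():
        start = time.perf_counter()
        results[name] = call()
        timings[name] = time.perf_counter() - start
    return results, timings


if __name__ == "__main__":
    # Stand-ins for the real getters; in the script above these would be
    # bound methods such as SourceYahoofinancialsObject.get_summary_data.
    demo_calls = {
        "StockSummary": lambda: {"price": 1.0},
        "StockProfileData": lambda: (time.sleep(0.05), {"sector": "n/a"})[1],
    }
    data, timings = time_calls(demo_calls)
    # Print the slowest call first.
    for name, seconds in sorted(timings.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name}: {seconds:.3f}s")
```

If one getter dominates the total, that is the call worth reporting in the issue together with the ticker symbol.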
JECSand commented 1 year ago

@RobinDeeCee Thank you for your feedback.

Did you try running YahooFinancials in concurrency mode? You'll see significantly improved performance when simultaneously pulling data for multiple tickers.

For example: yahoo_financials = YahooFinancials(tickers, concurrent=True, max_workers=8, country="US")
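
For a large ticker list, the concurrency mode can be combined with batching so that each request group stays small. This is a rough sketch, not a recommendation from the library: `chunked`, `fetch_in_batches`, and the batch size of 100 are hypothetical, and `fetch_summary` assumes that `get_summary_data()` returns a dict keyed by ticker when the object is built from a list. The constructor arguments are copied from the example above.

```python
from typing import Any, Callable, Dict, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def fetch_in_batches(
    tickers: Sequence[str],
    fetch_batch: Callable[[List[str]], Dict[str, Any]],
    batch_size: int = 100,  # hypothetical value, tune to taste
) -> Dict[str, Any]:
    """Call `fetch_batch` once per chunk and merge the per-ticker results."""
    merged: Dict[str, Any] = {}
    for batch in chunked(tickers, batch_size):
        merged.update(fetch_batch(batch))
    return merged


def fetch_summary(batch: List[str]) -> Dict[str, Any]:
    """Fetch summary data for one batch in concurrency mode.

    Assumes yahoofinancials is installed and that the result is a dict
    keyed by ticker symbol.
    """
    from yahoofinancials import YahooFinancials  # imported lazily on purpose

    yahoo_financials = YahooFinancials(
        batch, concurrent=True, max_workers=8, country="US"
    )
    return yahoo_financials.get_summary_data()
```

Usage would look like `fetch_in_batches(all_tickers, fetch_summary)`, where `all_tickers` is your own list of symbols. Passing the fetcher in as an argument also makes it easy to swap in a stub when testing without network access.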

msh855 commented 1 year ago

I have been using the new version, but it is extremely slow. Slower than looping over millions of data points, I would say. For some reason the parallelisation is not working properly.

And often I get errors like this:


yahoo_financials.get_stock_profile_data()

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/Users/safishajjouz/opt/anaconda3/envs/obb/lib/python3.9/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/Users/safishajjouz/opt/anaconda3/envs/obb/lib/python3.9/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/Users/safishajjouz/opt/anaconda3/envs/obb/lib/python3.9/site-packages/yahoofinancials/etl.py", line 499, in _create_dict_ent
    re_data = self._get_historical_data(YAHOO_URL, r_map, tech_type, statement_type)
  File "/Users/safishajjouz/opt/anaconda3/envs/obb/lib/python3.9/site-packages/yahoofinancials/etl.py", line 237, in _get_historical_data
    self._request_handler(url, config.get("response_field"))
  File "/Users/safishajjouz/opt/anaconda3/envs/obb/lib/python3.9/site-packages/yahoofinancials/etl.py", line 198, in _request_handler
    raise ManagedException("Server replied with HTTP " + str(response.status_code) +
yahoofinancials.etl.ManagedException: Server replied with HTTP 404 code while opening the url: https://query2.finance.yahoo.com/v10/finance/quoteSummary/smt.l?modules=assetProfile&formatted=False&lang=en-US&region=US&corsDomain=finance.yahoo.com
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/Users/safishajjouz/opt/anaconda3/envs/obb/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3378, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-22-551664ebc37c>", line 1, in <module>
    yahoo_financials.get_stock_profile_data()
  File "/Users/safishajjouz/opt/anaconda3/envs/obb/lib/python3.9/site-packages/yahoofinancials/yf.py", line 114, in get_stock_profile_data
    self.get_stock_data(statement_type='profile', tech_type='assetProfile', report_name='assetProfile'),
  File "/Users/safishajjouz/opt/anaconda3/envs/obb/lib/python3.9/site-packages/yahoofinancials/etl.py", line 548, in get_stock_data
    dict_ents = pool.map(partial(self._create_dict_ent,
  File "/Users/safishajjouz/opt/anaconda3/envs/obb/lib/python3.9/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/Users/safishajjouz/opt/anaconda3/envs/obb/lib/python3.9/multiprocessing/pool.py", line 771, in get
    raise self._value
yahoofinancials.etl.ManagedException: Server replied with HTTP 404 code while opening the url: https://query2.finance.yahoo.com/v10/finance/quoteSummary/smt.l?modules=assetProfile&formatted=False&lang=en-US&region=US&corsDomain=finance.yahoo.com
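
The traceback above shows the exception from a single worker (the 404 for `smt.l`) being re-raised by `pool.map`, which aborts the whole call. A possible caller-side workaround is to fetch each ticker in isolation and collect failures instead of raising. The sketch below uses a standard-library thread pool rather than the library's own multiprocessing pool, and `fetch_each` and `fetch_one` are hypothetical names; `fetch_one` stands for whatever single-ticker call you make.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, Iterable, Optional, Tuple


def fetch_each(
    tickers: Iterable[str],
    fetch_one: Callable[[str], Any],
    max_workers: int = 8,
) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Fetch every ticker independently; return (results, failures)."""

    def attempt(ticker: str) -> Tuple[str, Any, Optional[str]]:
        # Catch the error inside the worker so one bad symbol
        # cannot abort the other fetches.
        try:
            return ticker, fetch_one(ticker), None
        except Exception as exc:
            return ticker, None, f"{type(exc).__name__}: {exc}"

    results: Dict[str, Any] = {}
    failures: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for ticker, value, error in pool.map(attempt, tickers):
            if error is None:
                results[ticker] = value
            else:
                failures[ticker] = error
    return results, failures
```

The `failures` dict then doubles as the list of problem symbols the maintainer might need to reproduce the error.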
JECSand commented 1 year ago

@msh855 @RobinDeeCee

I just pushed v1.15. I attempted to improve how yf handles the random 404 errors. If this issue continues, could you provide me with some of the symbols you are trying to run?

JECSand commented 12 months ago

@RobinDeeCee Closing this issue due to inactivity. If you are still experiencing unexpected slowness, just open a new issue and we can revisit.