dpguthrie / yahooquery

Python wrapper for an unofficial Yahoo Finance API
https://yahooquery.dpguthrie.com
MIT License
778 stars 137 forks

Getting older data - is there a usage limit? #151

Open EnigmaNZ opened 1 year ago

EnigmaNZ commented 1 year ago

Describe the bug

I apologise if this has already been covered, but I couldn't find an issue with a matching title.

I am grabbing a bit of data using the different functions listed below (in one script). The first function returns up-to-date information, but the later functions return older values (compared with the live values on Yahoo Finance). The functions used, and whether each value is up to date:

Within financial_data:
currentPrice = Up to Date
totalCash = Up to Date
totalDebt = Up to Date

Within index_trend:
+5y growth estimates = Out of Date

Within key_stats:
priceToBook = Out of Date
enterpriseValue = Out of Date

Within summary_detail:
trailingPE = Out of Date
priceToSalesTrailing12Months = Out of Date
marketCap = Out of Date

Within cash_flow:
FreeCashFlow = Out of Date

To Reproduce

Attached is a text file with the for loop that pulls the data for 15 stocks.

Expected behavior

Up-to-date information

EnigmaNZ commented 1 year ago

YahooQueryIssue.txt

I was unable to attach the text file to the main post, so I have attached it here.

dpguthrie commented 1 year ago

Do you have the symbol or symbols where we can view this behavior? And, what you’re expecting vs what’s being returned.

EnigmaNZ commented 1 year ago

Yep, the symbols are: aapl, amzn, axp, baba, bac, bcs, cepu, goog, icl, jnj, jpm, msft, oxy, ozk, pnc, rf, wfc, zim

I tried to upload the Database to the

Main stock I have been comparing with is 'aapl' (first in the list).

For the functions above (just tested 20 minutes ago - aapl):

Within financial_data:
currentPrice = 152.55 (matches yahoo finance)
totalCash = 5.1355B (yahoo finance = 51.36B - matches)
totalDebt = 111.109997B (yahoo finance = 111.1B - matches)

Within index_trend:
+5y growth estimates = 8.284% (yahoo finance = 8.13%)

Within key_stats:
priceToBook = 42.5998 (yahoo finance = 42.87)
enterpriseValue = 2.4734B (yahoo finance = 2.49T)

Within summary_detail:
trailingPE = 25.8998 (yahoo finance = 26.1)
priceToSalesTrailing12Months = 6.228 (yahoo finance = 6.42)
marketCap = 2.414T (yahoo finance = 2.43T)

Within cash_flow:
FreeCashFlow = 111,443,000 (yahoo finance = 111,443,000 - matches) sorry this one was also correct

Attached is a screenshot of all the data that has been produced.

EnigmaNZ commented 1 year ago

It seems there may be a fix if you wait longer between calls. Is this something that would be worth trying?

dpguthrie commented 1 year ago

Ya, my guess is that YF is providing bad data when it detects programmatic requests with little time between. Might be a good idea to put in some time.sleep() in between requests. Probably more important to get accurate data than it is to get it super fast. This should most likely be a class level property that you set during instantiation. Something like:

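SLEEP_BETWEEN_DEFAULT = 0  # seconds; placeholder default

class Ticker:
    def __init__(self, symbols, **kwargs):
        ...  # all the other stuff
        # seconds to wait between requests (proposed option, not yet implemented)
        self.sleep_between = kwargs.get('sleep_between', SLEEP_BETWEEN_DEFAULT)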

Then the internal requests methods would need to be refactored slightly to account for that.
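A minimal sketch of what that refactor could look like, assuming the delay is enforced in one place that every request passes through. `ThrottledCaller` and its `clock`/`sleep` parameters are hypothetical names used for illustration; none of this is existing yahooquery code.

```python
import time

SLEEP_BETWEEN_DEFAULT = 1.0  # seconds; illustrative value, not a yahooquery setting


class ThrottledCaller:
    """Hypothetical helper: enforce a minimum gap between successive calls."""

    def __init__(self, sleep_between=SLEEP_BETWEEN_DEFAULT,
                 clock=time.monotonic, sleep=time.sleep):
        self.sleep_between = sleep_between
        self._clock = clock      # injectable so the logic can be checked without waiting
        self._sleep = sleep
        self._last_call = None   # time at which the previous call started

    def call(self, func, *args, **kwargs):
        """Run func, sleeping first if the previous call was too recent."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.sleep_between - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
        return func(*args, **kwargs)
```

Each internal request would then go through something like `throttle.call(session.get, url)`, so only the time actually still owed is slept rather than a fixed `time.sleep()` after every request.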

ValueRaider commented 1 year ago

YF is providing bad data when it detects programmatic requests with little time between

Odd data issues do arise when spamming Yahoo, but I can't say whether that's intentional. For example, the requests might be served from different data sources?

IMO best way to rate-limit is a specialised module like pyrate-limiter, or requests_ratelimiter if want to combine with requests-cache (example on yfinance README). Then just pass the session object to yq.
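To illustrate what "at most N requests per period" means, here is a standard-library sketch of a sliding-window limiter. It is a stand-in written for this thread, not the API of pyrate-limiter or requests_ratelimiter; `SlidingWindowLimiter` and its parameters are hypothetical names, and the libraries above remain the better choice in practice.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical stand-in: allow at most max_calls within any `period` seconds."""

    def __init__(self, max_calls, period, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._calls = deque()  # start times of calls still inside the window

    def wait(self):
        """Block until another call is allowed, then record it."""
        now = self._clock()
        # Forget calls that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) >= self.max_calls:
            # Sleep until the oldest recorded call leaves the window.
            self._sleep(self.period - (now - self._calls[0]))
            now = self._clock()
            self._calls.popleft()
        self._calls.append(now)
```

Calling `limiter.wait()` immediately before each request keeps the request rate under the chosen limit. This sketch is single-threaded only.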

dpguthrie commented 1 year ago

IMO best way to rate-limit is a specialised module like pyrate-limiter, or requests_ratelimiter if want to combine with requests-cache (example on yfinance README). Then just pass the session object to yq.

Much better way than I described above. Thanks @ValueRaider

EnigmaNZ commented 1 year ago

Awesome, cheers @dpguthrie and @ValueRaider. I have tried time.sleep at 1 sec and 5 sec and neither worked for me. Someone else in the discussions said that 1 sec (time.sleep(1)) worked for them with only financial_data.

Planning to give the suggested rate-limiting approach a trial now, thank you!

EnigmaNZ commented 1 year ago

Hi @ValueRaider @dpguthrie, not sure if I'm doing something wrong, but I trialed the following:

import requests
import requests_cache
from ratelimiter import RateLimiter

requests_cache.install_cache('yahoo_api_cache', expire_after=3600)

rate_limiter = RateLimiter(max_calls=1, period=5)
session = requests_cache.CachedSession()
session.headers.update({'User-Agent': 'Mozilla/5.0'})

for ticker in Stocks:
    temp_T = Ticker(ticker, session=session)
    with rate_limiter:
        tempFinData = temp_T.financial_data
        tempIndexTrend = temp_T.index_trend
        tempKeyStats = temp_T.key_stats
        tempSummDet = temp_T.summary_detail
        tempSummProf = temp_T.summary_profile
        temp_cashFlow = temp_T.cash_flow(trailing=False)

MyStocks.txt

EnigmaNZ commented 1 year ago

Mistakenly closed

ValueRaider commented 1 year ago

@EnigmaNZ Your code looks like what ChatGPT would generate - nonsense. Just copy-paste the example in yfinance README.

EnigmaNZ commented 1 year ago

@ValueRaider Very true about ChatGPT. I pulled the logic from ChatGPT (it ended up being nonsense, as you said), but I could follow the logic behind what it produced, so I could attempt to fix it by comparing it with the yfinance example.

Before messing around with ChatGPT, I copy-pasted the example from the yfinance README but ended up with the errors in the attached screenshot. (Unfortunately, this is where I turned to ChatGPT to see if I could combine the two and understand a bit more about why it failed. That didn't work, though.) I also trialed it with "yfinance.cache" to check whether a direct copy-paste produced the same error. I'm pretty confident there is a really simple fix for this, but I have no idea where, sorry.

I have also attached two versions of the code, with and without the commented-out code, in case it helps to not have the mess: MyStocks_withoutMess.txt MyStocks.txt

ValueRaider commented 1 year ago

Ah, with yahooquery you also need to set user-agent. Look inside yfinance source code data.py

@dpguthrie Might be worth yq automatically setting session user-agent if missing.
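A rough sketch of that suggestion, assuming only that the session exposes a dict-like `headers` attribute (as a `requests.Session` does). `ensure_user_agent` and `DEFAULT_USER_AGENT` are hypothetical names, not part of yahooquery or yfinance. Treating the stock `python-requests/...` value as "missing" is also an assumption made here.

```python
DEFAULT_USER_AGENT = "Mozilla/5.0"  # the value used elsewhere in this thread


def ensure_user_agent(session, user_agent=DEFAULT_USER_AGENT):
    """Set a User-Agent on session.headers unless the caller already chose one.

    `session` is anything with a dict-like `headers` attribute. Header names
    are compared case-insensitively so plain dicts behave like requests' headers.
    """
    existing = next(
        (v for k, v in session.headers.items() if k.lower() == "user-agent"),
        None,
    )
    # requests fills in "python-requests/<version>" by default; treat that as unset.
    if existing is None or str(existing).startswith("python-requests"):
        # Drop any differently-cased key first to avoid duplicates in plain dicts.
        for key in [k for k in session.headers if k.lower() == "user-agent"]:
            del session.headers[key]
        session.headers["User-Agent"] = user_agent
    return session
```

A custom User-Agent set by the caller is left untouched, so this would only change behavior for sessions passed in with default headers.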

EnigmaNZ commented 1 year ago

That definitely solved this issue for me. I ended up just using session.headers.update({'User-Agent': 'Mozilla/5.0'}) just below the yfinance example code, as per the screenshot.

Unfortunately, I have got no further though. It makes me think that Yahoo Finance is messing with this data somehow. As you can see, I updated the request rate to 1 request per minute and I am still receiving bad data. I will look through some of the other functions and see if I can pull accurate data in other ways.

For example, AAPL's current P/B on the website is 41.53, while the P/B read through yahooquery is 40.969.
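To put a size on these gaps, the relative difference between the value read and the website value can be computed directly. The sketch below uses only figures quoted in this thread for aapl; `relative_gap` is a hypothetical helper name.

```python
def relative_gap(read_value, website_value):
    """Signed difference of the read value from the website value, as a fraction."""
    return (read_value - website_value) / website_value


# (value read through yahooquery, value shown on the website), as quoted in this thread
aapl_pairs = {
    "priceToBook (latest)": (40.969, 41.53),
    "priceToBook (earlier)": (42.5998, 42.87),
    "trailingPE": (25.8998, 26.1),
    "priceToSalesTrailing12Months": (6.228, 6.42),
}

for name, (read, site) in aapl_pairs.items():
    print(f"{name}: {relative_gap(read, site):+.2%}")
```

On these figures, every read value sits below the website value, by roughly 0.6% to 3%. That only describes the size of the discrepancy; it does not establish its cause.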