dpguthrie / yahooquery

Python wrapper for an unofficial Yahoo Finance API
https://yahooquery.dpguthrie.com
MIT License
782 stars 139 forks

API calls returning 'Invalid Cookie' #203

Open benhowell opened 1 year ago

benhowell commented 1 year ago

Describe the bug API calls returning 'Invalid Cookie'

To Reproduce

from yahooquery import Ticker
aapl = Ticker('aapl')
x = aapl.price
print(x)
{'aapl': 'Invalid Cookie'}

o-oayda commented 1 year ago

I can confirm I have the same issue (macOS 13.4.1, Python 3.9.10 and yahooquery 2.3.1).

EmirEgilli commented 1 year ago

Looks about the same with me as well. Didn't have this issue yesterday. Also made sure that yahooquery is up-to-date (2.3.1).

jfreidkes commented 1 year ago

Same here both in Linux and macOS, every ticker returns Invalid Cookie

staper1960 commented 1 year ago

Same here too. Yesterday it was fine. However, the day before yesterday, yahooquery was very unstable. Sometimes it returned correct results, sometimes no attributes were returned. The whole yahoo service is becoming unreliable. Any suggestions?

weiguang-zz commented 1 year ago

same result for me

james-stevens commented 1 year ago

same, here's the smallest code I have to reproduce

#! /usr/bin/python3

import json

from yahooquery import Ticker

tickers = Ticker(["JD.L","TSLA"])
print(json.dumps(tickers.price,indent=3))

this outputs

{
   "JD.L": "Invalid Cookie",
   "TSLA": "Invalid Cookie"
}

Seemed ok till about 8pm UTC

dpguthrie commented 1 year ago

Looks like it's back, at least using the examples from above. Curious if you're all seeing that now? Pretty disconcerting though nonetheless as it appears they're most likely phasing these APIs out for requests without proper cookies set by the browser. I could come up with a workaround but it would most likely require selenium, which is less than ideal.

b3niup commented 1 year ago

Unfortunately it still doesn't work for me - even same example as above.

staper1960 commented 1 year ago

Still not working for me either. I have a long list of queries on various stocks and it returns the "invalid cookie" message to the very first one.

jirisarri10 commented 1 year ago

Hello. I just tried to connect and I also get the error "Invalid Cookie". I have Linux 6.1, Python 3.11.3 and yahooquery 2.3.1. I think it is a general bug in the Yahoo service. I await further news. I really appreciate your project and dedication.

staper1960 commented 1 year ago

python 3.7, yahooquery 2.3.1, no browser, windows 11 here. Location: Greece.

ValueRaider commented 1 year ago

Given code works for @dpguthrie, I think it will be helpful if everyone adds their location to their report. Maybe Yahoo is slowly rolling out across globe. Just edit your post, don't spam the thread.

Me: UK, Invalid Cookie, Linux & everything up-to-date

tretus222 commented 1 year ago

Same issue since yesterday (though yesterday only 10% of the requests failed). Python 3.9.2 on Linux and Python 3.10.6 on macOS, yahooquery 2.3.1. Location: Germany.

james-stevens commented 1 year ago

Still not working for me either: as before, 100% failure since about 8pm yesterday. Python 3.11.3, yq v2.3.1, Alpine Linux v3.18.0, UK

{
   "JD.L": "Invalid Cookie",
   "TSLA": "Invalid Cookie"
}

kajdo commented 1 year ago

same for me

Python 3.9.16 - location austria

{
   "JD.L": "Invalid Cookie",
   "TSLA": "Invalid Cookie"
}

I already use my account cookie to request other depot (portfolio) data by calling the Yahoo API directly. Could I use it for the yahooquery calls as well?

As previously mentioned, a selenium implementation might not be ideal and is a bit of an overkill. I read the cookie out of the Firefox cookie db and just log in from time to time; it's valid for a good number of days.
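For anyone wanting to try the Firefox route, a minimal sketch might look like the following. It assumes the profile's cookies.sqlite file with its moz_cookies table (columns name, value and host). The function names and the yahoo.com host filter are made up for illustration, and the path to the profile has to be supplied by the caller.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def read_firefox_cookies(db_path, host_suffix="yahoo.com"):
    """Return {name: value} for cookies whose host ends with host_suffix."""
    # Work on a copy: Firefox keeps cookies.sqlite locked while it is running.
    with tempfile.TemporaryDirectory() as tmp:
        copy_path = Path(tmp) / "cookies.sqlite"
        shutil.copy2(db_path, copy_path)
        conn = sqlite3.connect(copy_path)
        try:
            rows = conn.execute(
                "SELECT name, value FROM moz_cookies WHERE host LIKE ?",
                ("%" + host_suffix,),
            ).fetchall()
        finally:
            conn.close()
    return dict(rows)


def as_cookie_header(cookies):
    """Join a {name: value} mapping into a single Cookie header value."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())
```

Cookies set very recently may not have been written to the main database file yet, so logging in and then restarting Firefox before reading is probably the safer order.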

louisowen6 commented 1 year ago

same for me Python 3.9.0, yq v2.3.1, location Indonesia

{'aapl': 'Invalid Cookie'}

ValueRaider commented 1 year ago

@kajdo So cookie could be cached, but what about the crumb?

dpguthrie commented 1 year ago

Ya, I'm not super stoked on adding selenium either - if anyone has a better idea for getting the appropriate cookies, I'm all ears. I do think that the library itself should be responsible for setting up the request in a way that makes valid requests, which means obtaining cookies and a crumb without intervention from the user. I guess it could certainly be another argument to the Ticker initialization but I'm less enthused about that (but definitely open to the idea if others think it's valuable).

The alternative that I mentioned above is using selenium. I put together a short loom video on what that might look like (it adds a little less than 5 seconds to the Ticker initialization, which isn't terrible). You can see that after I've obtained both cookies and a crumb, I'm now able to hit the quotes endpoint, which is something they recently made unavailable to requests without cookies and/or crumb.

Also, I think using the webdriver-manager package abstracts away the requirement of a user having to download chromedriver themselves, which makes it a little bit easier to stomach.

kajdo commented 1 year ago

@kajdo So cookie could be cached, but what about the crumb?

@ValueRaider I messed around with crumb for the api calls ... also with the cookie itself which i see in the dev console in ff / brave

the call i make is basically against "/v7/finance/desktop/portfolio?formatted=true" and i ended up using just header

urlparams:

to call the /portfolio endpoint it seems to be fine without crumb

@dpguthrie i do agree that its much better if the library handles it .... to reduce the +5sec the lib could cache/persist it in a .env file and load it via dotenv (https://pypi.org/project/python-dotenv/) or something like that .... so it only gets a new token with the browser-simulation if really needed ...... as mentioned the cookie is valid for days - not sure how long but for my script, i think i have to change it once every 2 weeks max

don't really have a better suggestion .... messed around a bit, but ended up reading it out from the ff cookie db because i'm online on yahoo finance often enough anyway that i don't have to redo the token manually

manuelwithbmw commented 1 year ago

Me since today 13th July: UK, Invalid Cookie, Python 3.8, macOS Mojave 10.14.6, yfinance version: 0.2.22, yahooquery version: 2.3.0

jirisarri10 commented 1 year ago

It appears that the Yahooquery library attempts to access the Yahoo Finance API using authentication cookies. However, Yahoo Finance has made changes to its API, and it's possible that the authentication cookies are no longer valid or need to be updated.

staper1960 commented 1 year ago

We are still in the dark it seems. Sorry I don't have the necessary skills to contribute to the discussion, but I sincerely hope that we will soon have a practical solution accompanied by a comprehensive explanation and we can return to our trading habits. I wish to thank Doug in advance for his superb work and his commitment to yahooquery.

KenLee12323 commented 1 year ago

just encountered this issue. utc+8.

In the yfinance repo, users are also discussing this issue. Some said that "A1 and the crumb URL parameter" are needed at least for the call. I don't know how to get the crumb and cookie through Python without additional libraries or tools. I would really appreciate it if anyone could give me some hints.

ps. I notice the 'https://dpguthrie-yahooquery-streamlit-app-eydpjo.streamlit.app/' website still works. I am not sure how it can still work if it is using yahooquery's API. Would anyone mind giving me some hints?

I think selenium is the right choice to adopt since yahoo is making it harder and harder for programmatic crawling. Finally, we will reach a point where crawling with a fake browser is the only choice.

Thanks a lot for the work!

Update: the response status has now changed from 401 to 404. Also, the 'https://dpguthrie-yahooquery-streamlit-app-eydpjo.streamlit.app/' website has started returning 'invalid cookies' responses as well.
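On the "A1 and the crumb URL parameter" point: assuming the crumb is passed as a query parameter literally named crumb (an assumption based on the yfinance discussion, not something confirmed here), attaching it to a request URL with only the standard library could be sketched like this. add_crumb is a hypothetical helper, not part of yahooquery.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_crumb(url, crumb, param="crumb"):
    """Return url with the crumb added (or replaced) as a query parameter.

    The parameter name "crumb" is an assumption; urlencode takes care of
    escaping characters such as "/" that can appear in a crumb value.
    """
    parts = urlsplit(url)
    # Keep every existing parameter except a stale crumb, then append the new one.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    query.append((param, crumb))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

The cookie itself would still have to be sent as a request header alongside the URL built this way.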

cmjordan42 commented 1 year ago

If they are rolling out the breaking change regionally, it hit EST / UTC-5 sometime between ~18:00 and ~21:00 UTC-5

I was manually running a job which I halted at 17:40 UTC-5 - upon just resuming to finish the job at 21:00 UTC-5 I now encounter this error.

Keepcase commented 1 year ago

I just started seeing this error today as well.

Musa830 commented 1 year ago

Is this error unsolvable?

galashour commented 1 year ago

Same issue (Israel, Python 3.11)

jirisarri10 commented 1 year ago

import yahooquery
tub = yahooquery.Ticker("TUB.MC")
historial = tub.history(period="1d")
print(historial)
                                  open  high   low  close  volume  adjclose
symbol date
TUB.MC 2023-07-14 11:27:47+02:00  2.89  2.94  2.89   2.93   21885      2.93

tub.price
{'TUB.MC': 'Invalid Cookie'}

If you ask history it works perfectly!! But if you ask for price "Invalid Cookie".

Musa830 commented 1 year ago

I think this is a very unstable way of getting data. You can wrap the yahooquery call in a try block and put the equivalent yfinance code in the except branch, so that the data can still be obtained when the call fails. This may require code changes, but it will not delay daily backtesting.
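A rough sketch of that fallback idea, under a few assumptions: yahooquery does not raise here but returns the string 'Invalid Cookie' per symbol (as in the outputs posted in this thread), so the check has to look at the value as well as catch exceptions. The helper names are hypothetical, and the two fetch functions import their libraries lazily so that either one can be missing.

```python
def usable(result, symbol):
    """True if result holds real data for symbol rather than an error string."""
    if not isinstance(result, dict):
        return False
    value = result.get(symbol)
    return value is not None and not isinstance(value, str)


def first_usable(symbol, fetchers):
    """Try each (name, fetch) pair in order; return (name, data) from the first that works."""
    errors = []
    for name, fetch in fetchers:
        try:
            result = fetch(symbol)
        except Exception as exc:  # network errors, HTTP 401/404, parsing errors
            errors.append(f"{name}: {exc!r}")
            continue
        if usable(result, symbol):
            return name, result[symbol]
        errors.append(f"{name}: unusable response {result!r}")
    raise RuntimeError("all sources failed: " + "; ".join(errors))


def from_yahooquery(symbol):
    from yahooquery import Ticker  # imported lazily, only when this source is tried
    return Ticker(symbol).price


def from_yfinance(symbol):
    import yfinance as yf  # imported lazily, only when this source is tried
    return {symbol: yf.Ticker(symbol).info}
```

It would be called as first_usable("aapl", [("yahooquery", from_yahooquery), ("yfinance", from_yfinance)]). The two libraries return differently shaped dictionaries, so the calling code still has to know which source answered.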

kajdo commented 1 year ago

@dpguthrie maybe a crazy idea - but what if we could make the login process work with mechanize (https://mechanize.readthedocs.io/en/latest/) ... the benefit would be that no separate browser is needed and it would run on a vps without desktop environment as well

something like:

import mechanize

# Create a browser instance
browser = mechanize.Browser()

# Open the login URL
login_url = 'https://login.yahoo.com/?done=https%3A%2F%2Fwww.yahoo.com%2F&add=1'
response = browser.open(login_url)

# Read the HTML response and decode it as a string
html = response.read().decode('utf-8')

# Find the login form with id="login-username-form"
forms = [form for form in browser.forms() if form.attrs.get('id') == 'login-username-form']
if forms:
    # Select the first form found
    form = forms[0]

    # Find the username input field
    username_field = form.find_control('username')
    if username_field:
        # Assign the value to the username input field
        username_field.value = 'USERNAME'

        # Find the submit button and directly submit the form
        submit_button = form.find_control(id='login-signin', type='submit')
        if submit_button:
            browser.form = form  # Set the form explicitly
            try:
                browser.submit()
                # Print the current URL after submitting the form
                print("Current URL:", browser.geturl())
            except mechanize.HTTPError as e:
                print(f"Failed to submit the form: {e.code} {e.msg}")
            except mechanize.URLError as e:
                print(f"Failed to submit the form: {e.reason}")
        else:
            print("The submit button was not found in the form.")
    else:
        print("The username input field was not found in the form.")
else:
    print("The form with id='login-username-form' was not found in the HTML.")

# Close the browser
browser.close()

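just as an example ... it still ends up in "Failed to submit the form: 403 b'request disallowed by robots.txt'". that particular 403 is raised by mechanize itself because it obeys robots.txt by default (browser.set_handle_robots(False) switches that off), and i guess i'd also need to add more parameters to the form to make it work correctly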

chfiii commented 1 year ago

I looked at dpguthrie's little video but was unable to reproduce it. No cookies or crumb. I am logged into finance.yahoo.com on a Chrome browser. I've tried on both the YQ version from Anaconda (2.2.15) and pip (2.3.1) and get an empty cookie jar and a blank crumb. Could someone give more explicit instructions to get this done? Do I need to include selenium in my data handler code that calls Ticker? Can I run a separate instance of some code that does this? I currently depend on the Ticker code and really don't want to go back to yfinance

dpguthrie commented 1 year ago

I looked at dpguthrie's little video but was unable to reproduce it. No cookies or crumb. I am logged into finance.yahoo.com on a Chrome browser. I've tried on both the YQ version from Anaconda (2.2.15) and pip (2.3.1) and get an empty cookie jar and a blank crumb. Could someone give more explicit instructions to get this done? Do I need to include selenium in my data handler code that calls Ticker? Can I run a separate instance of some code that does this? I currently depend on the Ticker code and really don't want to go back to yfinance

I haven't put this into the YQ codebase yet, it's only on my local machine at the moment as more of a proof-of-concept. Happy to push the branch up for you to play around with though. But, I still want to think about if this is the right solution going forward.

KenLee12323 commented 1 year ago

I looked at dpguthrie's little video but was unable to reproduce it. No cookies or crumb. I am logged into finance.yahoo.com on a Chrome browser. I've tried on both the YQ version from Anaconda (2.2.15) and pip (2.3.1) and get an empty cookie jar and a blank crumb. Could someone give more explicit instructions to get this done? Do I need to include selenium in my data handler code that calls Ticker? Can I run a separate instance of some code that does this? I currently depend on the Ticker code and really don't want to go back to yfinance

I think he hasn't released the code yet because he is still thinking about how to make it user-friendly to install and use. Also, I think we still have to wait until Yahoo's update becomes stable.

wmorgansoftware commented 1 year ago

Linux, Python 3.10.6, yahooquery 2.3.1, USA. Tried a couple minutes ago. I assume the ongoing battle of trying to get real-time data and Yahoo trying to shut down that access? No offense to this well coded project for what it has needed to do, but I wonder if there are any other real-time stock tracking API providers out there. My broker does 15-20 minute delays, sadly....

KenLee12323 commented 1 year ago

@dpguthrie Are you having other alternatives in mind apart from selenium?

Really appreciate your work!

cmjordan42 commented 1 year ago

Linux, Python 3.10.6, yahooquery 2.3.1, USA. Tried a couple minutes ago. I assume the ongoing battle of trying to get real-time data and Yahoo trying to shut down that access? No offense to this well coded project for what it has needed to do, but I wonder if there are any other real-time stock tracking API providers out there. My broker does 15-20 minute delays, sadly....

I believe yahooquery to be the best currently. There was a large contingent of users of yfinance who did a comparison and many of us deemed yahooquery to be the one to focus investment into.

There are other API providers. Alpha Vantage is a great paid option.

dpguthrie commented 1 year ago

@dpguthrie Are you having other alternatives in mind apart from selenium?

I briefly looked into @kajdo's recommendation above using mechanize. I don't know a lot about that package but it seems like it doesn't have the overhead of something like selenium. The problem with this approach though is it doesn't set the appropriate cookies after making a request to YF, which is really what we need. I followed this SO answer about retrieving cookies, just replacing google with finance.yahoo.com, and no cookies were set.

Outside of Selenium, I don't have any other alternatives in mind - but like I said above, happy to take some suggestions.

KenLee12323 commented 1 year ago

Sorry that I can't contribute much web-related knowledge, as I am not an expert in this domain. But my way of thinking about this incident is: 'how many updates are still waiting for us?'. Judging from this perspective, my understanding is that selenium is better prepared for Yahoo's future updates, though it's heavy (as far as I know, mechanize doesn't support JavaScript?). I hope @kajdo could give me some insights on this ❤️.

And I think it could be a chance for us to pull out the parts that yahoo could set challenge on in the future so that we won't waste too much effort changing the structure every time.

chfiii commented 1 year ago

One other question, if the selenium piece fixes the 'price' entry, will it also fix 'quotes'?

wmorgansoftware commented 1 year ago

I believe yahooquery to be the best currently. There was a large contingent of users of yfinance who did a comparison and many of us deemed yahooquery to be the one to focus investment into.

Quickly tried going back to yfinance, and looks like they're getting 401 errors when trying to pull up even the basic ticker info. Looks like Yahoo has made their move, and now it's counteract time....

kajdo commented 1 year ago

Hey - thx @dpguthrie for considering my suggestion ... will try to make it work somehow, but mechanize is also new to me.

the only reason i see why selenium (beside its weight) might be a problem is, that you basically lose the possibility to run your programs using yahooquery in pure terminal mode (like via ssh on a vps) if there is no desktop environment available (since selenium will need to run an instance of the browser which requires a desktop being available)

this problem could also be mitigated if the user has the option to somehow tell yahooquery "here is the path to a config file where you can read out the cookie / crumb" ... like writing a small script where yahooquery gets the cookie and crumb and stores it in ".env" or "cred.json" or something - and the user pushes that file to its vps

just trying to come up with a way where you don't need a desktop and can run your script on a headless raspi for example ;)

edit: tried my luck with the noscript addon to figure out how the login process would behave without javascript .... doesn't work - seems like a dead end to me - even if it could be made to work with lots of refactoring, it wouldn't be reliable because every change on yahoo's side would need maintenance

out of ideas currently - but i hope my suggestion to store the cookie/crumb in a plaintext file might be considered so a headless mode would be possible (even if in a dirty workaround way)
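to make the plaintext-file idea a bit more concrete, here is a minimal sketch, assuming a JSON file (the "cred.json" variant) and a maximum age of 14 days, which is only my own rough estimate of how long the cookie lasts. save_credentials and load_credentials are hypothetical names, nothing like this exists in yahooquery today.

```python
import json
import time
from pathlib import Path


def save_credentials(path, cookie, crumb, now=None):
    """Write cookie and crumb to a JSON file together with a timestamp."""
    record = {
        "cookie": cookie,
        "crumb": crumb,
        "saved_at": time.time() if now is None else now,
    }
    Path(path).write_text(json.dumps(record))


def load_credentials(path, max_age_days=14, now=None):
    """Return (cookie, crumb), or None if the file is missing, unreadable or too old."""
    try:
        record = json.loads(Path(path).read_text())
        age = (time.time() if now is None else now) - record["saved_at"]
        if age > max_age_days * 86400:
            return None
        return record["cookie"], record["crumb"]
    except (OSError, ValueError, KeyError, TypeError):
        return None
```

the file holds a live session cookie, so it should be readable only by the user running the script and should never end up in version control.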

KenLee12323 commented 1 year ago

the only reason i see why selenium (beside its weight) might be a problem is, that you basically lose the possibility to run your programs using yahooquery in pure terminal mode (like via ssh on a vps) if there is no desktop environment available (since selenium will need to run an instance of the browser which requires a desktop being available)

Is selenium not possible to run on headless server/ server without GUI/ without desktop? I came across some articles that there is a headless option in webdriver.chrome with code like below:

from selenium import webdriver

option = webdriver.ChromeOptions()
option.add_argument('--headless')  # run Chrome without opening a window
browser = webdriver.Chrome(options=option)

I don't have experience with selenium before on server so I have no idea about it...

chfiii commented 1 year ago

That sounds like a great idea. It could also be a helper program that could update the cookie/crumb when we get a failure and then retry.
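A sketch of what such a helper could look like, with everything injected as plain callables so it does not depend on how the cookie and crumb are actually obtained. All names here are hypothetical, and the failure check simply looks for the 'Invalid Cookie' string shown in the outputs posted in this thread.

```python
def is_cookie_error(result):
    """True if the response is the 'Invalid Cookie' string for every symbol."""
    if isinstance(result, str):
        return result == "Invalid Cookie"
    if isinstance(result, dict):
        return bool(result) and all(v == "Invalid Cookie" for v in result.values())
    return False


def call_with_refresh(request, get_credentials, refresh, max_refreshes=1):
    """Run request(creds); on a cookie error, refresh the credentials and retry.

    request:          callable taking the credentials and returning the response
    get_credentials:  callable returning cached credentials, or None if there are none
    refresh:          callable that obtains (and ideally stores) fresh credentials
    """
    creds = get_credentials()
    for attempt in range(max_refreshes + 1):
        if creds is not None:
            result = request(creds)
            if not is_cookie_error(result):
                return result
        if attempt < max_refreshes:
            creds = refresh()
    raise RuntimeError("still rejected after refreshing credentials")
```

Keeping max_refreshes small matters: if Yahoo rejects freshly obtained credentials too, retrying in a loop would only start the browser over and over.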

kajdo commented 1 year ago

the only reason i see why selenium (beside its weight) might be a problem is, that you basically lose the possibility to run your programs using yahooquery in pure terminal mode (like via ssh on a vps) if there is no desktop environment available (since selenium will need to run an instance of the browser which requires a desktop being available)

Is selenium not possible to run on headless server/ server without GUI/ without desktop? I came across some articles that there is a headless option in webdriver.chrome with code like below:

from selenium import webdriver

option = webdriver.ChromeOptions()
option.add_argument('--headless')  # run Chrome without opening a window
browser = webdriver.Chrome(options=option)

I don't have experience with selenium before on server so I have no idea about it...

its possible to run headless - but i think it just "hides the browserwindow" rather than "just run it as a background process" - at least that was my understanding, but I have to confess that i never used it because i somehow always found a way to prevent it

shameful-edit i might be terribly wrong with my assumption: https://intoli.com/blog/running-selenium-with-headless-chrome/

dschaefer commented 1 year ago

I believe yahooquery to be the best currently. There was a large contingent of users of yfinance who did a comparison and many of us deemed yahooquery to be the one to focus investment into.

Quickly tried going back to yfinance, and looks like they're getting 401 errors when trying to pull up even the basic ticker info. Looks like Yahoo has made their move, and now it's counteract time....

Seems like yfinance has solved the problem, no? works for me.

dpguthrie commented 1 year ago

Hey all - I've pushed up some of my local changes so you can begin testing. Here are some steps to follow (at least on Mac):

virtualenv .venv
source .venv/bin/activate
pip install https://github.com/dpguthrie/yahooquery/archive/feat/add-selenium.zip

As an aside, this change also allows you to use the quotes endpoint that YF recently put behind crumb/cookies.

yq.Ticker('aapl').quotes

staper1960 commented 1 year ago

Do you know what the equivalent is of source .venv/bin/activate on Windows command prompt? Thanks

dpguthrie commented 1 year ago

Do you know what the equivalent is of source .venv/bin/activate on Windows command prompt? Thanks

I think it's just:

.venv/Scripts/activate

Not entirely a required step, but just good to isolate into a virtual environment.

KenLee12323 commented 1 year ago

venv/bin/activate

seems to be "venv\Scripts\activate"

kajdo commented 1 year ago

@kajdo So cookie could be cached, but what about the crumb?

just because i tried a couple of things out and stumbled upon following

if you have the cookie - you can simply ask for the crumb via GET https://query1.finance.yahoo.com/v1/test/getcrumb

there is no need for a payload, just ensure that you send the cookie as a header ;)
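in code, with nothing but the standard library, that request could look roughly like this. it is only a sketch: the User-Agent header and the idea that the crumb comes back as the plain response body are my assumptions, and the function names are made up.

```python
import urllib.request

CRUMB_URL = "https://query1.finance.yahoo.com/v1/test/getcrumb"


def build_crumb_request(cookie_header, user_agent="Mozilla/5.0"):
    """Build the GET request for the crumb, sending the cookie as a header."""
    return urllib.request.Request(
        CRUMB_URL,
        headers={"Cookie": cookie_header, "User-Agent": user_agent},
        method="GET",
    )


def fetch_crumb(cookie_header, opener=urllib.request.urlopen, timeout=10):
    """Send the request and return the response body as the crumb (assumed plain text)."""
    with opener(build_crumb_request(cookie_header), timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

cookie_header is the full Cookie header value, for example the name=value pairs copied from the browser and joined with "; ".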