JECSand / yahoofinancials

A powerful financial data module used for pulling data from Yahoo Finance. This module can pull fundamental and technical data for stocks, indexes, currencies, cryptos, ETFs, Mutual Funds, U.S. Treasuries, and commodity futures.
https://pypi.python.org/pypi/yahoofinancials
MIT License
896 stars 214 forks

sessions.py _get_crumb does not get the crumb, it's empty #181

Open bjosun opened 5 months ago

bjosun commented 5 months ago

There seems to be a problem between requests and Yahoo Finance: after `response = session.get('https://query2.finance.yahoo.com/v1/test/getcrumb')`, `response.text` is empty. The same URL works in a browser, where it returns: `<html><head><meta name="color-scheme" content="light dark"></head><body><pre style="word-wrap: break-word; white-space: pre-wrap;">CRUMB</pre></body></html>`

Is this a user agent issue? I've been trying different settings without luck.
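
For context on the two results above: the HTML shown by the browser is most likely just how Chromium-based browsers display a `text/plain` body, so a script should expect the bare crumb string rather than markup. A minimal sketch, assuming that wrapper shape, of a helper that normalises either form and flags an empty body. `extract_crumb` is a hypothetical name and not part of yahoofinancials:

```python
import html
import re
from typing import Optional

# Matches the <pre>...</pre> wrapper a browser adds when rendering text/plain.
_PRE_RE = re.compile(r"<pre[^>]*>(.*?)</pre>", re.DOTALL | re.IGNORECASE)


def extract_crumb(body: str) -> Optional[str]:
    """Return the crumb from a raw or browser-wrapped body, or None if unusable."""
    match = _PRE_RE.search(body)
    text = html.unescape(match.group(1)) if match else body
    text = text.strip()
    # An empty body (the symptom in this issue) or leftover markup is not a crumb.
    if not text or "<" in text:
        return None
    return text
```

Calling `extract_crumb(response.text)` on the empty response described above would return `None`, which makes the failure explicit instead of silently passing an empty crumb along.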

JECSand commented 5 months ago

@bjosun I’ll check this out over the weekend and see.

bjosun commented 4 months ago

I conducted some testing and encountered an issue with obtaining the cookie from Yahoo using requests. However, I managed to circumvent this issue by manually obtaining the cookie from Chrome and applying it to the request header. This approach worked successfully. Below is the tested code:



```python
import requests

url = "https://query2.finance.yahoo.com/v1/test/getcrumb"

# Headers copied from a browser session; the Cookie header is what makes the
# crumb request succeed.
headers = {
    "Host": "query2.finance.yahoo.com",
    "Cache-Control": "max-age=0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "sv-SE,sv;q=0.7",
    "Cookie": "GUC=AQABCAFloqxl1UIkCwVE&s=AQAAAGO_IrIz&g=ZaFhGQ; A1=d=AQABBAI372QCEKA6MVbUmfHtftUmBPe-bxsFEgABCAGsomXVZfW6b2UB9qMAAAcI_jbvZLbJfag&S=AQAAAhutXgi1W3Zd_OwdQ1bNHAk; A3=d=AQABBAI372QCEKA6MVbUmfHtftUmBPe-bxsFEgABCAGsomXVZfW6b2UB9qMAAAcI_jbvZLbJfag&S=AQAAAhutXgi1W3Zd_OwdQ1bNHAk; A1S=d=AQABBAI372QCEKA6MVbUmfHtftUmBPe-bxsFEgABCAGsomXVZfW6b2UB9qMAAAcI_jbvZLbJfag&S=AQAAAhutXgi1W3Zd_OwdQ1bNHAk; cmp=t=1707377190&j=1&u=1---&v=103; EuConsent=CPxUUEAPxUUEAAOACBSVDfCoAP_AAEfAACiQgoQqoAAgAEAASABQAHAAQgAoACsAFwAZgA2ADgAHoAQABCACSAE4AUAAqgBYAF0AMQAygBoAGsAOAA6gB4AHwAQoAiACOAEmAJgAowBUAFWALYAvwBhAGKAMoAzABogDaAN8AcgBzADwAHoAP0AgACEAEMAIoARgAjgBKACXgE0ATsAowCkgFaAV0AuAC5AGGAMqAaQBqQDiAOSAc4B0ADuAHiAPYAfAA_YCDgIRARABEQCKAEWgIwAjMBHAEdgJKAk0BKQEqAJaATAAmkBNwE4AJ2AT8AooBTQCngFZgK8Ar4BaQC6wF8AX0AwIBhADFAGNgM4AzsBnwGgANFAaYBpwDXgGyANoAbwA4gBzoDqAOqAdsA9AB6gD9AH8AP-AgwBCQCHQEQAImARrAjwCPQEnAJVAToAn8BXwCwwFlALMAWtAtgC2oFugW8AuYBdAC7QF5gL2gYABgIDBAGEAMUgYsBi4DGQGPgMkAZUAywBl4DNAGdgM-gaABoIDTQGtANtAcAA4UBxYDjwHKAOaAdCA6gB2wDzAHuAPfAfOA_cB_YEBQIDgQZAiwBGQCMwEbwI7AR6Ak0BKGCVAJUgSrgleCWUEtAS1AlxBLwEwAJhBBQBMEAEg1KiAJsCAkJhAwigRAiCgIAKBAAAAAQIAAACYIChAGASowGQAgRAAEAAAAABAQAIAAAIAEIAAgCCBAAAAABAAAABAIAAAQAAAAAAAAAAAAAAAAAAAAAACAAhACEEAAIAAIACCgAAgAEAAAAAAAAgBEIAAAAAAAAAAAAAAABAAAAAAAAAAAAAAAAAAABAgAAAAAAAMCAgsAMNABgACIKAiADAAEQUBUAGAAIgoAA",
    "Referer": "https://google.com",
    "Sec-Ch-Ua": "\"Not A(Brand\";v=\"99\", \"Brave\";v=\"121\", \"Chromium\";v=\"121\"",
    "Sec-Ch-Ua-Mobile": "?0",
    "Sec-Ch-Ua-Platform": "\"macOS\"",
    "Sec-Fetch-Dest": "document",
    "Sec-Fetch-Mode": "navigate",
    "Sec-Fetch-Site": "cross-site",
    "Sec-Fetch-User": "?1",
    "Sec-Gpc": "1",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36"
}

# First request: print the body, which should be the crumb.
response = requests.get(url, headers=headers)
print(response.text)

# Second request: inspect the headers that were sent and received.
response = requests.get(url, headers=headers, stream=True)

print("Request Headers:")
print(response.request.headers)

print("\nResponse Headers:")
print(response.headers)

print("\nStatus Code:", response.status_code)
```

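
The same browser cookie can also be loaded into a `requests.Session` cookie jar instead of being sent as a raw `Cookie` header, so that every later request on that session reuses it. A minimal sketch, with `parse_cookie_header` as a hypothetical helper and a shortened placeholder cookie string rather than a real one:

```python
from typing import Dict


def parse_cookie_header(raw: str) -> Dict[str, str]:
    """Split a browser 'Cookie' header value into a name -> value mapping."""
    cookies: Dict[str, str] = {}
    for pair in raw.split(";"):
        # Only the first '=' separates name from value; values may contain more.
        name, sep, value = pair.strip().partition("=")
        # Skip empty fragments and anything without a name=value shape.
        if sep and name:
            cookies[name] = value
    return cookies


# Placeholder values; a real string would be copied from the browser's dev tools.
example = parse_cookie_header("A1=d=AQAB&S=AQAA; A3=d=AQAB&S=AQAA; cmp=t=1&j=1")
print(sorted(example))
```

With requests installed, the mapping can be applied with `session.cookies.update(parse_cookie_header(raw))` before calling `session.get(url)`.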
bjosun commented 4 months ago

@JECSand Have you made any progress with it?

datatalking commented 4 months ago

@JECSand @bjosun I ran the code provided and got a 200 response, which is good. So, if I understand your question correctly, it seems to be working.

`Response Headers: {'content-type': 'text/plain;charset=utf-8', 'cache-control': 'private', 'x-frame-options': 'SAMEORIGIN', 'x-envoy-upstream-service-time': '2', 'date': 'Tue, 27 Feb 2024 10:54:06 GMT', 'server': 'ATS', 'x-envoy-decorator-operation': 'finance-yql--mtls-default-production-gq1.finance-k8s.svc.yahoo.local:4080/*', 'Age': '0', 'Strict-Transport-Security': 'max-age=31536000', 'Referrer-Policy': 'no-referrer-when-downgrade', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'Expect-CT': 'max-age=31536000, report-uri="http://csp.yahoo.com/beacon/csp?src=yahoocom-expect-ct-report-only"', 'X-XSS-Protection': '1; mode=block', 'X-Content-Type-Options': 'nosniff'}

Status Code: 200`

bjosun commented 4 months ago

Correct, the snippet works. But I can't get yahoofinancials to obtain the Yahoo Finance cookie, which is what makes it possible to get the crumb. `get_key_statistics_data()` in yahoofinancials only works when I supply a cookie taken from a browser.
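
To make that dependency concrete: the crumb request appears to succeed only when the session already holds a Yahoo cookie, so the order is cookie first, crumb second. A minimal sketch of that order, which is not the yahoofinancials implementation. The cookie-priming URL `https://fc.yahoo.com` is an assumption borrowed from workarounds reported elsewhere and is not confirmed in this thread, and `fetch_crumb` is a hypothetical name. The session is passed in so that a `requests.Session` or a test double can be used:

```python
from typing import Any, Optional

COOKIE_URL = "https://fc.yahoo.com"  # assumption: a Yahoo URL that sets cookies
CRUMB_URL = "https://query2.finance.yahoo.com/v1/test/getcrumb"  # from this issue


def fetch_crumb(session: Any, user_agent: str) -> Optional[str]:
    """Prime the session with cookies, then request the crumb.

    `session` is expected to behave like requests.Session: it must offer
    get(url, headers=..., timeout=...) and keep cookies between calls.
    Returns None when the crumb body is empty.
    """
    headers = {"User-Agent": user_agent}
    # Step 1: the response body is irrelevant; only the cookies it sets matter.
    session.get(COOKIE_URL, headers=headers, timeout=10)
    # Step 2: with cookies in the jar, the crumb endpoint should return plain text.
    response = session.get(CRUMB_URL, headers=headers, timeout=10)
    crumb = response.text.strip()
    return crumb or None
```

With requests installed this would be called as `fetch_crumb(requests.Session(), user_agent)`; a `None` result reproduces the empty-crumb symptom reported here. The `EuConsent` cookie in the snippet above suggests a consent step may also be involved for EU users, which this sketch does not handle.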

datatalking commented 4 months ago

@bjosun Could you search for this or ask ChatGPT to suggest alternative routes? I don't know whether changing the response handling will work, but I hope you figure it out. Let us know.