alvarobartt / investiny

🤏🏻 `investpy` but made tiny
https://alvarobartt.github.io/investiny
MIT License
280 stars, 32 forks

Possible way to fix 403 Error #62

Open KostyaCholak opened 1 year ago

KostyaCholak commented 1 year ago

Hi, @alvarobartt! I ran into the 403 Error problem today and found that curl seems to work fine, with no 403 error. The only difference I can see is the header ordering: requests seems to reorder the headers, while curl sends them in the order provided. So I tried urllib.request and it worked.

I'm using Python 3.10.5

Maybe this can solve all 403 errors in the project?

Minimal working example:

import urllib.request

# Copy the request headers from your browser's dev tools; no cookies required.
headers = {}

url = "https://sbcharts.investing.com/events_charts/us/222.json"
# Passing a body (even an empty one) makes urllib send this as a POST request.
req = urllib.request.Request(url, data=b"", headers=headers)
with urllib.request.urlopen(req) as response:
    body = response.read().decode()
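
The header-ordering hypothesis can be checked locally without touching Investing.com. The sketch below is only an illustration, not part of investiny: it starts a throwaway HTTP server on localhost, sends one request through `urllib.request`, and returns the header names in the order the server received them. The helper name `header_order_seen_by_server` and the sample headers are assumptions made up for this example.

```python
import http.server
import threading
import urllib.request


def header_order_seen_by_server(headers):
    """Send one GET to a local test server and return the header names it received, in order."""
    seen = []

    class RecordingHandler(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            # self.headers keeps the order in which the headers arrived on the wire.
            seen.append(self.headers.keys())
            body = b"{}"
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the console quiet

    # Port 0 lets the OS pick a free port.
    server = http.server.HTTPServer(("127.0.0.1", 0), RecordingHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/"
        req = urllib.request.Request(url, headers=headers)
        # An empty ProxyHandler ignores any proxy configured in the environment.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(req, timeout=5) as response:
            response.read()
    finally:
        server.shutdown()
        server.server_close()
    return seen[0]


if __name__ == "__main__":
    sample = {
        "Accept": "application/json",
        "Referer": "https://www.investing.com/",
        "Origin": "https://www.investing.com",
    }
    print(header_order_seen_by_server(sample))
```

Note that urllib adds a few headers of its own (such as `Host` and `Connection`), so the list will contain more names than were passed in. Swapping the client call for another library would show whether that library sends the same headers in a different order.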
RyuuOujiXS commented 1 year ago

56

alvarobartt commented 1 year ago

So I've just tested investiny and it seems to be working fine again... I assume their Cloudflare has some limitations but it's not blacklisting every IP forever, just after a certain number of requests...

Look:

(investiny-py3.9) alvarobartt@Alvaros-MacBook-Air investiny % poetry run python
Python 3.9.6 (default, Aug  5 2022, 15:21:02) 
[Clang 14.0.0 (clang-1400.0.29.102)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from investiny import historical_data
>>> d = historical_data(investing_id=6408)
>>> d
{'date': ['09/26/2022', '09/27/2022', '09/28/2022', '09/29/2022', '09/30/2022', '10/03/2022', '10/04/2022', '10/05/2022', '10/06/2022', '10/07/2022', '10/10/2022', '10/11/2022', '10/12/2022', '10/13/2022', '10/14/2022', '10/17/2022', '10/18/2022', '10/19/2022', '10/20/2022', '10/21/2022'], 'open': [149.66000366211, 152.74000549316, 147.63999938965, 146.10000610352, 141.2799987793, 138.21000671387, 145.0299987793, 144.07499694824, 145.80999755859, 142.53999328613, 140.41999816895, 139.89999389648, 139.13000488281, 134.99000549316, 144.30999755859, 141.06500244141, 145.49000549316, 141.69000244141, 143.02000427246, 142.96000671387], 'high': [153.7700958252, 154.7200012207, 150.64140319824, 146.7200012207, 143.10000610352, 143.07000732422, 146.2200012207, 147.38000488281, 147.53999328613, 143.10000610352, 141.88999938965, 141.35000610352, 140.36000061035, 143.58999633789, 144.52000427246, 142.89999389648, 146.69999694824, 144.94920349121, 145.88999938965, 147.83999633789], 'low': [149.63999938965, 149.94500732422, 144.83999633789, 140.67999267578, 138, 137.68499755859, 144.25999450684, 143.00999450684, 145.2200012207, 139.44500732422, 138.57290649414, 138.2200012207, 138.16000366211, 134.36999511719, 138.19000244141, 140.27000427246, 140.61000061035, 141.5, 142.64999389648, 142.67999267578], 'close': [150.77000427246, 151.75999450684, 149.83999633789, 142.47999572754, 138.19999694824, 142.44999694824, 146.10000610352, 146.39999389648, 145.42999267578, 140.08999633789, 140.41999816895, 138.97999572754, 138.33999633789, 142.99000549316, 138.38000488281, 142.41000366211, 143.75, 143.86000061035, 143.38999938965, 147.27000427246], 'volume': [93339000, 84443000, 146691008, 128138000, 124925000, 114312000, 87134000, 79148000, 68402000, 85926000, 74591000, 77034000, 69833000, 112876000, 88237000, 84684000, 98716000, 61758000, 64277000, 85641896]}
KostyaCholak commented 1 year ago

So what is the problem with this solution @alvarobartt ? It works for me without any limitations from Cloudflare. Did it stop working for you after some time?

Seems like a reliable solution to me

alvarobartt commented 1 year ago

> So what is the problem with this solution @alvarobartt ? It works for me without any limitations from Cloudflare. Did it stop working for you after some time?
>
> Seems like a reliable solution to me

Not at all, I was about to test it when I realized that investiny was working fine with the current solution!

Is the current implementation not working on your side? Can you run some stress tests against Investing.com to see whether you end up getting HTTP 403 or not? Thanks 😄
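
A stress test along these lines could be as simple as the sketch below. It is not investiny code: `stress_test` and its parameters are hypothetical names for this example, and `fetch` is any zero-argument callable that performs one request and returns the HTTP status code.

```python
import time
from collections import Counter
from typing import Callable


def stress_test(fetch: Callable[[], int], attempts: int = 50, delay: float = 0.5) -> Counter:
    """Call fetch() up to `attempts` times and tally the status codes it returns.

    Stops as soon as the first 403 shows up, since that is the case to reproduce.
    """
    statuses = Counter()
    for _ in range(attempts):
        statuses[fetch()] += 1
        if statuses[403]:
            break
        time.sleep(delay)  # pause between requests
    return statuses
```

With httpx, for instance, `fetch` could be `lambda: httpx.get(url, headers=headers).status_code`. The returned counter then shows how many requests succeeded before the first 403, if one occurred at all.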

alvarobartt commented 1 year ago

Oops, maybe the invalid label was confusing @KostyaCholak, I meant that it was related to something not valid e.g. the current implementation, not that your solution was not valid 👍🏻 I've updated the label to be more clear!

gbonariva commented 1 year ago
 >>> from investiny import historical_data
 >>> d = historical_data(investing_id=6408)
 >>> d

ConnectionError: Request to Investing.com API failed with error code: 403.

@alvarobartt, no need for stress tests: I got it on the first attempt.

The only reliable solution for now seems to be the one posted here: https://github.com/alvarobartt/investpy/issues/611#issuecomment-1284571443

alvarobartt commented 1 year ago

So your solution failed too? Or just the default investiny?

That seems to be a solution, yes, but I prefer to wait until I get a response from Investing.com, as I want to approach this the best way possible, but thanks for mentioning it again 😄

alvarobartt commented 1 year ago

@KostyaCholak see this, launched right now, and working fine:

Screenshot 2022-10-24 at 20 26 13

So maybe you are blocked or something, because it's working fine for me... both using investiny and plain httpx as shown in the screenshot above.

P.S. I'll be attaching the code in the Jupyter Notebook here so that you can reproduce it!

import httpx

headers = {
    "Content-Type": "application/json",
    "Origin": "https://tvc-invdn-com.investing.com",
    "Host": "tvc4.investing.com",
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.0 Safari/605.1.15",
    "Referer": "https://tvc-invdn-com.investing.com/",
    "Connection": "keep-alive",
}

url = "https://tvc4.investing.com/8f72bb4be70f8f06f5bad539977ee7ce/1666473162/1/1/8/history?symbol=6408&resolution=30&from=1663881169&to=1666473229"

r = httpx.get(url, headers=headers)
print(r)
print(r.json())
alvarobartt commented 1 year ago

And the same thing happens if I run investiny's unit tests with `poetry run make tests`

image
KostyaCholak commented 1 year ago

> Oops, maybe the invalid label was confusing @KostyaCholak, I meant that it was related to something not valid e.g. the current implementation, not that your solution was not valid 👍🏻 I've updated the label to be more clear!

yes, thanks)

KostyaCholak commented 1 year ago

Just tried the latest version and the result is the same. Should I try the master branch? The urllib version still works somehow, though.

Screenshot 2022-11-08 at 21 19 29 Screenshot 2022-11-08 at 21 21 59
KostyaCholak commented 1 year ago

Also tried httpx, got 403

Screenshot 2022-11-08 at 23 48 02
alvarobartt commented 1 year ago

@KostyaCholak okay, let me stress test it until I also get HTTP 403 and can reproduce it, then I'll let you know! Also, could you paste the headers you're using for the request here, or send them to me via email? Copy-pasting them from the browser isn't a real fix if it can't be automated 😩

KostyaCholak commented 1 year ago
headers = {
    'authority': 'sbcharts.investing.com',
    'accept': 'application/json, text/javascript, */*; q=0.01',
    'cache-control': 'no-cache',
    'origin': 'https://www.investing.com',
    'pragma': 'no-cache',
    'referer': 'https://www.investing.com/',
    'sec-ch-ua': '"Google Chrome";v="107", "Chromium";v="107", "Not=A?Brand";v="24"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"macOS"',
    'sec-fetch-dest': 'empty',
    'sec-fetch-mode': 'cors',
    'sec-fetch-site': 'same-site',
    'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36',
}
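
For anyone trying to reproduce this, here is a minimal sketch that plugs headers like the ones above into the `urllib.request` approach from the first comment. The helper names `build_request` and `fetch_text` are made up for this example, and `BROWSER_HEADERS` is only a subset of the headers posted above. Whether Investing.com answers with 200 or 403 is not guaranteed.

```python
import urllib.request

# Subset of the browser headers posted above; values copied from this thread.
BROWSER_HEADERS = {
    "accept": "application/json, text/javascript, */*; q=0.01",
    "origin": "https://www.investing.com",
    "referer": "https://www.investing.com/",
    "user-agent": (
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36"
    ),
}


def build_request(url, headers):
    """Build the request without sending it, so it can be inspected first."""
    # data=b"" mirrors the example in the first comment; any non-None body
    # makes urllib send the request as a POST.
    return urllib.request.Request(url, data=b"", headers=headers)


def fetch_text(url, headers, timeout=10.0):
    """Send the request and return the decoded body; raises HTTPError on 403."""
    with urllib.request.urlopen(build_request(url, headers), timeout=timeout) as response:
        return response.read().decode()
```

Calling `fetch_text("https://sbcharts.investing.com/events_charts/us/222.json", BROWSER_HEADERS)` would then perform the actual request. `build_request` is kept separate so the method and header order can be checked offline first.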