Open KostyaCholak opened 2 years ago
So I've just tested investiny and it seems to be working fine again... I assume their Cloudflare has some limitations but it's not blacklisting every IP forever, just after a certain number of requests...
Look:
(investiny-py3.9) alvarobartt@Alvaros-MacBook-Air investiny % poetry run python
Python 3.9.6 (default, Aug 5 2022, 15:21:02)
[Clang 14.0.0 (clang-1400.0.29.102)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from investiny import historical_data
>>> d = historical_data(investing_id=6408)
>>> d
{'date': ['09/26/2022', '09/27/2022', '09/28/2022', '09/29/2022', '09/30/2022', '10/03/2022', '10/04/2022', '10/05/2022', '10/06/2022', '10/07/2022', '10/10/2022', '10/11/2022', '10/12/2022', '10/13/2022', '10/14/2022', '10/17/2022', '10/18/2022', '10/19/2022', '10/20/2022', '10/21/2022'], 'open': [149.66000366211, 152.74000549316, 147.63999938965, 146.10000610352, 141.2799987793, 138.21000671387, 145.0299987793, 144.07499694824, 145.80999755859, 142.53999328613, 140.41999816895, 139.89999389648, 139.13000488281, 134.99000549316, 144.30999755859, 141.06500244141, 145.49000549316, 141.69000244141, 143.02000427246, 142.96000671387], 'high': [153.7700958252, 154.7200012207, 150.64140319824, 146.7200012207, 143.10000610352, 143.07000732422, 146.2200012207, 147.38000488281, 147.53999328613, 143.10000610352, 141.88999938965, 141.35000610352, 140.36000061035, 143.58999633789, 144.52000427246, 142.89999389648, 146.69999694824, 144.94920349121, 145.88999938965, 147.83999633789], 'low': [149.63999938965, 149.94500732422, 144.83999633789, 140.67999267578, 138, 137.68499755859, 144.25999450684, 143.00999450684, 145.2200012207, 139.44500732422, 138.57290649414, 138.2200012207, 138.16000366211, 134.36999511719, 138.19000244141, 140.27000427246, 140.61000061035, 141.5, 142.64999389648, 142.67999267578], 'close': [150.77000427246, 151.75999450684, 149.83999633789, 142.47999572754, 138.19999694824, 142.44999694824, 146.10000610352, 146.39999389648, 145.42999267578, 140.08999633789, 140.41999816895, 138.97999572754, 138.33999633789, 142.99000549316, 138.38000488281, 142.41000366211, 143.75, 143.86000061035, 143.38999938965, 147.27000427246], 'volume': [93339000, 84443000, 146691008, 128138000, 124925000, 114312000, 87134000, 79148000, 68402000, 85926000, 74591000, 77034000, 69833000, 112876000, 88237000, 84684000, 98716000, 61758000, 64277000, 85641896]}
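The dictionary above is column-oriented: one list per field, all of the same length. A minimal sketch of turning it into one record per trading day, assuming the dates are `MM/DD/YYYY` strings as they appear in the output (`to_rows` is a hypothetical helper, not part of investiny):

```python
from datetime import datetime


def to_rows(data):
    """Turn {'date': [...], 'open': [...], ...} into a list of per-day dicts."""
    keys = list(data)
    rows = []
    # zip(*columns) walks the parallel lists in lockstep, one day at a time.
    for values in zip(*(data[key] for key in keys)):
        row = dict(zip(keys, values))
        # Assumption: dates are formatted as MM/DD/YYYY, as in the output above.
        row["date"] = datetime.strptime(row["date"], "%m/%d/%Y").date()
        rows.append(row)
    return rows


# First two entries copied from the output above (fewer columns for brevity).
sample = {
    "date": ["09/26/2022", "09/27/2022"],
    "open": [149.66000366211, 152.74000549316],
    "close": [150.77000427246, 151.75999450684],
    "volume": [93339000, 84443000],
}

rows = to_rows(sample)
print(rows[0]["date"], rows[0]["close"])
```

The same function should work on the full `d` returned by `historical_data`, as long as every list has the same length.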
So what is the problem with this solution @alvarobartt ? It works for me without any limitations from Cloudflare. Did it stop working for you after some time?
Seems like a reliable solution to me
Not at all, I was about to test it when I realized that investiny was working fine with the current solution!
Is the current implementation not working from your side? Can you run some stress tests over Investing.com to see whether you end up getting HTTP 403 or not? Thanks 😄
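A stress test along these lines could be as simple as repeating the request until the first HTTP 403 shows up. The sketch below is only an illustration: `requests_until_403` and `make_fake_endpoint` are hypothetical names, and the fake endpoint stands in for Investing.com so that the example runs offline.

```python
import time


def requests_until_403(get_status, max_requests=100, delay=0.0):
    """Call get_status() repeatedly.

    Returns the 1-based index of the first call that yields 403,
    or None if max_requests calls all succeed.
    """
    for attempt in range(1, max_requests + 1):
        if get_status() == 403:
            return attempt
        time.sleep(delay)  # be polite to the server between calls
    return None


def make_fake_endpoint(limit):
    """Offline stand-in: answers 200 for the first `limit` calls, then 403."""
    calls = {"n": 0}

    def get_status():
        calls["n"] += 1
        return 200 if calls["n"] <= limit else 403

    return get_status


print(requests_until_403(make_fake_endpoint(limit=5)))
```

Against the real API, `get_status` could be something like `lambda: httpx.get(url, headers=headers).status_code`, with a non-zero `delay` so the test itself does not trigger a block sooner than normal usage would.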
Oops, maybe the invalid label was confusing @KostyaCholak, I meant that it was related to something not valid e.g. the current implementation, not that your solution was not valid 👍🏻 I've updated the label to be more clear!
>>> from investiny import historical_data
>>> d = historical_data(investing_id=6408)
>>> d
ConnectionError: Request to Investing.com API failed with error code: 403.
@alvarobartt, no need for stress tests: I got it on the first attempt.
The only reliable solution for now seems to be the one posted here: https://github.com/alvarobartt/investpy/issues/611#issuecomment-1284571443
So your solution failed too? Or just the default investiny?
That seems to be a solution, yes, but I prefer to wait until I get a response from Investing.com, as I want to approach this the best way possible, but thanks for mentioning it again :smile:
@KostyaCholak see this, launched right now, and working fine:
So maybe you are blocked or something, because it's working fine for me... both using investiny and plain httpx, as shown in the screenshot above.
P.S. I'll be attaching the code in the Jupyter Notebook here so that you can reproduce it!
import httpx

headers = {
    "Content-Type": "application/json",
    "Origin": "https://tvc-invdn-com.investing.com",
    "Host": "tvc4.investing.com",
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.0 Safari/605.1.15",
    "Referer": "https://tvc-invdn-com.investing.com/",
    "Connection": "keep-alive",
}
url = "https://tvc4.investing.com/8f72bb4be70f8f06f5bad539977ee7ce/1666473162/1/1/8/history?symbol=6408&resolution=30&from=1663881169&to=1666473229"
r = httpx.get(url, headers=headers)
print(r)
print(r.json())
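For anyone who wants to vary the symbol or date range when reproducing this, the query string of the URL above can be assembled from its parts. This is a sketch under assumptions: `build_history_url` is a hypothetical helper, the `from` and `to` values are assumed to be Unix timestamps in seconds (the two in the snippet are about 30 days apart), and the path segment before `/history` is treated as an opaque value copied from the snippet, since its meaning is not explained in this thread.

```python
from urllib.parse import urlencode


def build_history_url(host, path_prefix, symbol, resolution, start, end):
    """Assemble a history URL shaped like the one in the snippet above."""
    # urlencode keeps the keys in the order given here.
    query = urlencode(
        {"symbol": symbol, "resolution": resolution, "from": start, "to": end}
    )
    return f"https://{host}/{path_prefix}/history?{query}"


url = build_history_url(
    host="tvc4.investing.com",
    # Opaque path copied verbatim from the snippet above.
    path_prefix="8f72bb4be70f8f06f5bad539977ee7ce/1666473162/1/1/8",
    symbol=6408,
    resolution=30,
    start=1663881169,  # assumed: Unix timestamp in seconds
    end=1666473229,    # assumed: Unix timestamp in seconds
)
print(url)
```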
And the same thing happens if I run investiny's unit tests with poetry run make tests
Oops, maybe the invalid label was confusing @KostyaCholak, I meant that it was related to something not valid e.g. the current implementation, not that your solution was not valid 👍🏻 I've updated the label to be more clear!
yes, thanks)
Just tried the latest version and the result is the same. Should I try the master branch? But the urllib version works somehow.
Also tried httpx, got 403
@KostyaCholak okay, let me stress test it until I also get HTTP 403 so that I can reproduce it, then I'll tell you! Also, could you paste the headers you're using for the request here, or send them to me via email? Copy-pasting them from the browser won't work if that cannot be automated :weary:
headers = {
    'authority': 'sbcharts.investing.com',
    'accept': 'application/json, text/javascript, */*; q=0.01',
    'cache-control': 'no-cache',
    'origin': 'https://www.investing.com',
    'pragma': 'no-cache',
    'referer': 'https://www.investing.com/',
    'sec-ch-ua': '"Google Chrome";v="107", "Chromium";v="107", "Not=A?Brand";v="24"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"macOS"',
    'sec-fetch-dest': 'empty',
    'sec-fetch-mode': 'cors',
    'sec-fetch-site': 'same-site',
    'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36',
}
Hi, @alvarobartt! I ran into the 403 error today and found that using curl seems to work fine, with no 403 error. The only difference I can see is the header ordering: requests shuffles headers, while curl preserves them as provided. So I tried using urllib.request and it worked.
I'm using Python 3.10.5
Maybe this can solve all 403 errors in the project?
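To make the header-ordering idea concrete, here is a rough sketch of building a `urllib.request.Request` with headers added in a chosen order and inspecting what the object holds. It is not the example from this post: `build_request` and `fetch` are hypothetical helpers, the URL is a placeholder, and whether header order is really what triggers the 403 remains an assumption. Note also that this only shows the order stored on the `Request` object; `urllib` normalizes header names and adds headers of its own when sending, so the order on the wire may differ.

```python
import urllib.request


def build_request(url, headers):
    """Create a Request and add headers one by one, in the order given."""
    req = urllib.request.Request(url)
    for name, value in headers.items():  # dicts keep insertion order
        req.add_header(name, value)
    return req


def fetch(req, timeout=10):
    """Send the request; urlopen raises urllib.error.HTTPError on a 403."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.read()


# Placeholder URL and a reduced header set, for illustration only.
req = build_request(
    "https://example.com/history",
    {
        "User-Agent": "Mozilla/5.0",
        "Referer": "https://www.investing.com/",
        "Accept": "application/json",
    },
)

# Header names are stored capitalized, e.g. "User-agent".
print(req.header_items())
```

Calling `fetch(req)` with the real URL and headers would perform the actual request; it is left out here so the sketch runs without network access.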
minimal working example: