alvarobartt / investpy

Financial Data Extraction from Investing.com with Python
https://investpy.readthedocs.io/
MIT License
1.59k stars 374 forks

ConnectionError: ERR#0015: error 403, try again later. #600

Open divyankm opened 1 year ago

divyankm commented 1 year ago

Code-

import investpy

df = investpy.get_stock_historical_data(stock='AAPL',
                                        country='United States',
                                        from_date='01/01/2010',
                                        to_date='01/01/2020')
print(df.head())

Error-

ConnectionError                           Traceback (most recent call last)
<ipython-input-4-f6f4235b7e47> in <module>
      4                                         country='United States',
      5                                         from_date='01/01/2010',
----> 6                                         to_date='01/01/2020')
      7 print(df.head())

/usr/local/lib/python3.7/dist-packages/investpy/stocks.py in get_stock_historical_data(stock, country, from_date, to_date, as_json, order, interval)
    663         if req.status_code != 200:
    664             raise ConnectionError(
--> 665                 "ERR#0015: error " + str(req.status_code) + ", try again later."
    666             )
    667 

ConnectionError: ERR#0015: error 403, try again later.

mobinzk commented 1 year ago

I've the same issue, from my investigation https://www.investing.com/instruments/HistoricalDataAjax seems to be discontinued!

PatiPimenta commented 1 year ago

I've the same issue.

campernate commented 1 year ago

Same issue, any update on this?

Sebara0 commented 1 year ago

I have the same issue:

accion = 'AAPL'
search_result = investpy.search_quotes(text=accion, products=['stocks'], countries=['united states'], n_results=1)

File "D:\Python39\lib\site-packages\investpy\search.py", line 127, in search_quotes
    raise ConnectionError(f"ERR#0015: error {req.status_code}, try again later.")
ConnectionError: ERR#0015: error 403, try again later.

lstavares84 commented 1 year ago

Same here.

/usr/local/lib/python3.7/dist-packages/investpy/stocks.py in get_stock_historical_data(stock, country, from_date, to_date, as_json, order, interval)
    663         if req.status_code != 200:
    664             raise ConnectionError(
--> 665                 "ERR#0015: error " + str(req.status_code) + ", try again later."
    666             )
    667

ConnectionError: ERR#0015: error 403, try again later.

cypresswang commented 1 year ago

Same here. Did some research and it seems error 403 indicates that the server understands the request but refuses to authorize it.
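As a small illustration of the point above, the standard library can name the status code, and the check that raises the error can be mirrored from the tracebacks in this thread. This is only a sketch: `describe_status` and `check_status` are hypothetical helper names, not part of investpy.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return the status code with its standard reason phrase."""
    return f"{code} {HTTPStatus(code).phrase}"


def check_status(status_code: int) -> None:
    # Mirrors the check visible in the tracebacks above:
    # any status other than 200 raises ConnectionError.
    if status_code != 200:
        raise ConnectionError(
            "ERR#0015: error " + str(status_code) + ", try again later."
        )


print(describe_status(403))  # 403 Forbidden
```

Because the library raises the same ERR#0015 message for every non-200 status, the number inside the message is the only hint about what the server actually answered.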

nicklatin commented 1 year ago

You can still pull the economic calendar but everything else seems to be returning a 403.

drnarc commented 1 year ago

Also confirming get_index_historical_data yields "ConnectionError: ERR#0015: error 403, try again later" (tried multiple US symbols) but econ calendar still works. Was working yesterday

alvarobartt commented 1 year ago

Hi everyone! I've checked this issue and it seems that the internal API that Investing.com uses has changed without prior notice, as this is not an official implementation. Sorry for the inconvenience. This issue will be solved in the coming days; hopefully I'll push a patch today. Later on I'll share the branch where I'm actively working on it!

alvarobartt commented 1 year ago

Both get_stock_recent_data and get_stock_historical_data are fixed in https://github.com/alvarobartt/investpy/tree/403-patch, I'll add the comments of the fix in an upcoming PR! In the meantime, you can install the latest investpy version from 403-patch branch as pip install git+https://github.com/alvarobartt/investpy@403-patch

alvarobartt commented 1 year ago

Hi again everyone, feel free to track the progress of the patch at https://github.com/alvarobartt/investpy/pull/602 :hugs: I'd also appreciate some feedback from the ones testing it! So drop your feedback either here or in https://twitter.com/alvarobartt/status/1570661023262310402

Development-Platforms commented 1 year ago

Commodity historical data not working

investpy.get_commodity_historical_data

ConnectionError: ERR#0015: error 403, try again later.

divyankm commented 1 year ago

Hi @alvarobartt, get_historical_data is working. get_index_historical_data is not working.

Code-

df = investpy.get_index_historical_data(index="S&P 500",country="United States",from_date="01/01/2010",to_date="01/01/2022")
df

Error-

---------------------------------------------------------------------------
ConnectionError                           Traceback (most recent call last)
<ipython-input-31-5e1d4758da05> in <module>
----> 1 df = investpy.get_index_historical_data(index="S&P 500",country="United States",from_date="01/01/2010",to_date="01/01/2022")
      2 df

/usr/local/lib/python3.7/dist-packages/investpy/indices.py in get_index_historical_data(index, country, from_date, to_date, as_json, order, interval)
    648         if req.status_code != 200:
    649             raise ConnectionError(
--> 650                 "ERR#0015: error " + str(req.status_code) + ", try again later."
    651             )
    652 

ConnectionError: ERR#0015: error 403, try again later.

Really appreciate your efforts. Thank you so much.

MetalComm commented 1 year ago

Both get_stock_recent_data and get_stock_historical_data are fixed in https://github.com/alvarobartt/investpy/tree/403-patch, I'll add the comments of the fix in an upcoming PR! In the meantime, you can install the latest investpy version from 403-patch branch as pip install git+https://github.com/alvarobartt/investpy@403-patch

Could anyone help me on how to install the patch? Python gives me an error. Thank you!

divyankm commented 1 year ago

Could anyone help me on how to install the patch? Python gives me an error. Thank you!

@MetalComm Which IDE are you using? For Colab, use:

!pip install git+https://github.com/alvarobartt/investpy@403-patch

For other environments, check your pip version or try using pip3 for Python 3+.

MetalComm commented 1 year ago

pip3

Thank you, but... is the URL correct? Because I get this error: Could not install requirement https://github.com/alvarobartt/investpy@403-patch because of HTTP error 404 Client Error: Not Found for url: https://github.com/alvarobartt/investpy@403-patch for URL https://github.com/alvarobartt/investpy@403-patch
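The URL in the error message lacks the `git+` prefix, which suggests (this is an inference from the message, not something confirmed in the thread) that pip tried to download the address as a plain file instead of cloning the branch. A minimal sketch of composing the requirement string, where `vcs_requirement` is a hypothetical helper:

```python
def vcs_requirement(repo_url: str, ref: str, vcs: str = "git") -> str:
    """Build a pip VCS requirement such as git+https://host/user/repo@branch."""
    prefix = vcs + "+"
    # Without the "git+" prefix pip treats the URL as a direct download link.
    if not repo_url.startswith(prefix):
        repo_url = prefix + repo_url
    return f"{repo_url}@{ref}"


print("pip install " + vcs_requirement(
    "https://github.com/alvarobartt/investpy", "403-patch"
))
```

The printed command matches the one posted by the maintainer earlier in the thread.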

srinivasakumar-a commented 1 year ago

Both get_stock_recent_data and get_stock_historical_data are fixed in https://github.com/alvarobartt/investpy/tree/403-patch, I'll add the comments of the fix in an upcoming PR! In the meantime, you can install the latest investpy version from 403-patch branch as pip install git+https://github.com/alvarobartt/investpy@403-patch

I installed the patch and tried to run my programs, but I'm still getting the same ConnectionError: ERR#0015: error 403, try again later.

BTW, I'm using the below:

investpy.get_index_historical_data()
investpy.search_quotes()
investpy.get_stock_recent_data()

alvarobartt commented 1 year ago

Soooo it seems that it was working about an hour ago and has now suddenly stopped working again... So I'll keep investigating... It works from the browser and also from Postman, Thunder Client, and similar, but from Python it doesn't seem to work now...

Exganza commented 1 year ago

The indices method still doesn't work:

df = investpy.indices.get_index_historical_data

alvarobartt commented 1 year ago

Hi @Exganza, the fix is still pending: stocks also stopped working in the 403-patch branch for no apparent reason (it already uses the new Investing.com API). I got it fixed, but then it stopped working again, so I'm actively checking it! I'll let you all know whenever I have more updates. Sorry for the inconvenience!

alvarobartt commented 1 year ago

It seems that after a certain number of requests Cloudflare blocks you... So it's not stable...
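If the block really is tied to request volume, one defensive pattern on the caller's side is to back off between attempts. The thread does not establish that waiting lifts the block, so treat this purely as a sketch; `call_with_backoff` and its parameters are hypothetical names, and `fetch` stands for any zero-argument callable that wraps an investpy call.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fetch: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), retrying with exponential backoff on 403 ConnectionErrors."""
    for attempt in range(retries + 1):
        try:
            return fetch()
        except ConnectionError as exc:
            # Re-raise anything that is not a 403, or when retries are exhausted.
            if "403" not in str(exc) or attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("unreachable")
```

A call could then look like `call_with_backoff(lambda: investpy.get_stock_recent_data(stock='AAPL', country='United States'))`. The `sleep` argument is injectable so the delays can be checked without actually waiting.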

Exganza commented 1 year ago

Hi @Exganza so the fix is pending, since stocks stopped working too in the 403-patch branch for no reason (already using the new Investing.com API), so I got to fix it but then stopped working, I'm actively checking it! I'll let you all know whenever I have more updates, sorry for the inconvenience!

thank you @alvarobartt

nhlsm commented 1 year ago

I found an incomplete solution using the cloudscraper module (NOTE: incomplete solution!). This module bypasses Cloudflare.

1. [OK] https://www.investing.com

url = 'https://www.investing.com'

###################################
import requests

req = requests.get(url)
print(req) # <Response [403]>

###################################
import cloudscraper

scraper = cloudscraper.create_scraper() # returns a CloudScraper instance

ret = scraper.get(url)
print(ret) # <Response [200]>

2. [NOT OK] https://api.investing.com/api/financialdata/historical/43365?start-date=2022-08-19&end-date=2022-09-17&time-frame=Daily&add-missing-rows=false

url = 'https://api.investing.com/api/financialdata/historical/43365?start-date=2022-08-19&end-date=2022-09-17&time-frame=Daily&add-missing-rows=false'

###################################
import requests

req = requests.get(url)
print(req) # <Response [403]>

###################################
import cloudscraper

scraper = cloudscraper.create_scraper() # returns a CloudScraper instance

ret = scraper.get(url)
print(ret) # exception. "Cloudflare version 2 Captcha challenge"
'''
Traceback (most recent call last):
cloudscraper.exceptions.CloudflareChallengeError: Detected a Cloudflare version 2 Captcha challenge, This feature is not available in the opensource (free) version.
'''

Additional reference:
https://splunktool.com/cloudscraperexceptionscloudflarechallengeerror-detected-a-cloudflare-version-2-challenge-error-when-i-used-cloudscraper-module-with-python

younggotti commented 1 year ago

I'm a newbie so I'm probably missing something, but scraping from the following URL still seems to work fine (I don't know how long it will last): https://advcharts.investing.com/advinion2016/advanced-charts/9/9/16/GetRecentHistory?strSymbol=46891&iTop=1500&strPriceType=bid&strFieldsMode=allFields&lang_ID=9&strTimeFrame=1D

Exganza commented 1 year ago

I'm a newbie so I'm probably missing something but scraping from the following url seems to work still fine (I don't know how long it will last) https://advcharts.investing.com/advinion2016/advanced-charts/9/9/16/GetRecentHistory?strSymbol=46891&iTop=1500&strPriceType=bid&strFieldsMode=allFields&lang_ID=9&strTimeFrame=1D

Nice @younggotti :D they forgot to bury this one; it can be a temporary solution.

Sebara0 commented 1 year ago

We need ID and description tables to match each symbol, because the search_quotes routine doesn't work yet.

Exganza commented 1 year ago

It's all available in the investpy resources folder -> Lib\site-packages\investpy\resources (see the ID column in stocks.csv).
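A lookup against that file could be sketched as follows. The column names (`country`, `symbol`, `id`, `tag`) are assumptions pieced together from this thread, and the rows below are made-up placeholders rather than real Investing.com ids; `find_stock_id` is a hypothetical helper.

```python
import csv
import io
from typing import Optional

# Placeholder excerpt standing in for investpy's resources/stocks.csv.
# Column names are assumed and the ids/tags are NOT real values.
SAMPLE_STOCKS_CSV = """country,symbol,id,tag
united states,AAPL,1001,example-tag-a
united states,TSLA,1002,example-tag-b
"""


def find_stock_id(csv_text: str, symbol: str, country: str) -> Optional[int]:
    """Return the id of the first row matching symbol and country, else None."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["symbol"] == symbol and row["country"] == country:
            return int(row["id"])
    return None


print(find_stock_id(SAMPLE_STOCKS_CSV, "TSLA", "united states"))
```

For the real file, the text would be read with `open(path, newline="")` from the resources folder mentioned above instead of from the inline sample.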

gemo911 commented 1 year ago

Hello,

I've got the same error with

\investpy\bonds.py:596 in get_bond_historical_data
    raise ConnectionError(

ConnectionError: ERR#0015: error 403, try again later.

Sebara0 commented 1 year ago

its all available in the investpy resources folder -> Lib\site-packages\investpy\resources ID col in stocks.csv

Thank you very much. It's really useful. We're able to replace the "search_quotes" routine and access each asset by its Investing.com code. That's great news for me.

woreom commented 1 year ago

It's not working. Please fix it or show us how to fix it.

aullate commented 1 year ago

Hi! Same issue for investpy.funds.get_fund_historical_data. You are doing a great job for the investment community.

ymyke commented 1 year ago

@Exganza @Sebara0 What do you do with the id once you have it? How do you use it to access information?

Exganza commented 1 year ago

@Exganza @Sebara0 What do you do with the id once you have it? How do you use it to access information?

By passing the id as the strSymbol parameter: https://advcharts.investing.com/advinion2016/advanced-charts/9/9/16/GetRecentHistory?strSymbol=46891&iTop=1500&strPriceType=bid&strFieldsMode=allFields&lang_ID=9&strTimeFrame=1D

But it doesn't matter, because today it stopped working. Cloudflare still kicks you out after many requests.
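For reference, composing that URL from an id can be sketched like this. The parameter names and default values are copied from the URL quoted in this thread; their meaning is not documented here, `recent_history_url` is a hypothetical helper, and per the comment above the endpoint may no longer respond.

```python
from urllib.parse import urlencode

BASE_URL = (
    "https://advcharts.investing.com/advinion2016/advanced-charts"
    "/9/9/16/GetRecentHistory"
)


def recent_history_url(instrument_id: int, top: int = 1500, time_frame: str = "1D") -> str:
    """Build the GetRecentHistory URL with the id passed as strSymbol."""
    params = {
        "strSymbol": instrument_id,
        "iTop": top,
        "strPriceType": "bid",
        "strFieldsMode": "allFields",
        "lang_ID": 9,
        "strTimeFrame": time_frame,
    }
    return BASE_URL + "?" + urlencode(params)


print(recent_history_url(46891))
```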

ymyke commented 1 year ago

Thanks, @Exganza. – Looks like we lost investpy then. 😞😞

Sebara0 commented 1 year ago

In my case, it's useful, for instance:

ident = +"stocks.xlsx"
read_tag = pd.read_excel(ident)
....
tag2 = read_tag[read_tag['country'] == 'united states']
tag3 = tag2.loc[tag2['symbol'] == 'TSLA', 'tag']
tag = tag3.values[0]

# ex: https://www.investing.com/equities/tesla-motors-ratios
url = 'https://www.investing.com/equities/'+tag+'-ratios'
driver.get(url)

# get Price to Cash Flow
find_rating = driver.find_element(By.XPATH, "(//tr[@id='childTr']/td/div/table/tbody/tr[3]/td[2])")
p_cf = find_rating.text

plutoBase commented 1 year ago

Alvaro, I'm sure you are very busy working on all users' behalf, but is there any chance you could update us on your progress?

Many thanks!

alvarobartt commented 1 year ago

Hi guys, there's not much progress from my side: it seems that the Cloudflare protection cannot be ignored, so the patch only works for a limited number of requests... and ends up failing every time!

maread99 commented 1 year ago

Hi @alvarobartt, I don't want to overstep the mark, although given the above and your latest pretty pessimistic comment on #602, would you mind me suggesting here an alternative library for price data that might be a worthy option whilst investpy is out-of-action? The library's concerned with price processing and providing a useful query interface - it depends on other libraries to fetch the data and I'd been hoping to include investpy as a source. I very much hope you're able to get it back up and stable - the community would be much poorer without it.

Cheers

nhlsm commented 1 year ago

I guess that one possible solution is using the origin server IP address of 'api.investing.com' instead of the Cloudflare IP address.

References:
https://www.youtube.com/watch?v=Cfbi5-Knpxk
https://blog.detectify.com/2019/07/31/bypassing-cloudflare-waf-with-the-origin-server-ip-address/

But I can't find the origin server IP address of 'api.investing.com' ('CloudFail' can't find it).

Can anyone do it?

MaximKorobov commented 1 year ago

@alvarobartt, when do you plan to release a minor version with the patch included?

apoorvsingh090 commented 1 year ago

Any alternate libraries to fetch data from investing.com?

samjmck commented 1 year ago

It might be worth using a MITM proxy on an Android or iOS device to check the requests the mobile app is making to the server. Those might be different to the web app and might not have the same protections.

samjmck commented 1 year ago

I've found some endpoints in their mobile app which seem to work in cURL without any special JavaScript or session cookies, while their main endpoints in their web app seem to be blocked for me.

Using the search endpoint, we can get the pair_ID for a security. For example, this is for Cloudflare stock:

curl "https://iappapi.investing.com/search.php?string=Cloudflare" -H 'X-Meta-Ver: 14'

This gives the following JSON response:

{
  "data": {
    "pairs_attr": [
      {
        "pair_ID": 1152334,
        "search_main_text": "NET",
        "search_main_longtext": "Cloudflare Inc",
        "exchange_flag_ci": 5,
        "search_main_subtext": "Equity - NYSE"
      }
    ]
  },
  "system": {
    "message_action": "force_update_app",
    "link": ""
  },
  "ip": "__",
  "zmq": [
    "https:\/\/streaming.forexpros.com:443\/echo\/websocket",
    "https:\/\/stream92.forexpros.com:443\/echo\/websocket"
  ],
  "zmq_col": "1",
  "ittl": "4",
  "error": {
    "debug": "",
    "display_message": ""
  },
  "ccode": "BE"
}
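Pulling the pair_ID out of a response shaped like the one above could be sketched as follows. The field names come from the sample response; `find_pair_id` is a hypothetical helper, and matching on `search_main_text` is an assumption about how one would pick the right entry.

```python
import json
from typing import Any, Dict, Optional

# Trimmed copy of the sample response shown above.
SAMPLE_SEARCH_RESPONSE = """
{
  "data": {
    "pairs_attr": [
      {
        "pair_ID": 1152334,
        "search_main_text": "NET",
        "search_main_longtext": "Cloudflare Inc",
        "exchange_flag_ci": 5,
        "search_main_subtext": "Equity - NYSE"
      }
    ]
  }
}
"""


def find_pair_id(payload: Dict[str, Any], symbol: str) -> Optional[int]:
    """Return the pair_ID of the first entry whose ticker matches symbol."""
    for pair in payload.get("data", {}).get("pairs_attr", []):
        if pair.get("search_main_text") == symbol:
            return pair["pair_ID"]
    return None


print(find_pair_id(json.loads(SAMPLE_SEARCH_RESPONSE), "NET"))  # 1152334
```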

Using the pair_ID, we can get the historical pricing data with the following cURL command:

curl "https://iappapi.investing.com/get_screen.php?lang_ID=51&skinID=2&interval=day&time_utc_offset=7200&screen_ID=63&pair_ID=1152334&date_to=21092022&date_from=22082022" \
     -H 'X-Meta-Ver: 14'
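The same request URL can be composed in Python. This is a sketch under assumptions: the date format (day, month, year with no separators) is inferred from the sample values `21092022` and `22082022`, the other query values are copied verbatim from the cURL command without knowing their meaning, and `history_url` is a hypothetical helper.

```python
from datetime import date
from urllib.parse import urlencode

# Header taken from the cURL command above.
HEADERS = {"X-Meta-Ver": "14"}


def history_url(pair_id: int, date_from: date, date_to: date) -> str:
    """Build the get_screen.php URL for daily history of one pair_ID."""
    params = {
        "lang_ID": 51,
        "skinID": 2,
        "interval": "day",
        "time_utc_offset": 7200,
        "screen_ID": 63,
        "pair_ID": pair_id,
        # Assumed DDMMYYYY, inferred from the sample values.
        "date_to": date_to.strftime("%d%m%Y"),
        "date_from": date_from.strftime("%d%m%Y"),
    }
    return "https://iappapi.investing.com/get_screen.php?" + urlencode(params)


print(history_url(1152334, date(2022, 8, 22), date(2022, 9, 21)))
```

The resulting string would be requested together with `HEADERS`, just as the cURL command sends the `X-Meta-Ver` header.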

This gives the following response:

{
  "data": [
    {
      "screen_ID": "63",
      "screen_data": {
        "columns": {
          "date": "Date",
          "price": "Price",
          "open": "Open",
          "high": "High",
          "low": "Low",
          "vol": "Vol.",
          "perc_chg": "Chg. %"
        },
        "columns_order": [
          "date",
          "price",
          "open",
          "high",
          "low",
          "vol",
          "perc_chg"
        ],
        "data": [
          {
            "date": 1663632000,
            "price": "61.13",
            "open": "60.54",
            "high": "62.05",
            "low": "59.78",
            "vol": "4.64M",
            "perc_chg": "-0.11%",
            "color": "#fa4545"
          },
          {
            "date": 1663545600,
            "price": "61.20",
            "open": "58.92",
            "high": "61.41",
            "low": "58.79",
            "vol": "2.99M",
            "perc_chg": "3.87%",
            "color": "#3fc932"
          },
          {
            "date": 1663286400,
            "price": "58.92",
            "open": "59.50",
            "high": "59.67",
            "low": "57.36",
            "vol": "12.78M",
            "perc_chg": "-3.33%",
            "color": "#fa4545"
          },
          ...
      ]
}
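The rows in this response arrive as strings plus a Unix timestamp, so some conversion is needed before analysis. A minimal sketch, assuming the timestamps are meant as UTC midnights and that volume suffixes other than "M" (here "K" and "B") follow the same pattern; only "M" appears in the sample, and `parse_volume` and `parse_row` are hypothetical helpers.

```python
from datetime import datetime, timezone
from decimal import Decimal
from typing import Any, Dict

# Two rows copied from the sample response above.
SAMPLE_ROWS = [
    {"date": 1663632000, "price": "61.13", "open": "60.54", "high": "62.05",
     "low": "59.78", "vol": "4.64M", "perc_chg": "-0.11%", "color": "#fa4545"},
    {"date": 1663545600, "price": "61.20", "open": "58.92", "high": "61.41",
     "low": "58.79", "vol": "2.99M", "perc_chg": "3.87%", "color": "#3fc932"},
]

# Only "M" is seen in the sample; "K" and "B" are assumptions.
SUFFIXES = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}


def parse_volume(text: str) -> int:
    """Convert strings such as '4.64M' into an integer share count."""
    suffix = text[-1].upper()
    if suffix in SUFFIXES:
        return int(Decimal(text[:-1]) * SUFFIXES[suffix])
    return int(Decimal(text))


def parse_row(row: Dict[str, Any]) -> Dict[str, Any]:
    """Convert one raw row into typed values."""
    return {
        # Timestamp interpreted as UTC (assumption).
        "date": datetime.fromtimestamp(row["date"], tz=timezone.utc).date(),
        "price": Decimal(row["price"]),
        "open": Decimal(row["open"]),
        "high": Decimal(row["high"]),
        "low": Decimal(row["low"]),
        "vol": parse_volume(row["vol"]),
        "perc_chg": Decimal(row["perc_chg"].rstrip("%")),
    }


for parsed in map(parse_row, SAMPLE_ROWS):
    print(parsed["date"], parsed["price"], parsed["vol"])
```

`Decimal` is used instead of `float` so that values such as "4.64M" convert to exact integers.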

MaximKorobov commented 1 year ago

Any alternate libraries to fetch data from investing.com?

5 years ago I started building a scraper for the Investing site. It can be used under the Apache 2.0 license.

Hyperz commented 1 year ago

Hi. I don't personally use this library, but a Discord bot I maintain for one of my clients also consumes data from api.investing.com, so I'm faced with the same issue. I've done quite a lot of testing and I've come to the conclusion that it's the TLS fingerprint that triggers the Cloudflare challenge/403.

Usually you can get around this by changing the TLS cipher lists/suites etc. so that you don't have a known bad/blacklisted fingerprint. But I think they are using Cloudflare's Enterprise plan or something, because just changing the fingerprint doesn't work, meaning they are probably going by a TLS fingerprint whitelist. The solution would be to exactly match the TLS fingerprint of a web browser, but I don't think that's possible in Python because you don't have access to all the low-level SSL library internals.

I know investpy uses requests instead of aiohttp, but I'm going to post my (so far unsuccessful) attempt at getting around it in case someone has any ideas. The same concept can be applied to a requests session as well (i.e. by mounting a custom HTTPAdapter):

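import ctypes
import ssl
import sys
import warnings
from http.cookies import SimpleCookie
from typing import Any, Dict

import aiohttp
from aiohttp.typedefs import StrOrURL

# Note: `app` (used for logging below) is the poster's own application module and is not shown here.

class HTTPSession(aiohttp.ClientSession):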
    cipher_suites = ':'.join([
        'TLS_CHACHA20_POLY1305_SHA256',
        'TLS_AES_128_GCM_SHA256',
        'TLS_AES_256_GCM_SHA384',
    ])
    cipher_list = ':'.join([
        'ECDHE-ECDSA-CHACHA20-POLY1305',
        'ECDHE-RSA-CHACHA20-POLY1305',
        'ECDHE-ECDSA-AES128-GCM-SHA256',
        'ECDHE-RSA-AES128-GCM-SHA256',
        'ECDHE-ECDSA-AES256-GCM-SHA384',
        'ECDHE-RSA-AES256-GCM-SHA384',
        'ECDHE-ECDSA-AES128-SHA',
        'ECDHE-RSA-AES128-SHA',
        'ECDHE-ECDSA-AES256-SHA',
        'ECDHE-RSA-AES256-SHA',
        'AES128-GCM-SHA256',
        'AES256-GCM-SHA384',
        'AES128-SHA',
        'AES256-SHA',
        'DES-CBC3-SHA',
    ])
    signature_algorithms = ':'.join([
        'ecdsa_secp256r1_sha256',
        'rsa_pss_rsae_sha256',
        'rsa_pkcs1_sha256',
        'ecdsa_secp384r1_sha384',
        'rsa_pss_rsae_sha384',
        'rsa_pkcs1_sha384',
        'rsa_pss_rsae_sha512',
        'rsa_pkcs1_sha512',
        'rsa_pkcs1_sha1',
    ])

    def __init__(self, *args: Any, **kwargs: Any) -> None:
        flaresolverr_enabled = kwargs.pop('flaresolverr_enabled', False)
        flaresolverr_url = kwargs.pop('flaresolverr_url', 'http://localhost:8191/v1')
        connector = kwargs.get('connector')

        if connector is None:
            context = self._create_custom_ssl_context()
            connector = aiohttp.TCPConnector(ssl=context)
            kwargs['connector'] = connector

        super().__init__(*args, **kwargs)

        self.log = app.get_logger()
        self.flaresolverr_enabled: bool = flaresolverr_enabled
        self.flaresolverr_url: str = flaresolverr_url

    @classmethod
    def _create_custom_ssl_context(cls) -> ssl.SSLContext:
        context = ssl.create_default_context()
        address = id(context) + sys.getsizeof(object())
        context_address = ctypes.cast(address, ctypes.POINTER(ctypes.c_void_p)).contents

        if sys.platform.startswith('win32'):
            libssl = ctypes.CDLL('libssl-1_1.dll')
        elif sys.platform.startswith(('linux', 'darwin')):
            libssl = ctypes.CDLL(ssl._ssl.__file__)
        else:
            raise NotImplementedError('Unsupported OS.')

        with warnings.catch_warnings():
            warnings.filterwarnings('ignore', category=DeprecationWarning)
            context.minimum_version = ssl.TLSVersion.TLSv1

        context.set_alpn_protocols(['http/1.1'])
        context.options |= 1 << 19
        libssl.SSL_CTX_set_ciphersuites(context_address, cls.cipher_suites.encode())
        libssl.SSL_CTX_set_cipher_list(context_address, cls.cipher_list.encode())
        libssl.SSL_CTX_ctrl(context_address, 98, 0, cls.signature_algorithms.encode())

        return context

    @staticmethod
    async def _is_challenge_response(resp: aiohttp.ClientResponse) -> bool:
        server = resp.headers.get('server', '').lower()
        html_src = await resp.text()

        if server.startswith('cloudflare'):
            strings = ('Ray ID: <code>', 'window._cf_chl_opt', '/cdn-cgi/challenge-platform/')

            if resp.status in (403, 503) and all(s in html_src for s in strings):
                return True

        if server.startswith('ddos-guard') and not resp.ok:
            return 'check.ddos-guard.net/check.js' in html_src

        return False

    async def _get_challenge_solution(self, url: str) -> Dict[str, Any]:
        query = {
            'cmd': 'request.get',
            'url': url,
            'maxTimeout': self.timeout.total * 1000,
        }
        resp = await self.post(self.flaresolverr_url, json=query)
        resp.raise_for_status()
        result = await resp.json(content_type=None)
        solution = result['solution']

        return solution

    async def _request(self, method: str, str_or_url: StrOrURL, **kwargs: Any) -> aiohttp.ClientResponse:
        resp = await super()._request(method, str_or_url, **kwargs)

        if self.flaresolverr_enabled:
            is_challenge = await self._is_challenge_response(resp)

            if is_challenge:
                self.log.debug('Anti-bot challenge detected! Attempting to solve...')
                solution = await self._get_challenge_solution(str(str_or_url))
                cookies = SimpleCookie()

                for cookie in solution['cookies']:
                    self.log.debug('Setting cookie: %s.', cookie)
                    cookies[cookie['name']] = cookie['value']
                    cookies[cookie['name']]['domain'] = cookie['domain']
                    cookies[cookie['name']]['path'] = cookie['path']

                self.cookie_jar.update_cookies(cookies)
                self.headers.update({'User-Agent': solution['userAgent']})
                self.log.debug('Repeating request with challenge solution...')

                resp = await super()._request(method, str_or_url, **kwargs)
                is_challenge = await self._is_challenge_response(resp)

                if is_challenge:
                    self.log.error('Still getting an anti-bot challenge...')
                else:
                    self.log.debug('Anti-bot challenge successfully solved!')

        return resp

Ignore the flaresolverr logic; even with the solution cookies it still triggers the 403 when repeating the request. The relevant bit is the custom SSL context.
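For readers who only want to see the mechanism, the part of the fingerprint that Python exposes directly can be changed with the standard library alone. This is a stdlib sketch, not the poster's approach: it swaps aiohttp and ctypes for `ssl` plus `urllib.request`, it only touches the TLS 1.2-and-below cipher list (the `ssl` module has no public setter for TLS 1.3 suites, as far as I know, which is presumably why the code above reaches into libssl), and per the comment above such changes were not enough to get past the 403. `build_context` and `build_opener` are hypothetical names.

```python
import ssl
import urllib.request


def build_context(ciphers: str) -> ssl.SSLContext:
    """Create a default client context restricted to the given cipher string."""
    context = ssl.create_default_context()
    # OpenSSL cipher-list syntax; affects TLS 1.2 and below only.
    context.set_ciphers(ciphers)
    return context


def build_opener(context: ssl.SSLContext) -> urllib.request.OpenerDirector:
    """Return an opener whose HTTPS connections use the custom context."""
    return urllib.request.build_opener(
        urllib.request.HTTPSHandler(context=context)
    )


context = build_context("ECDHE+AESGCM")
opener = build_opener(context)
# List the suites the context will now offer.
for cipher in context.get_ciphers():
    print(cipher["name"], cipher["protocol"])
```

No request is sent here; the loop only prints what the context would advertise, which is a convenient way to compare it against a browser's list.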

Larimarinho commented 1 year ago

Hi guys, I'm having the same problem, does anyone have any tips to get around this situation?

dados_usd_brl = investpy.currency_crosses.get_currency_cross_historical_data(currency_cross='USD/BRL', from_date='04/11/2019', to_date= '01/01/2024')

JotaSe commented 1 year ago

Any updates?

JotaSe commented 1 year ago

I forked the repo and implemented https://github.com/VeNoMouS/cloudscraper but no success; it can't solve the v2 Cloudflare challenge.

nttams commented 1 year ago

Has anyone tried adding a header to the request, something like {'content-type': 'application/application-json', 'User-Agent': 'Mozilla'}, to pretend to be a browser? Thanks.
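Setting such headers is straightforward; whether it helps is another matter, since an earlier comment in this thread attributes the 403 to the TLS fingerprint rather than to the headers. A stdlib sketch with a hypothetical `build_request` helper follows. Note that the registered JSON media type is `application/json`, and the User-Agent value below is just an example string.

```python
import urllib.request


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Create a request object carrying browser-like headers (nothing is sent)."""
    return urllib.request.Request(
        url,
        headers={
            "User-Agent": user_agent,
            "Accept": "application/json",
        },
    )


request = build_request("https://www.investing.com", "Mozilla/5.0")
# urllib normalises header names with str.capitalize(), hence "User-agent".
print(request.get_header("User-agent"))  # Mozilla/5.0
```

The request would then be sent with `urllib.request.urlopen(request)`; with the requests library the equivalent is passing `headers=` to `requests.get`.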

sampathkar commented 1 year ago

Hope for the best