alvarobartt / investpy

Financial Data Extraction from Investing.com with Python
https://investpy.readthedocs.io/
MIT License
1.59k stars 374 forks

🐛 Fix issues with the latest API used by Investing.com to pull data #602

Open alvarobartt opened 1 year ago

alvarobartt commented 1 year ago

🐛 Bug Fixes

jmizgajski commented 1 year ago

@alvarobartt nice work with the fix! Any ETA on when this may be merged to master and updated in pip?

ymyke commented 1 year ago

I also still get the 403 error. Are you still on it, @alvarobartt? Do you think you can fix it? – Is it working for you, @jmizgajski?

hscho77 commented 1 year ago

@alvarobartt

When will you fix "get_currency_cross_historical_data"?

    File "D:\Anaconda3\lib\site-packages\investpy\currency_crosses.py", line 674, in get_currency_cross_historical_data
        raise ConnectionError(
    ConnectionError: ERR#0015: error 403, try again later.

alvarobartt commented 1 year ago

Hi everyone! So it was working, but only for a few requests before it got blocked. That is not due to any new Investing.com protection against investpy, but to Cloudflare: they included protection in the latest release of Investing.com, so the previous API was deprecated and the current one is protected... I'll keep exploring, but it seems there's nothing left to test 😞

ymyke commented 1 year ago

Thanks for trying, @alvarobartt !

mic-user commented 1 year ago

@alvarobartt Just FYI, the AJAX API https://www.investing.com/instruments/HistoricalDataAjax still works, but only when a user operates a web browser. Cloudflare's WAF has started blocking direct HTTP connections from scripts or automation.

LuisSousaSilva commented 1 year ago

Have you tested a sleep of 1 to 5 seconds between calls? I have been using that with investpy. Maybe it will not trigger Cloudflare's WAF. If that's not possible, some quotes are still better than none. Thanks for the work, BTW.
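The pause between calls suggested above could be sketched roughly like this. It is only an illustration: `fetch_with_delay` and its parameters are hypothetical names, not part of investpy, and nothing here guarantees that Cloudflare's WAF will let the requests through.

```python
import random
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def fetch_with_delay(
    symbols: Iterable[str],
    fetch: Callable[[str], T],
    min_delay: float = 1.0,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[T]:
    """Call fetch(symbol) for each symbol, pausing a random time between calls."""
    results = []
    for index, symbol in enumerate(symbols):
        if index > 0:
            # Randomised pause (1 to 5 s by default) before every call but the first.
            sleep(random.uniform(min_delay, max_delay))
        results.append(fetch(symbol))
    return results
```

`fetch` can be any callable that takes a symbol, for example a small wrapper around whichever investpy call you already use. The `sleep` parameter only exists so the pause can be swapped out in tests.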

vid1998 commented 1 year ago

So, will investpy become a dead project? I hope not, since I wrote a lot of code depending on that beautiful library. Perhaps a solution can be found that is also slightly profitable for investing.com, so that they continue to maintain reasonable access to their data through known end-points. In any case, thanks a lot for the work done so far on this project.

RPDev2002 commented 1 year ago

Thank you for taking a look @alvarobartt - am using investpy for my own analysis and found it awesome! Hope there can be a way round...

Belbute commented 1 year ago

I was using this to get data for my thesis, thankfully I had the whole data backed up but now I'll have to rewrite a lot of functions. Thanks for trying to solve this problem and for creating this amazing library in the first place. Hope this somehow gets solved.

sampathkar commented 1 year ago

You have created something wonderful. It was so useful for me in carrying out various analyses using investing.com (it was the only site that provided the data I needed). I sincerely hope that you will be able to find a way around this issue. Hoping for the best.

jmizgajski commented 1 year ago

@alvarobartt have you tried allowing requests through a paid proxy provider? This could be a workaround for those willing to pay for one. Many scraping libs do it.

Also if you need to simulate the browser maybe something like https://github.com/pyppeteer/pyppeteer could come in handy. This is a puppeteer clone but pure python, installable from pip.
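For the proxy idea, a minimal standard-library sketch could look like the following. The proxy URL is a made-up placeholder (a real provider supplies its own host, port and credentials), `build_proxy_opener` is a hypothetical helper, and there is no guarantee that traffic routed this way gets past Cloudflare.

```python
import urllib.request

# Hypothetical proxy URL; replace with the values from your provider.
PROXY_URL = "http://user:password@proxy.example.com:8080"


def build_proxy_opener(proxy_url: str) -> urllib.request.OpenerDirector:
    """Return an opener that routes HTTP and HTTPS traffic through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    opener = urllib.request.build_opener(handler)
    # Browser-like user agent instead of the default "Python-urllib/x.y".
    opener.addheaders = [("User-Agent", "Mozilla/5.0")]
    return opener
```

Calling `build_proxy_opener(PROXY_URL).open(url)` would then send the request through the proxy.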

RPDev2002 commented 1 year ago

If this was an option I would be interested..

RPDev2002 commented 1 year ago

Interesting that this seems to have appeared on Investing.com 6 days ago, about not having an API..

https://www.investing-support.com/hc/en-gb/articles/115005473825-Do-you-provide-an-API-

ymyke commented 1 year ago

This guy is using Puppeteer – apparently with some success: https://github.com/DavideViolante/investing-com-api/issues/68#issuecomment-1253834929 Maybe worth a try, @alvarobartt ?

kshabahang commented 1 year ago

I just want to thank you for your application. It really feels like Cloudflare will stop at nothing to ruin all the fun on the internet.

GSLabIt commented 1 year ago

Sounds like it works using https://github.com/pyppeteer/pyppeteer; the API call needs to change a bit, but it works.

DavideViolante commented 1 year ago

Yeah, it seems to work using Puppeteer; here are the changes from the previous version if you need them. Also remember to set a user agent, or it still won't work (Cloudflare does a good job of blocking some traffic). I don't know Python well, so I can't help much more than this.

GSLabIt commented 1 year ago

Thanks, I used it to make it work in Python; here is the code:


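    import json

    from pyppeteer import launch  # pip install pyppeteer

    # Desktop Chrome user agent; without one the request is still blocked.
    AGENTS = (
        'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
        'AppleWebKit/537.36 (KHTML, like Gecko) '
        'Chrome/105.0.0.0 Safari/537.36'
    )

    URL = (
        'https://api.investing.com/api/financialdata/{id}'
        '/historical/chart?period={p}&interval={i}&pointscount=120'
    )

    # Method excerpt: `self` is the instance of the surrounding class (not shown).
    async def _get_investing_com_data(self, symbol, period='P1M', interval='P1D'):
        url = URL.format(id=symbol, p=period, i=interval)
        # Signal handlers are disabled so launch() also works outside the main thread.
        browser = await launch(
            handleSIGINT=False, handleSIGTERM=False, handleSIGHUP=False
        )
        try:
            page = await browser.newPage()
            await page.setUserAgent(AGENTS)
            await page.goto(url)
            # The endpoint answers with JSON, which the browser shows as body text.
            element = await page.querySelector('body')
            content = await page.evaluate('(element) => element.textContent', element)
        finally:
            # Close the browser even if navigation fails.
            await browser.close()
        return json.loads(content).get('data', False)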

Do you know how to pass startDate and endDate to the endpoint?

sahelanthropussy commented 1 year ago

This code fails for me and returns False. Returning json.loads(content) alone shows the error: {'@errors': ['Core API respond with invalid status: 500']}
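One way to surface that error instead of silently getting False is to check the payload before reading 'data'. This is a sketch only: `parse_chart_payload` and `InvestingApiError` are hypothetical names, and the only response shapes assumed are the 'data' key used in the code above and the '@errors' list shown in this comment.

```python
import json


class InvestingApiError(RuntimeError):
    """Raised when the response body reports errors or lacks the 'data' key."""


def parse_chart_payload(content: str):
    """Return payload['data'], raising if the body carries an '@errors' list."""
    payload = json.loads(content)
    errors = payload.get("@errors")
    if errors:
        # Pass the server-side message on to the caller unchanged.
        raise InvestingApiError("; ".join(map(str, errors)))
    if "data" not in payload:
        raise InvestingApiError("response has no 'data' key")
    return payload["data"]
```

Replacing the final `json.loads(content).get('data', False)` with `parse_chart_payload(content)` would make a 500 from the upstream API show up as an exception with the original message.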

alvarobartt commented 1 year ago

Hi @sahelanthropussy, I've released investiny as a temporary replacement for investpy while I wait to get a response from Investing.com, so use it in the meantime instead! Thanks 🤗

P.S. If you have any doubts, don't hesitate to ask me, and open all the related feature requests and bug reports at https://github.com/alvarobartt/investiny/issues or open a discussion at https://github.com/alvarobartt/investiny/discussions

alvarobartt commented 1 year ago

Hi @GSLabIt, please check the comment above regarding the release of investiny, thanks 🤗

RPDev2002 commented 1 year ago

Ah wow thank you alvarobartt... will try it out soon..

GSLabIt commented 1 year ago

Hi, will have a look. Any chance of making it compatible with Python 3.8?

alvarobartt commented 1 year ago

Sure! Feel free to open an issue at https://github.com/alvarobartt/investiny/issues and I'll try to add it ASAP 👍🏻

alvarobartt commented 1 year ago

Cool @RPDev2002, thank you too!

alesegura96 commented 1 year ago

Good morning @alvarobartt, thank you so much for the investpy code; it is a really helpful tool. I have been using the following Investing.com URL for a personal finance project inspired by your code:

url = "https://www.investing.com/instruments/HistoricalDataAjax"

I have tried changing my code to use:

url = f"https://api.investing.com/api/financialdata/historical/{id_}"

But I keep getting the ERR:403, as you have mentioned in the different posts. Have you managed to make it work with the new investing.com API?

Thank you so much in advance.

P.S. In case you haven't figured it out yet, I will start trying the code from investiny.

alvarobartt commented 1 year ago

Hi @alesegura96, as you point out, neither Investing.com API (https://www.investing.com/instruments/HistoricalDataAjax and https://api.investing.com/api/financialdata/historical) is working, as both block all incoming requests with HTTP 403... So I'd suggest that you and everyone else use investiny in the meantime, as it's currently stable (I just fixed a critical bug) and performs well.

Let's see if I can manage to contact Investing.com and get a response from them, so as to continue the development of investpy either as a personal project or as an Investing.com project...

DavideViolante commented 1 year ago

Did you try using Puppeteer as we mentioned before? @alvarobartt I think it's the smallest change to make it work, at least it was for my project

ankit2788 commented 1 year ago

Hi, this still seems to be an open issue. Can you please confirm whether there is a working fix for this? On a side note, investpy has been quite helpful. Commendable job maintaining it until now!

SoundsSerious commented 1 year ago

Hi,

Thanks for your work on this library.

Seems like these issues are popping up because of the differences between the scraping requests and their "legitimate" web requests. It seems that hardcoding a connection on our part is the problem, since it's easy to block a whole group of people using the same connection behavior.

Would the most flexible strategy here be to allow for a user-definable connection, so users can modify their headers, etc. to seem passable to the investing.com servers?
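A rough sketch of what user-definable headers could look like, using only the standard library. `build_request` and `DEFAULT_HEADERS` are hypothetical names rather than anything in investpy, and custom headers alone may well not be enough to avoid being blocked.

```python
import urllib.request
from typing import Dict, Optional

# Hypothetical defaults; callers can override or extend any of them.
DEFAULT_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Accept": "application/json",
}


def build_request(
    url: str, headers: Optional[Dict[str, str]] = None
) -> urllib.request.Request:
    """Build a request whose caller-supplied headers override the defaults."""
    merged = {**DEFAULT_HEADERS, **(headers or {})}
    return urllib.request.Request(url, headers=merged)
```

A library function could accept such a `headers` argument and pass it through, so each user sends their own connection profile instead of one hardcoded set.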

alvarobartt commented 1 year ago

So I don't know how strict Investing.com's protection is right now, but even with custom requests objects you may be blocked at some point, since some people tried to connect from VPNs that had never sent any request to Investing.com, and those were also blocked...

ymyke commented 3 months ago

Have you ever tried https://github.com/yifeikong/curl_cffi, @alvarobartt ?