Anorov / cloudflare-scrape

A Python module to bypass Cloudflare's anti-bot page.
MIT License

Unable to identify Cloudflare IUAM Javascript on website #420

Open joinwick opened 3 years ago

joinwick commented 3 years ago

Before creating an issue, first upgrade cfscrape with pip install -U cfscrape and see if you're still experiencing the problem. Please also confirm your Node version (node --version or nodejs --version) is version 10 or higher.
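
The Node version check can also be scripted. Below is a minimal sketch, assuming the binary is on PATH under the name `node` or `nodejs`; the helper names `parse_node_major` and `node_is_supported` are hypothetical and not part of cfscrape.

```python
import shutil
import subprocess


def parse_node_major(version_output):
    """Extract the major version from output such as 'v14.8.0'."""
    return int(version_output.strip().lstrip("v").split(".")[0])


def node_is_supported(minimum_major=10):
    """Return True if an installed Node binary reports at least minimum_major."""
    # Assumption: the binary is called "node" or "nodejs" and is on PATH.
    for name in ("node", "nodejs"):
        path = shutil.which(name)
        if path is None:
            continue
        result = subprocess.run([path, "--version"], capture_output=True, text=True)
        if result.returncode == 0:
            return parse_node_major(result.stdout) >= minimum_major
    return False
```

Calling `node_is_supported()` before creating the scraper gives an early, readable failure instead of an error from deep inside the challenge solver.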

Make sure the website you're having issues with is actually using anti-bot protection by Cloudflare and not a competitor like Imperva Incapsula or Sucuri. And if you're using an anonymizing proxy, a VPN, or Tor, Cloudflare often flags those IPs and may block you or present you with a captcha as a result.

Please confirm the following statements and check the boxes before creating an issue:

Python version number: 3.6.0

cfscrape version number: 2.1.1

Code snippet involved with the issue

import cfscrape

# session_url is the protected page's URL (see below)
scraper = cfscrape.create_scraper(delay=5)
print(scraper.get(session_url).content)

Complete exception and traceback

(If the problem doesn't involve an exception being raised, leave this blank)

ValueError: Unable to identify Cloudflare IUAM Javascript on website. Cloudflare may have changed their technique, or there may be a bug in the script.

URL of the Cloudflare-protected page: https://cn.investing.com/


URL of Pastebin/Gist with HTML source of protected page

[LINK GOES HERE]

Err0neus commented 3 years ago

I confirm:

Python version: 3.9.4
NodeJS: v14.8.0
cfscrape: 2.1.1

code snippet involving the issue:

import requests
import cfscrape

session = requests.Session()
scraper = cfscrape.create_scraper(sess=session)
res = scraper.get('https://www.discogs.com/artist/125246-Nirvana')

complete exception and traceback

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~\anaconda3\lib\site-packages\cfscrape\__init__.py in solve_challenge(self, body, domain)
    254                 r"(?:[^{<>]*},\s*(\d{4,}))?",
--> 255                 javascript, flags=re.S
    256             ).groups()

AttributeError: 'NoneType' object has no attribute 'groups'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
in
----> 1 res = scraper.get('https://www.discogs.com/artist/125246-Nirvana')

~\anaconda3\lib\site-packages\requests\sessions.py in get(self, url, **kwargs)
    544
    545         kwargs.setdefault('allow_redirects', True)
--> 546         return self.request('GET', url, **kwargs)
    547
    548     def options(self, url, **kwargs):

~\anaconda3\lib\site-packages\cfscrape\__init__.py in request(self, method, url, *args, **kwargs)
    127         # Check if Cloudflare anti-bot "I'm Under Attack Mode" is enabled
    128         if self.is_cloudflare_iuam_challenge(resp):
--> 129             resp = self.solve_cf_challenge(resp, **kwargs)
    130
    131         return resp

~\anaconda3\lib\site-packages\cfscrape\__init__.py in solve_cf_challenge(self, resp, **original_kwargs)
    202
    203         # Solve the Javascript challenge
--> 204         answer, delay = self.solve_challenge(body, domain)
    205         if method == 'POST':
    206             cloudflare_kwargs["data"]["jschl_answer"] = answer

~\anaconda3\lib\site-packages\cfscrape\__init__.py in solve_challenge(self, body, domain)
    290             raise ValueError(
    291                 "Unable to identify Cloudflare IUAM Javascript on website. %s"
--> 292                 % BUG_REPORT
    293             )
    294

ValueError: Unable to identify Cloudflare IUAM Javascript on website. Cloudflare may have changed their technique, or there may be a bug in the script. Please read https://github.com/Anorov/cloudflare-scrape#updates, then file a bug report at https://github.com/Anorov/cloudflare-scrape/issues."

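For anyone reading the traceback: the first AttributeError comes from `re.search` returning `None` when the pattern finds no match, so the chained `.groups()` call fails. The library then raises the ValueError while handling it, which is why Python prints both exceptions. Here is a minimal sketch of that mechanism, using a simplified stand-in pattern and a hypothetical `extract_challenge` helper (neither is cfscrape's actual code):

```python
import re

# Simplified stand-in pattern; NOT the regex cfscrape actually uses.
CHALLENGE_RE = re.compile(r"setTimeout\(function\(\)\{(.+?)\},\s*(\d{4,})\)", re.S)


def extract_challenge(javascript):
    """Return (challenge_body, delay_ms), or raise ValueError if not found."""
    try:
        # re.search returns None on no match, so .groups() raises AttributeError.
        body, delay = CHALLENGE_RE.search(javascript).groups()
    except AttributeError:
        # Raising here chains the two exceptions, as in the traceback above.
        raise ValueError("Unable to identify challenge Javascript in page")
    return body, int(delay)
```

In other words, the ValueError only says that the page no longer contains the script layout the pattern expects; it does not by itself indicate a problem with the local Python or Node setup.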
SpangleLabs commented 3 years ago

This library is unmaintained and dead, see #406

Err0neus commented 3 years ago

> This library is unmaintained and dead, see #406

Thanks for your response. I have noticed that issue only after my comment.

reedniv commented 3 years ago

> This library is unmaintained and dead, see #406

Also, https://github.com/VeNoMouS/cloudscraper/ was just republished, but the script still does not work.

andress134 commented 3 years ago

> This library is unmaintained and dead, see #406
>
> Also, https://github.com/VeNoMouS/cloudscraper/ was just republished, but the script still does not work.

I just saw on Twitter that a week ago Cloudflare posted an announcement about "cloudscraper", saying that the new update destroyed cloudscraper forever. I think the cloudscraper admin @VeNoMouS can no longer fix this public repo; probably only his paid version will be kept updated.

A new method would be to use Puppeteer. I found this Puppeteer-based version of cloudscraper: https://github.com/ryxnSZN/cloudscraper. It needs a few updates to bypass the IUAM page.

Cloudflare no longer sets the __cfduid cookie (Set-Cookie header).

Some info about the new challenge is here: https://github.com/scaredos/cfresearch

VeNoMouS commented 3 years ago

I choose not to update the public one; it's not a case of "can no longer fix"...

The paid subscription code works fine... (find me on discord if you wish to inquire)