Anorov / cloudflare-scrape

A Python module to bypass Cloudflare's anti-bot page.
MIT License

Unable to identify Cloudflare IUAM Javascript on Extreme-Down #353

Open mdubrois opened 4 years ago

mdubrois commented 4 years ago

Before creating an issue, first upgrade cfscrape with pip install -U cfscrape and see if you're still experiencing the problem. Please also confirm your Node version (node --version or nodejs --version) is version 10 or higher.
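The Node version check above can be scripted. This is a minimal sketch, not part of cfscrape: the helper names `parse_node_major` and `find_node_major` are made up for illustration, and it assumes the binary prints a version string such as `v10.16.0`.

```python
import subprocess


def parse_node_major(version_output):
    """Return the major version from output such as 'v10.16.0'."""
    text = version_output.strip().lstrip("v")
    return int(text.split(".")[0])


def find_node_major(commands=("node", "nodejs")):
    """Try each candidate binary; return its major version, or None if none runs."""
    for command in commands:
        try:
            output = subprocess.check_output([command, "--version"])
        except (OSError, subprocess.CalledProcessError):
            # Binary missing or exited with an error; try the next name.
            continue
        return parse_node_major(output.decode("ascii", "replace"))
    return None


if __name__ == "__main__":
    major = find_node_major()
    if major is None:
        print("Node.js not found on PATH")
    elif major < 10:
        print("Node.js %d is too old; version 10 or higher is needed" % major)
    else:
        print("Node.js %d looks fine" % major)
```

Both `node` and `nodejs` are tried because some distributions install the binary under the second name.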

Make sure the website you're having issues with is actually using anti-bot protection by Cloudflare and not a competitor like Imperva Incapsula or Sucuri. And if you're using an anonymizing proxy, a VPN, or Tor, Cloudflare often flags those IPs and may block you or present you with a captcha as a result.
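One rough way to check whether a response really is a Cloudflare "I'm Under Attack Mode" (IUAM) challenge is to look at the status code, the `Server` header, and the challenge form fields in the body. The sketch below is a heuristic under those assumptions, not cfscrape's own detection code; the function name `looks_like_cloudflare_iuam` is hypothetical, and the markers `jschl_vc` and `jschl_answer` are the field names the challenge page is assumed to contain.

```python
def looks_like_cloudflare_iuam(status_code, headers, body):
    """Heuristically decide whether a response is a Cloudflare IUAM challenge.

    status_code: HTTP status as an int
    headers: mapping of header names to values
    body: response body as text
    """
    server = ""
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lower case.
        if name.lower() == "server":
            server = value.lower()
    return (
        status_code in (503, 429)
        and server.startswith("cloudflare")
        and "jschl_vc" in body
        and "jschl_answer" in body
    )
```

With the `requests` library, this could be called as `looks_like_cloudflare_iuam(resp.status_code, resp.headers, resp.text)` on a response fetched without cfscrape. A `False` result suggests the page is protected by something else, or by a different kind of Cloudflare challenge such as a captcha.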

Please confirm the following statements and check the boxes before creating an issue:

Python version number

Run python --version and paste the output below:

Python 2.7.12

cfscrape version number

Run pip show cfscrape and paste the output below:

Name: cfscrape
Version: 2.1.1
Summary: A simple Python module to bypass Cloudflare's anti-bot page. See https://github.com/Anorov/cloudflare-scrape for more information.
Home-page: https://github.com/Anorov/cloudflare-scrape
Author: Anorov
Author-email: anorov.vorona@gmail.com
License: UNKNOWN
Location: /usr/lib/python2.7/site-packages
Requires: requests
Required-by:

Code snippet involved with the issue

# coding: utf8
"""This module is used for ..."""
from __future__ import absolute_import, division, print_function, unicode_literals
from builtins import *  # noqa

import logging
import cfscrape

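# Send log records to stderr so the fetched page content is visible.
handler = logging.StreamHandler()
LOG = logging.getLogger(__name__)
LOG.addHandler(handler)
LOG.setLevel(logging.INFO)


def main():
    """Fetch one Cloudflare-protected page through cfscrape and log its body."""
    url = "https://www.extreme-down.ninja/films-new-hd/page/1/"
    # delay=10 makes cfscrape wait 10 seconds before submitting the challenge answer.
    scraper = cfscrape.create_scraper(delay=10)
    LOG.info(url)
    content = scraper.get(url).content
    LOG.info(content)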

if __name__ == "__main__":
    main()

Complete exception and traceback

(If the problem doesn't involve an exception being raised, leave this blank)

Traceback (most recent call last):
  File "./test_scraper.py", line 25, in <module>
    main()
  File "./test_scraper.py", line 19, in main
    content = scraper.get(url).content
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests/sessions.py", line 546, in get
    return self.request('GET', url, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/cfscrape/__init__.py", line 129, in request
    resp = self.solve_cf_challenge(resp, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/cfscrape/__init__.py", line 204, in solve_cf_challenge
    answer, delay = self.solve_challenge(body, domain)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/cfscrape/__init__.py", line 292, in solve_challenge
    % BUG_REPORT
ValueError: Unable to identify Cloudflare IUAM Javascript on website. Cloudflare may have changed their technique, or there may be a bug in the script.

Please read https://github.com/Anorov/cloudflare-scrape#updates, then file a bug report at https://github.com/Anorov/cloudflare-scrape/issues."

URL of the Cloudflare-protected page

https://www.extreme-down.ninja/films-new-hd/page/1/

URL of Pastebin/Gist with HTML source of protected page

https://gist.github.com/mdubrois/dbf6c90bf299b0605160b776ce62a3eb
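For anyone else reproducing this, the HTML source of the challenge page can be saved to a file before uploading it to a Gist. This is only a sketch: `save_page_source` is a hypothetical helper, and `session` is assumed to be any object with a `get(url)` method returning a response with `content` and `status_code` attributes, such as a plain `requests.Session()` (not the cfscrape scraper, which raises before the body can be inspected).

```python
import io


def save_page_source(session, url, path):
    """Fetch url without solving any challenge and write the raw body to path.

    Returns the HTTP status code so the caller can confirm a challenge
    page (typically a 503) was captured rather than the real content.
    """
    resp = session.get(url)
    # Write bytes exactly as received to avoid any encoding changes.
    with io.open(path, "wb") as handle:
        handle.write(resp.content)
    return resp.status_code
```

A possible call is `save_page_source(requests.Session(), url, "challenge.html")`, after which `challenge.html` holds the page that cfscrape failed to parse.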

JohnTravolski commented 4 years ago

Same problem.

MostafaWahdan commented 4 years ago

I have the same issue