seleniumbase / SeleniumBase

📊 Python's all-in-one framework for web crawling, scraping, testing, and reporting. Supports pytest. UC Mode provides stealth. Includes many tools.
https://seleniumbase.io
MIT License

Cannot bypass the check box of the CF of this website #2886

Closed Orbiszeus closed 2 days ago

Orbiszeus commented 2 days ago

This is where the iframe is located:

Screenshot 2024-06-30 at 15 00 00

This is the website:

Screenshot 2024-06-30 at 15 03 37

I am having trouble bypassing the Cloudflare (CF) checkbox on this website. I am using the newly updated functions. When the browser opens, the updated uc_click() method, which is now handled by uc_gui_handle_cf(), does not work. What can I do to get through this? The iframe is located in the element with id="cf-chl-widget-om6tb". My code looks like this:


  with SB(uc=True, test=True, rtf=True,incognito=True, agent=agent, headless=False) as sb:
      sb.driver.uc_open_with_disconnect(url, 6)
      # sb.scroll_to("iframe")
      sb.uc_gui_handle_cf("#cf-chl-widget-om6tb iframe")
      sb.set_messenger_theme(location="bottom_center")
      sb.post_message("SeleniumBase wasn't detected!")

Exception I am getting :


urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x111201e40>: Failed to establish a new connection: [Errno 61] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/Users/orbiszeus/.vscode/extensions/ms-python.debugpy-2024.6.0-darwin-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 39, in <module>
    cli.main()
  File "/Users/orbiszeus/.vscode/extensions/ms-python.debugpy-2024.6.0-darwin-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 430, in main
    run()
  File "/Users/orbiszeus/.vscode/extensions/ms-python.debugpy-2024.6.0-darwin-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 284, in run_file
    runpy.run_path(target, run_name="__main__")
  File "/Users/orbiszeus/.vscode/extensions/ms-python.debugpy-2024.6.0-darwin-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 321, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/Users/orbiszeus/.vscode/extensions/ms-python.debugpy-2024.6.0-darwin-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 135, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/Users/orbiszeus/.vscode/extensions/ms-python.debugpy-2024.6.0-darwin-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
    exec(code, run_globals)
  File "/Users/orbiszeus/metro_analyst/menu_crawler.py", line 570, in <module>
    main()
  File "/Users/orbiszeus/metro_analyst/menu_crawler.py", line 545, in main
    sb.uc_gui_handle_cf("#cf-chl-widget-om6tb iframe")
  File "/Users/orbiszeus/metro_analyst/myenv/lib/python3.10/site-packages/seleniumbase/core/browser_launcher.py", line 4031, in <lambda>
    lambda *args, **kwargs: uc_gui_handle_cf(
  File "/Users/orbiszeus/metro_analyst/myenv/lib/python3.10/site-packages/seleniumbase/core/browser_launcher.py", line 653, in uc_gui_handle_cf
    source = driver.get_page_source()
  File "/Users/orbiszeus/metro_analyst/myenv/lib/python3.10/site-packages/seleniumbase/core/sb_driver.py", line 50, in get_page_source
    return self.driver.page_source
  File "/Users/orbiszeus/metro_analyst/myenv/lib/python3.10/site-packages/seleniumbase/undetected/__init__.py", line 330, in __getattribute__
    return super().__getattribute__(item)
  File "/Users/orbiszeus/metro_analyst/myenv/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 455, in page_source
    return self.execute(Command.GET_PAGE_SOURCE)["value"]
  File "/Users/orbiszeus/metro_analyst/myenv/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 352, in execute
    response = self.command_executor.execute(driver_command, params)
  File "/Users/orbiszeus/metro_analyst/myenv/lib/python3.10/site-packages/selenium/webdriver/remote/remote_connection.py", line 302, in execute
    return self._request(command_info[0], url, body=data)
  File "/Users/orbiszeus/metro_analyst/myenv/lib/python3.10/site-packages/selenium/webdriver/remote/remote_connection.py", line 322, in _request
    response = self._conn.request(method, url, body=body, headers=headers)
  File "/Users/orbiszeus/metro_analyst/myenv/lib/python3.10/site-packages/urllib3/_request_methods.py", line 136, in request
    return self.request_encode_url(
  File "/Users/orbiszeus/metro_analyst/myenv/lib/python3.10/site-packages/urllib3/_request_methods.py", line 183, in request_encode_url
    return self.urlopen(method, url, **extra_kw)
  File "/Users/orbiszeus/metro_analyst/myenv/lib/python3.10/site-packages/urllib3/poolmanager.py", line 444, in urlopen
    response = conn.urlopen(method, u.request_uri, **kw)
  File "/Users/orbiszeus/metro_analyst/myenv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 877, in urlopen
    return self.urlopen(
  File "/Users/orbiszeus/metro_analyst/myenv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 877, in urlopen
    return self.urlopen(
  File "/Users/orbiszeus/metro_analyst/myenv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 877, in urlopen
    return self.urlopen(
  File "/Users/orbiszeus/metro_analyst/myenv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 847, in urlopen
    retries = retries.increment(
  File "/Users/orbiszeus/metro_analyst/myenv/lib/python3.10/site-packages/urllib3/util/retry.py", line 515, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=53964): Max retries exceeded with url: /session/33d97a286c7d6d58961e0ec9b48dcc5f/source (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x111201e40>: Failed to establish a new connection: [Errno 61] Connection refused'))

mdmintz commented 2 days ago

I'm running this:

from seleniumbase import SB

with SB(uc=True) as sb:
    url = "https://www.yemeksepeti.com/city/istanbul/area/besiktas-etiler-mah"
    sb.driver.uc_open_with_reconnect(url, 6)
    sb.uc_gui_handle_cf()
    sb.highlight('span[aria-label="yemeksepeti"]')
    breakpoint()

And seeing this:

Screenshot 2024-06-30 at 9 13 23 AM


seleniumbase version: 4.28.2
urllib3 version: 2.2.2
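
The traceback above ends in a `MaxRetryError` caused by `Connection refused` on `localhost`, which is what Selenium reports when the local chromedriver process is no longer accepting commands. One plausible reading (an assumption, not confirmed in this thread) is that `uc_open_with_disconnect()` left the driver disconnected, so the following `get_page_source()` call had nothing to talk to; the working snippet uses `uc_open_with_reconnect()` instead. A minimal sketch of a helper that recognizes this failure mode by walking the exception chain. The name `looks_like_dead_driver` is hypothetical and not part of SeleniumBase, and the message match is tuned to the macOS/Linux wording seen in the traceback:

```python
def looks_like_dead_driver(exc):
    """Return True if the exception chain contains a refused connection.

    Hypothetical helper, not a SeleniumBase API. It only inspects
    exception types and messages, so it needs no third-party imports.
    """
    seen = set()
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))
        # Direct OS-level refusal.
        if isinstance(exc, ConnectionRefusedError):
            return True
        # Wrapped refusal, e.g. inside urllib3's MaxRetryError message.
        if "Connection refused" in str(exc):
            return True
        # Follow "raise ... from ..." first, then implicit chaining.
        exc = exc.__cause__ or exc.__context__
    return False
```

Wrapping the `sb.uc_gui_handle_cf()` call in `try`/`except Exception as e` and checking `looks_like_dead_driver(e)` would separate "the driver is gone" from "the selector did not match", which call for different fixes.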

Orbiszeus commented 2 days ago

Thank you so much for the quick response. However, it is now showing me another reCAPTCHA, which looks like this:

Screenshot 2024-06-30 at 15 42 48

And the HTML content of the iframe is: `

`

Orbiszeus commented 2 days ago

Also, as you can see, the same page has many restaurant links. How can I click the href links inside those divs with SeleniumBase?

mdmintz commented 2 days ago

When I went directly to https://www.yemeksepeti.com/en/, I didn't get the reCAPTCHA.

from seleniumbase import SB

with SB(uc=True) as sb:
    url = "https://www.yemeksepeti.com/en/"
    sb.driver.uc_open_with_reconnect(url, 6)
    sb.uc_gui_handle_cf()
    sb.highlight('span[aria-label="yemeksepeti"]')
    breakpoint()

If you do hit reCAPTCHA, you'll need to solve the audio challenge with an external repo: https://github.com/search?q=pydub.AudioSegment.from_mp3+recaptcha+solver+language%3APython&type=code

As for clicking the links, there are a lot of them, depending on your location:

len(sb.find_elements("li.vendor-tile-new-m a"))
45

You can click one like this:

sb.click_nth_visible_element("li.vendor-tile-new-m a", 5)
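
Because `find_elements()` returns a list of elements, the restaurant links can also be gathered as URLs before deciding which ones to visit. A minimal sketch, assuming the `li.vendor-tile-new-m a` selector above still matches and that each returned element exposes Selenium's `get_attribute()`; the helper name `collect_hrefs` is hypothetical:

```python
def collect_hrefs(elements):
    """Return unique, non-empty href values in first-seen order.

    `elements` is any iterable of objects with a Selenium-style
    get_attribute(name) method, e.g. the list returned by
    sb.find_elements("li.vendor-tile-new-m a").
    """
    hrefs = []
    seen = set()
    for element in elements:
        href = element.get_attribute("href")
        # Skip anchors without an href and repeated links.
        if href and href not in seen:
            seen.add(href)
            hrefs.append(href)
    return hrefs
```

Inside the `with SB(uc=True) as sb:` block this could be called as `collect_hrefs(sb.find_elements("li.vendor-tile-new-m a"))`. Whether navigating to each URL re-triggers the Cloudflare check is not something this thread establishes.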

Orbiszeus commented 2 days ago

Again, thank you so so much!!

Orbiszeus commented 2 days ago

Also, I know I am asking a lot :( but where can I find the documentation or method descriptions for find_elements, click_nth_visible_element, etc.? Thank you!!!

mdmintz commented 2 days ago

List of methods: SeleniumBase/help_docs/method_summary.md

Lots of existing examples to learn from: SeleniumBase/examples
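
Alongside the method summary, Python's standard introspection can show what a method accepts without leaving the interpreter. A minimal sketch using only the standard library; `describe_methods` is a hypothetical helper, and how much it reports depends on whether the methods carry docstrings:

```python
import inspect


def describe_methods(obj, names):
    """Map each method name to 'name(signature): first docstring line'."""
    summary = {}
    for name in names:
        method = getattr(obj, name, None)
        if not callable(method):
            summary[name] = "<not found>"
            continue
        try:
            signature = str(inspect.signature(method))
        except (TypeError, ValueError):
            # Some builtins and wrapped callables expose no signature.
            signature = "(...)"
        doc = inspect.getdoc(method) or ""
        first_line = doc.splitlines()[0] if doc else "<no docstring>"
        summary[name] = f"{name}{signature}: {first_line}"
    return summary
```

With a live session, `describe_methods(sb, ["find_elements", "click_nth_visible_element"])` would summarize the two methods asked about; the built-in `help(sb.find_elements)` gives the full docstring for a single method.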