ultrafunkamsterdam / undetected-chromedriver

Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / CloudFlare IUAM)
https://github.com/UltrafunkAmsterdam/undetected-chromedriver
GNU General Public License v3.0
9.83k stars 1.15k forks

Linux: (ONLY SOMETIMES) unknown error: cannot connect to chrome at 127.0.0.1:33573 #680

Open own3mall opened 2 years ago

own3mall commented 2 years ago

undetected-chromedriver is an amazing piece of work, but I cannot get it to work reliably in Ubuntu 20.04.

Half the time my script runs, it fails with an error similar to the following:

Major exception Message: unknown error: cannot connect to chrome at 127.0.0.1:33573
from chrome not reachable
Stacktrace:
#0 0x56006ef34f33 <unknown>
#1 0x56006ec7efaf <unknown>
#2 0x56006ec6d209 <unknown>
#3 0x56006eca5a79 <unknown>
#4 0x56006ec9da06 <unknown>
#5 0x56006ecd8d3a <unknown>
#6 0x56006ecd2e63 <unknown>
#7 0x56006eca882a <unknown>
#8 0x56006eca9985 <unknown>
#9 0x56006ef794cd <unknown>
#10 0x56006ef7d5ec <unknown>
#11 0x56006ef6371e <unknown>
#12 0x56006ef7e238 <unknown>
#13 0x56006ef58870 <unknown>
#14 0x56006ef9a608 <unknown>
#15 0x56006ef9a788 <unknown>
#16 0x56006efb4f1d <unknown>
#17 0x7f8270612609 <unknown>

However, this same script works some of the time with undetected-chromedriver, so I don't think it's a python script issue. It seems to not be able to reliably connect to chrome for some reason. Does anyone else have this problem?

I'm running the latest version of undetected-chromedriver, Chrome, and Selenium.

chris-aeviator commented 2 years ago

I started seeing this today for the first time. In my case it now fails every time. I have to reboot my system (killing all Chrome processes had no effect), and then I can run it once before I need another reboot. Before this, I was running it daily with a 100% success rate.

I can see a port mismatch between the port setup by undetected_chromedriver (it also shows a 500 for session) & selenium

2022-06-15 16:24:35,039 DEBUG undetected_chromedriver.patcher getting release number from /LATEST_RELEASE
2022-06-15 16:24:40,209 DEBUG undetected_chromedriver.patcher downloading from https://chromedriver.storage.googleapis.com/102.0.5005.61/chromedriver_linux64.zip
2022-06-15 16:24:45,853 DEBUG undetected_chromedriver.patcher unzipping /tmp/tmpxmlbm4un
2022-06-15 16:24:45,954 INFO undetected_chromedriver.patcher patching driver executable /home/[leftout]/.local/share/undetected_chromedriver/f56bfe10849264a4_chromedriver
2022-06-15 16:24:46,517 DEBUG urllib3.connectionpool Starting new HTTP connection (1): localhost:54189
2022-06-15 16:25:46,562 DEBUG urllib3.connectionpool http://localhost:54189/ "POST /session HTTP/1.1" 500 769

webdriver.log

[1655303086,012][INFO]: Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
[1655303086,520][INFO]: [b62e34d2343cef288210524c0a4f1be2] COMMAND InitSession {
   "capabilities": {
      "alwaysMatch": {
         "browserName": "chrome",
         "goog:chromeOptions": {
            "args": [ "--remote-debugging-host=127.0.0.1", "--remote-debugging-port=37119", "--user-data-dir=/tmp/tmpwzhhyvf4", "--lang=de-DE", "--no-default-browser-check", "--no-first-run", "--log-level=0" ],
            "binary": "/usr/bin/chromium",
            "debuggerAddress": "127.0.0.1:37119",
            "extensions": [  ]
         },
         "pageLoadStrategy": "normal"
      },
      "firstMatch": [ {

      } ]
   },
   "desiredCapabilities": {
      "browserName": "chrome",
      "goog:chromeOptions": {
         "args": [ "--remote-debugging-host=127.0.0.1", "--remote-debugging-port=37119", "--user-data-dir=/tmp/tmpwzhhyvf4", "--lang=de-DE", "--no-default-browser-check", "--no-first-run", "--log-level=0" ],
         "binary": "/usr/bin/chromium",
         "debuggerAddress": "127.0.0.1:37119",
         "extensions": [  ]
      },
      "pageLoadStrategy": "normal"
   }
}
[1655303146,562][INFO]: [b62e34d2343cef288210524c0a4f1be2] RESPONSE InitSession ERROR unknown error: cannot connect to chrome at 127.0.0.1:37119
from chrome not reachable
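
The `cannot connect to chrome at 127.0.0.1:37119` response means ChromeDriver could not reach the browser's remote-debugging port before it gave up. As a minimal sketch, one way to check from Python whether anything is listening on that port is a plain TCP connect. The helper name `port_is_open` is hypothetical and is not part of undetected_chromedriver or Selenium:

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

Calling `port_is_open("127.0.0.1", 37119)` while the session request is hanging would show whether Chrome ever opened the port named in the log above.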

My chrome & chromedriver versions match and actually I have not changed anything in my project but just tried to create a second project using undetected_chromedriver within the same conda env.

I can also see a new chromedriver being downloaded/patched for each run, which seems totally unnecessary

~/.local/share/undetected_chromedriver/
30b225bacd93432e_chromedriver*  58f369178137b0b7_chromedriver*  729fe850740f2501_chromedriver*  80e825f657c95ef4_chromedriver*  8d1a892e9b27b693_chromedriver*  b0f5a9a312764499_chromedriver*  c5182e6000d25fa5_chromedriver*  f56bfe10849264a4_chromedriver*
42415a6a4ce6a491_chromedriver*  5da919029cf4b69b_chromedriver*  7a95e00406c57f0b_chromedriver*  8379d22d1140ff71_chromedriver*  9807223c6a71296c_chromedriver*  b98f60a8c404954e_chromedriver*  e3541ab2f0786a04_chromedriver*  f722d7434b563b96_chromedriver*
4cfa86b1438698ff_chromedriver*  6b755e625e03bcb5_chromedriver*  7fb84b1887d09168_chromedriver*  8866ac9cbe9c3d93_chromedriver*  a871b60535efb05e_chromedriver*  c0f185d390b52e1c_chromedriver*  e4c56ea41e447809_chromedriver*
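
A rough sketch for inspecting that directory from Python, assuming the patched drivers keep the `<hash>_chromedriver` naming shown in the listing. The function names are hypothetical, and nothing here deletes files; it only reports which copies are older than the most recent one:

```python
from pathlib import Path


def list_patched_drivers(directory):
    """Return patched driver files in `directory`, newest first."""
    root = Path(directory).expanduser()
    if not root.is_dir():
        return []
    drivers = [p for p in root.glob("*_chromedriver") if p.is_file()]
    return sorted(drivers, key=lambda p: p.stat().st_mtime, reverse=True)


def stale_drivers(directory, keep=1):
    """Return everything except the `keep` most recently modified drivers."""
    return list_patched_drivers(directory)[keep:]
```

For example, `stale_drivers("~/.local/share/undetected_chromedriver")` would list the leftover copies from earlier runs.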

UPDATE

I can see the same issue when using https://hub.docker.com/r/ultrafunk/undetected-chromedriver

Update 2

It is a very strange bug - I was still using UC hours before it stopped working forever - a reboot does not solve it, I have not changed anything on the machine. Guess Chrome pushed something on their side?!

chris-aeviator commented 2 years ago

@own3mall are you executing via SSH? It turned out for me executing on a bare terminal works while executing over SSH does not - though it did for almost 30 days before without issues.

rtrive commented 2 years ago

I have seen the same problem since yesterday, and I'm using https://hub.docker.com/r/ultrafunk/undetected-chromedriver, but on Docker Hub I see no changes to the Dockerfile in the last 15 days. The problem only shows up on GCP Cloud Run, not when I run the Docker image on my machine.

own3mall commented 2 years ago

@own3mall are you executing via SSH? It turned out for me executing on a bare terminal works while executing over SSH does not - though it did for almost 30 days before without issues.

I was running it via X2Go which uses SSH, so I guess, yes? It is running in an X session though. It works some of the time, but doesn't always work. Rebooting doesn't seem to matter.

cpatrickalves commented 2 years ago

I am also having the same issue when running inside docker (including when using the image https://hub.docker.com/r/ultrafunk/undetected-chromedriver).

chris-aeviator commented 2 years ago

May I ask @ultrafunkamsterdam for a short feedback as in "I have no freaking Idea why this happens"/ "I might have an idea"/ "I don't have any time" so people affected by this can plan around this issue :pray: ?!

I see myself looking into it and contributing a PR but any hunch would be highly beneficial.

AnkurDahama commented 2 years ago

Looking forward to your possible PR. Having the same issues. Runs 50% of the time. @chris-aeviator

JoonaFinland commented 2 years ago

Having the same issue, for me it also seems to work only half the time. Stopped working about a week ago

rtrive commented 2 years ago

I tried to run it on Cloud Run and I hit this issue. I ran it on Heroku for a week and everything was fine until today.

own3mall commented 2 years ago

Also, running headless never works at all for me. Chrome never launches or does anything if you try to run it headless.

chris-aeviator commented 2 years ago

Also, running headless never works at all for me. Chrome never launches or does anything if you try to run it headless.

Headless is not supported as per readme of this repo

Andrej-VB commented 2 years ago

A similar error occurs to me if I don't update my Google Chrome to the latest version. (By opening Chrome, clicking on 3 dots in the upper right corner, Help, About Google Chrome)

If I'm not mistaken, the correct driver downloads itself afterwards on launch, it is just that the official Google Chrome has to be up to date

AnkurDahama commented 2 years ago

Headless is not supported as per readme of this repo

I’m running it on an EC2 instance with xvfb. So technically not headless. Could that cause problems?

sebdelsol commented 2 years ago

My chrome & chromedriver versions match

Sure, but this kind of issue is often due to a main version mismatch between the driver and the browser: you'll see it raised here and there whenever Google does a major version bump.

You can check the debug log as you did with logging.basicConfig(level=logging.DEBUG)
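
A minimal sketch of that logging setup, placed before the driver is created. The wrapper function name is hypothetical; the logger names `undetected_chromedriver.patcher` and `urllib3.connectionpool` are the ones visible in the debug output posted earlier in this thread:

```python
import logging


def enable_debug_logging():
    # force=True (Python 3.8+) replaces handlers that another library
    # may already have installed on the root logger.
    logging.basicConfig(
        level=logging.DEBUG,
        format="%(asctime)s %(levelname)s %(name)s %(message)s",
        force=True,
    )


enable_debug_logging()
```

With this in place, the patcher should log which driver release it downloads, which can then be compared with the installed browser version.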

Anyway there's a workaround for this recurring issue. It's only meant for Windows though. But you can probably do something like that on Linux : (EDIT) Fix the code to get the Chrome version as suggested by @tstoco.

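import re
import subprocess

import undetected_chromedriver as uc

def get_chrome_main_version():
    # Ask the browser binary that UC would launch for its version string,
    # e.g. "Google Chrome 102.0.5005.61" or "Chromium 102.0.5005.61".
    chrome_path = uc.find_chrome_executable()
    output = subprocess.check_output([chrome_path, "--version"], text=True)
    # str.strip("Google Chrome") strips a character set, not a prefix, so it
    # breaks on "Chromium ..."; take the first run of digits before a dot instead.
    match = re.search(r"(\d+)\.", output)
    if match is None:
        raise RuntimeError(f"could not parse Chrome version from {output!r}")
    return int(match.group(1))

if __name__ == "__main__":
    # Chrome auto-updates, so read the installed version at runtime:
    version_main = get_chrome_main_version()

    # force the driver to download the same main version:
    driver = uc.Chrome(version_main=version_main)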

I can see also a new chromedriver being downloaded/patched for each run which seems totally unnecessary

Here's the UC's author answer about that. :grin:

I can see a port mismatch between the port setup by undetected_chromedriver & Selenium

You indeed see 2 different ports :

Also, running headless never works at all for me. Chrome never launches or does anything if you try to run it headless. Headless is not supported as per readme of this repo

Headless works fine, but there's no guarantee about detection, since there are many ways to detect a headless browser and you'll usually be one evasion behind the most aggressive bot detection vendors.

chris-aeviator commented 2 years ago

Headless is not supported as per readme of this repo

I’m running it on an EC2 instance with xvfb. So technically not headless. Could that cause problems?

as long as you don't let Chrome start with the --headless argument

elandorr commented 2 years ago

Chrome won't run in docker unless you turn off the sandbox and such. Did you check that? That was the problem here.

chris-aeviator commented 2 years ago

To recap: This issue is not about

it is about previously running code erroring for a given amount of people all on the same day.

To keep a better overview (many people are getting involved right now), I propose we keep other discussions to their respective issues.

elandorr commented 2 years ago

The original post states:

Half the time my script runs, it fails with an error similar to the following:

I can reproduce this and fix it. Google explains this on a FAQ. You may never see an error if you get lucky and stay just below the limit. Sites change, chromedriver changes.

UPDATE

I can see the same issue when using https://hub.docker.com/r/ultrafunk/undetected-chromedriver

If you look into that container you'll see one reason why: it doesn't set all the required flags. For a single small test it shouldn't matter, but apparently resource usage isn't consistent.

for a given amount of people all on the same day.

A handful of people are hardly statistically significant, but you may experience Google finally pushing fixes they promised. (Such as this breaking bug which happens to be random as well.) UC always re-downloads on every session, so you're stuck with a little entropy, unless you modify the script and keep it static.

chris-aeviator commented 2 years ago

I can see the same issue when using https://hub.docker.com/r/ultrafunk/undetected-chromedriver

This is titled “update” since it was an attempt to get around this issue. The issue does persist outside of Docker, and I only used Docker here as a means to double-check that it's not my system causing this.

41v4 commented 2 years ago

I am having the same issue. Did anyone find a solution?

chris-aeviator commented 2 years ago

try export DISPLAY=:0 (or %env DISPLAY=:0 in jupyter) before running your script
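
The same thing can be done from inside the script, which may help when it is started from an SSH session or a service where DISPLAY is not set. This is only a sketch: the helper name `ensure_display` is hypothetical, and `:0` is an assumption that only works if an X server is actually running on display 0 and the user may access it.

```python
import os


def ensure_display(default=":0"):
    """Set DISPLAY to `default` if it is missing or empty; return the value in use."""
    if not os.environ.get("DISPLAY"):
        os.environ["DISPLAY"] = default
    return os.environ["DISPLAY"]
```

Call `ensure_display()` before creating the driver so that the Chrome process inherits the variable.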

timakovi commented 2 years ago

Hi guys! Try:

import undetected_chromedriver as uc

options = uc.ChromeOptions()
options.arguments.extend(["--no-sandbox", "--disable-setuid-sandbox"])  # << this
driver = uc.Chrome(options=options)

own3mall commented 2 years ago

Hi guys! Try:

import undetected_chromedriver as uc

options = uc.ChromeOptions()
options.arguments.extend(["--no-sandbox", "--disable-setuid-sandbox"])  # << this
driver = uc.Chrome(options=options)

This didn't work for me unfortunately. Same issue as before. It works sometimes and not others.

kSinghParth commented 2 years ago

I am facing a similar issue to everyone else here. However, while the script was still running successfully (before one day it decided not to), I had the --headless argument set. So I guess even having the --headless argument doesn't hurt. FYI, I'm running the script on a headless remote server.

chris-aeviator commented 2 years ago

It very much seems to me that what @elandorr suggested is the case - chrome team pushing updates (without properly representing that in semver) and the download on each run of UC fetching updates that break code that was previously working. A strategy for me was to isolate my issue and then make sure to save and load a working chrome driver with my workaround. The issues seem various but all expressed in the same “could not connect error”
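
One way the "save and load a working chrome driver" idea could look, as a sketch only. It assumes the installed UC release accepts a `driver_executable_path` argument on `uc.Chrome` (older releases may not), and the helper names and paths are hypothetical:

```python
import shutil
from pathlib import Path


def pin_driver(known_good, pinned):
    """Copy a driver binary that is known to work to a fixed location."""
    pinned = Path(pinned).expanduser()
    pinned.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(known_good, pinned)  # copy2 also copies permission bits
    return pinned


def launch_with_pinned_driver(pinned):
    # Imported lazily so pin_driver works without UC installed.
    import undetected_chromedriver as uc

    # Assumption: this UC release supports driver_executable_path.
    return uc.Chrome(driver_executable_path=str(pinned))
```

The point of the design is that every run starts from the same driver file instead of whatever the per-run download fetches, at the cost of updating the pinned copy by hand when Chrome moves to a new major version.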

emibonezzi commented 2 years ago

It very much seems to me that what @elandorr suggested is the case - chrome team pushing updates (without properly representing that in semver) and the download on each run of UC fetching updates that break code that was previously working. A strategy for me was to isolate my issue and then make sure to save and load a working chrome driver with my workaround. The issues seem various but all expressed in the same “could not connect error”

Did you find a solution? You added a custom executable path to your uc.Chrome?

lvzenglei commented 2 years ago

I updated Chrome to version 105, installed undetected_chromedriver, and then ran the code below:

import undetected_chromedriver as uc

options = uc.ChromeOptions()
options.arguments.extend(["--no-sandbox", "--disable-setuid-sandbox", "--headless"])
driver = uc.Chrome(options=options)

That solved it for me; I don't know whether it will solve yours.

westonplatter commented 2 years ago

I tried a modified version of @lvzenglei's solution, and it worked for me in docker. Couple of notes,

  1. In my Dockerfile, I made sure I was on an updated version of Chrome,

    RUN apt update && apt install -y chromium chromium-driver
  2. And then, once inside the docker container (e.g., docker-compose run mycontainer /bin/bash):

    $ python
    >>> import undetected_chromedriver as uc
    >>> options = uc.ChromeOptions()
    >>> options.arguments.extend(["--no-sandbox", "--disable-setuid-sandbox","--headless", "--disable-dev-shm-usage"])
    >>> driver = uc.Chrome(options=options)
    >>> driver.get("https://google.com")

estromenko commented 1 year ago

Having the same issue. The problem appears only sometimes inside a Docker container; I cannot reproduce it on my local machine. I am pretty sure that the Chrome and chromedriver major versions match because the version_main option is used. Additionally, my app has a /dev/shm volume. Interesting fact: if I run my app inside a Kubernetes cluster, the problem appears much more often. I have to redeploy my app every day because after some time the issue appears on every run.

timakovi commented 1 year ago

I think the problem shows up if the local machine's OS is not Linux :) Maybe it can be resolved by building a multi-architecture (or non-Linux) Docker image? Example: https://blog.jaimyn.dev/how-to-build-multi-architecture-docker-images-on-an-m1-mac/

estromenko commented 1 year ago

The only difference is that in the container I use Debian, but my local OS is Linux Mint. I don't think this is related to the main issue.

sdfgsdfgd commented 1 year ago

it seems to be Debian related, there are wrappers around chrome's alternative binaries on Debian, /etc/alternatives/google-chrome etc .. sometimes it's choosing to launch chromium for some reason, when you launch a script multiple times, almost half of the time 40-60% .

setting the browser binary with ChromeOptions works 👍
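
A sketch of what pinning the browser binary might look like. `binary_location` is the standard Selenium `ChromeOptions` property; the helper names and the candidate list are hypothetical and should be adjusted to the binary you actually want:

```python
import shutil


def pick_browser_binary(
    candidates=("google-chrome-stable", "google-chrome"), which=shutil.which
):
    """Return the path of the first candidate found on PATH, else None."""
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None


def make_options(binary):
    # Imported lazily so pick_browser_binary works without UC installed.
    import undetected_chromedriver as uc

    options = uc.ChromeOptions()
    options.binary_location = binary  # always launch this exact browser
    return options
```

Usage would be along the lines of `uc.Chrome(options=make_options(pick_browser_binary()))`, after checking that a binary was actually found.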

estromenko commented 1 year ago

it seems to be Debian related, there are wrappers around chrome's alternative binaries on Debian, /etc/alternatives/google-chrome etc .. sometimes it's choosing to launch chromium for some reason, when you launch a script multiple times, almost half of the time 40-60% .

setting the browser binary with ChromeOptions works +1

There is a find_chrome_executable function in the source code that chooses the first available executable from these values: google-chrome, chromium, chromium-browser, chrome, google-chrome-stable. Maybe this is the case you mentioned? Can you describe how you found that out?
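
To see which of those names exist on a given machine and what they really point to (for example through /etc/alternatives symlinks), something like the following sketch could help. The candidate tuple is copied from the comment above rather than read from UC, and the function name is hypothetical:

```python
import os
import shutil

# Names as listed in the comment above; not read from UC itself.
CANDIDATES = (
    "google-chrome",
    "chromium",
    "chromium-browser",
    "chrome",
    "google-chrome-stable",
)


def resolve_candidates(names=CANDIDATES, which=shutil.which):
    """Map each name found on PATH to the real file behind any symlinks."""
    found = {}
    for name in names:
        path = which(name)
        if path:
            found[name] = os.path.realpath(path)
    return found
```

Printing `resolve_candidates()` inside the container and on the local machine would show whether the two environments resolve to different browsers. Note that `realpath` only follows symlinks; a shell wrapper script would still need to be read by hand.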