SeleniumHQ / selenium

A browser automation framework and ecosystem.
https://selenium.dev
Apache License 2.0
30.82k stars 8.21k forks source link

[šŸ› Bug]: OSError Text file busy on .cache/selenium #13511

Open anteph opened 10 months ago

anteph commented 10 months ago

What happened?

Hello!

I'm using Selenium with Chrome driver to download some web pages. I'm running the browser in headless mode. This piece of code is very basic and roughly like:

from selenium import webdriver

...
driver = webdriver.Chrome()

driver.get("...some url")

driver.quit()

So, nothing special about it I believe.

The bit of the application that uses Selenium is running on Celery. The Celery worker is using a fork model, meaning that I have couple of processes (not threads) in each worker.

Sporadically, I've been seeing an OSError: text file busy on ~/.cache/selenium/chromedriver/linux64/121.0.6167.85/chromedriver.

The issue always occurs on the instantiation of the driver: driver = webdriver.Chrome()

A couple of observations:

Given this seems to happen only once when a new pod launches, under load, and never repeats again, and that it happens on the instantiation of the driver, my suspicion is that there is some race condition on the Selenium Manager. As far as I've understood, Selenium Manager is the piece that is responsible for checking if the Chrome Driver .exe is already in the cache and if not, copy it to there. Maybe two processes are looking to the folder at the same time, see no .exe, and both try to copy it to there, causing the issue?

I've also seen this issue recently when running some of my unit tests on the same logic from above (although in this case I'm testing directly the function that downloads the pages, not the celery worker). But again, seems to be some parallelization related issue.

I've seen a couple of closed issues which mention a similar text file busy error, but they are closed by now and I think they are not exactly the same use case, but I apologize if I misunderstood and this is related.

Thank you :)

How can we reproduce the issue?

I cannot reproduce it myself, it happens very sporadically as described above, and usually under load.

Relevant log output

I'm not collecting any logs, but I think the exception mentioned above is more relevant.

Operating System

Linux - Debian 11 (running in Docker)

Selenium version

Python - 4.17.2

What are the browser(s) and version(s) where you see this issue?

Chrome 121.0.6167.85

What are the browser driver(s) and version(s) where you see this issue?

Chrome Driver 121.0.6167.85

Are you using Selenium Grid?

No

github-actions[bot] commented 10 months ago

@anteph, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

titusfortner commented 10 months ago

Logs (or even a stack trace) would help provide context for what is happening with this exception occurs.

Without more info, I'm not sure how this would happen. Selenium ensures that each driver constructor goes through Selenium Manager which makes sure the right driver is available for use. I'm not sure how Selenium Manager on the one hand tells the code which driver to use at the same time that it isn't ready for use.

anteph commented 10 months ago

Here is the stack trace:

OSError: [Errno 26] Text file busy: '/home/app/.cache/selenium/chromedriver/linux64/121.0.6167.85/chromedriver'
File "my_app/handler.py" line 40, in handle
     driver = webdriver.Chrome(options=webdriver_options)
File "selenium/webdriver/chrome/webdriver.py", line 45, in __init__
    super().__init__(
File "selenium/webdriver/chromium/webdriver.py", line 53, in __init__
    self.service.start()
File "selenium/webdriver/common/service.py, line 98, in start
    self._start_process(self._path)
File "selenium/webdriver/common/service.py, line 204, in _start_process
    self.process = subprocess.Popen(
File "subprocess.py", line 971, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
File "subprocess.py", line 1863, in _execute_child
    raise child_exception_type(errno_num, err_msgm err_filename)

Unfortunately I don't have any logs as I didn't find a way yet to reproduce it, if I manage to I'll try to get some logs.

Please let me know if I can provide any additional context that may be helpul.

Thanks!

titusfortner commented 10 months ago

@bonigarcia this error seems to indicate that when running in parallel :

Any ideas how this might happen? Would ttl affect it somehow?

masato-hi commented 10 months ago

In Ruby, we were able to pre-download the webdriver by running the following code before running Selenium in multiple threads.

require 'selenium/webdriver'

options = Selenium::WebDriver::Chrome::Options.new
Selenium::WebDriver::SeleniumManager.driver_path(options)

The code to do the same thing in Python probably looks something like this.

from selenium.webdriver.common.selenium_manager import SeleniumManager
from selenium.webdriver.chrome.options import Options

options = Options()
SeleniumManager().driver_location(options)

I would be happy if the pre-download procedure is documented.

bonigarcia commented 10 months ago

Selenium Manager never redownloads the driver once it is in the cache. An exception will happen if the cache is refreshed each time, which I suppose is very unlike (e.g., SE_CLEAR_CACHE=true).

So, the first thing will be trying to understand why this issue happens. I don't see clearly that this is related to the Rust side. First, when a similar error occurs in Rust, the error message is error=26, Text file busy, which is not exactly the same as the reported.

Second, I am not familiar with the Python implementation, but the error traces seem to indicate it happens when the chromedriver process is started. If that is true, I believe it means that Selenium Manager has finished its job.

Regarding the TTL, it has nothing to do with it (see doc).

I looked for this issue on the Web, and it has been reported other times, e.g.:

https://stackoverflow.com/questions/62072978/chromedriver-text-file-busy https://stackoverflow.com/questions/19036016/oserror-errno-26-text-file-busy-on-vagrant-with-synced-folder

anteph commented 10 months ago

Selenium Manager never redownloads the driver once it is in the cache. An exception will happen if the cache is refreshed each time, which I suppose is very unlike (e.g., SE_CLEAR_CACHE=true).

From what I was able to determine, this issue is non-deterministic but when it happens, it happens when a new pod is launched. This means that effectively the cache is empty. I've tested myself launching a new instance of the container and I can see that the cache folder is empty. Then I instantiate the web driver for the first time and the cache gets filled.

In my use case, since I have different processes being launched in the same pod, isn't it possible that two of them check the cache in parallel, see that the folder is empty, and then try to write to the cache at the same time?

Since these are different processes and not threads, I think any underlying thread based synchronization mechanisms wouldn't work (assuming something such as a mutex could potentially be in use to prevent such a race condition)?

So, the first thing will be trying to understand why this issue happens. I don't see clearly that this is related to the Rust side. First, when a similar error occurs in Rust, the error message is error=26, Text file busy, which is not exactly the same as the reported.

I get the Text file busy (error 26). The error message string is not exactly the same, but maybe it is something on the python side wrapping an OS error differently? I'm not very familiar with how Python bubbles up any error returned by a subprocess call, but at least the OS error seems to be the same.

Not sure if this would change anything, but I'm pre-installing the chrome driver in the image in usr/bin/chromedriver.

bonigarcia commented 10 months ago

In my use case, since I have different processes being launched in the same pod, isn't it possible that two of them check the cache in parallel, see that the folder is empty, and then try to write to the cache at the same time?

Yes, I believe that is possible. I'm trying to reproduce it on my side by using a concurrent creation of Selenium sessions (using Selenium 4.17 and an empty cache), but I still never experienced it.

It would help if this issue could be reproduced somehow. Also, it will help to gather the complete logs when the error happens. @anteph This is the doc to gather all logs; thanks: https://www.selenium.dev/documentation/webdriver/troubleshooting/logging/

titusfortner commented 10 months ago

I would be happy if the pre-download procedure is documented.

@masato-hi the procedure is going to change in the next release, and I will document that.

anteph commented 10 months ago

@bonigarcia I've enabled the logs but haven't been able to reproduce it again (the incidence of this was quite low). I'll share if I manage to reproduce it again, but please feel free to close the ticket whenever you see fit.

As a workaround I've simply added a retry in my application.

Thanks!

igorpek commented 9 months ago

Hi šŸ‘‹

I'm running automated tests with pytest in parallel on CircleCI, and I'm getting the same error occasionally. Selenium version is 4.17.2

I was able to reproduce the issue with logs enabled, hope it helps:

if issubclass(child_exception_type, OSError) and hex_errno:
                errno_num = int(hex_errno, 16)
                child_exec_never_called = (err_msg == "noexec")
                if child_exec_never_called:
                    err_msg = ""
                    # The error must be from chdir(cwd).
                    err_filename = cwd
                else:
                    err_filename = orig_executable
                if errno_num != 0:
                    err_msg = os.strerror(errno_num)
>               raise child_exception_type(errno_num, err_msg, err_filename)
E               OSError: [Errno 26] Text file busy: '/home/circleci/.cache/selenium/chromedriver/linux64/116.0.5845.96/chromedriver'

../.pyenv/versions/3.9.18/lib/python3.9/subprocess.py:1837: OSError

------------------------------ Captured log setup ------------------------------
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:75 Selenium Manager binary found at: /home/circleci/project/venv/lib/python3.9/site-packages/selenium/webdriver/common/linux/selenium-manager
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:134 Executing process: /home/circleci/project/venv/lib/python3.9/site-packages/selenium/webdriver/common/linux/selenium-manager --browser chrome --debug --language-binding python --output json
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 chromedriver not found in PATH
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 chrome detected at /usr/bin/google-chrome
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Running command: /usr/bin/google-chrome --version
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Output: "Google Chrome 116.0.5845.110 "
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Detected browser: chrome 116.0.5845.110
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Discovering versions from https://googlechromelabs.github.io/chrome-for-testing/known-good-versions-with-downloads.json
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Required driver: chromedriver 116.0.5845.96
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Downloading chromedriver 116.0.5845.96 from https://storage.googleapis.com/chrome-for-testing-public/116.0.5845.96/linux64/chromedriver-linux64.zip
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Driver path: /home/circleci/.cache/selenium/chromedriver/linux64/116.0.5845.96/chromedriver
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Browser path: /usr/bin/google-chrome
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:110 Using driver at: /home/circleci/.cache/selenium/chromedriver/linux64/116.0.5845.96/chromedriver
bonigarcia commented 9 months ago

Thanks for the logs, @igorpek. What I can see there is that Selenium-Manager works as expected: it downloads the driver and reports its path to the bindings:

DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:110 Using driver at: /home/circleci/.cache/selenium/chromedriver/linux64/116.0.5845.96/chromedriver

I'm not sure, but maybe the problem happens when different processes try to write the same file in parallel. In that case, I don't know the best solution.

In Slack, Simon told me about using a lock mechanism based on a known port called SocketLock:

We used to use a SocketLock to allow cross-process, cross-language access to resources. If you had the lock, you could do the work.

But I believe that solution can bring another problem. What if another application uses the chosen port (7055 in the previous implementation)?

@titusfortner: Any idea?

bonigarcia commented 9 months ago

An alternative idea would be using a file lock. There are some Rust crates about that, but I would need to look into that.

nekdan commented 8 months ago

Hello, I have the same problem. It started appearing after we decided to stop downloading a certain version of the chrome driver and started using Selenium Manager. First we download the latest version of chrome, the rest is done by the Selenium manager. I'm running automated tests with pytest in parallel on pytest-xdist and use Selenium 4.18.1. I see that there are warnings related to access, Iā€™m not sure that they affect this error, since sometimes tests pass without errors, but these warnings are always there. I'm attaching the logs, hope it helps:

18:11:51              if issubclass(child_exception_type, OSError) and hex_errno:
18:11:51                  errno_num = int(hex_errno, 16)
18:11:51                  child_exec_never_called = (err_msg == "noexec")
18:11:51                  if child_exec_never_called:
18:11:51                      err_msg = ""
18:11:51                      # The error must be from chdir(cwd).
18:11:51                      err_filename = cwd
18:11:51                  else:
18:11:51                      err_filename = orig_executable
18:11:51                  if errno_num != 0:
18:11:51                      err_msg = os.strerror(errno_num)
18:11:51  >               raise child_exception_type(errno_num, err_msg, err_filename)
18:11:51  E               OSError: [Errno 26] Text file busy: 'chromedriver/linux64/122.0.6261.94/chromedriver'
18:11:51  
18:11:51  /usr/local/lib/python3.11/subprocess.py:1901: OSError
18:11:51  ------------------------------ Captured log setup ------------------------------
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:75 Selenium Manager binary found at: /usr/local/lib/python3.11/site-packages/selenium/webdriver/common/linux/selenium-manager
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:134 Executing process: /usr/local/lib/python3.11/site-packages/selenium/webdriver/common/linux/selenium-manager --browser chrome --debug --language-binding python --output json
18:11:51  WARNING  selenium.webdriver.common.selenium_manager:selenium_manager.py:149 Cache folder (/.cache/selenium) cannot be created: Permission denied (os error 13)
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Sending stats to Plausible: Props { browser: "chrome", browser_version: "", os: "linux", arch: "x86_64", lang: "python", selenium_version: "4.18" }
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 chromedriver not found in PATH
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 chrome detected at /usr/bin/google-chrome
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Running command: /usr/bin/google-chrome --version
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Output: "Google Chrome 122.0.6261.94 "
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Detected browser: chrome 122.0.6261.94
18:11:51  WARNING  selenium.webdriver.common.selenium_manager:selenium_manager.py:149 Cache folder (/.cache/selenium) cannot be created: Permission denied (os error 13)
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Discovering versions from https://googlechromelabs.github.io/chrome-for-testing/known-good-versions-with-downloads.json
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Required driver: chromedriver 122.0.6261.94
18:11:51  WARNING  selenium.webdriver.common.selenium_manager:selenium_manager.py:149 Cache folder (/.cache/selenium) cannot be created: Permission denied (os error 13)
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Downloading chromedriver 122.0.6261.94 from https://storage.googleapis.com/chrome-for-testing-public/122.0.6261.94/linux64/chromedriver-linux64.zip
18:11:51  WARNING  selenium.webdriver.common.selenium_manager:selenium_manager.py:149 Cache folder (/.cache/selenium) cannot be created: Permission denied (os error 13)
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Driver path: chromedriver/linux64/122.0.6261.94/chromedriver
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Browser path: /usr/bin/google-chrome
18:11:51  DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:110 Using driver at: chromedriver/linux64/122.0.6261.94/chromedriver
titusfortner commented 8 months ago

But I believe that solution can bring another problem. What if another application uses the chosen port (7055 in the previous implementation)?

So we think this is Selenium Bindings running in different processes each running an instance of selenium manager which causes a race condition on which process can download the driver to that location...

I'm assuming that rust downloads it and then sets it to executable or something. Is it possible for the rust code to catch that error and then wait for the file at that location to be ready and then return it?

Alternately we can put that logic in the bindings....

p0deje commented 6 months ago

I tried reproducing this by running 100 SM processes and I don't see any errors. Is this still a problem?

seq 100 | parallel -N0 bazel-bin/rust/selenium-manager --driver chromedriver --debug
igorpek commented 6 months ago

I'm still able to reproduce the issue by running 8 tests in parallel via pytest. 2 out of 8 failed with the following error:

...
            try:
                exception_name, hex_errno, err_msg = (
                        errpipe_data.split(b':', 2))
                # The encoding here should match the encoding
                # written in by the subprocess implementations
                # like _posixsubprocess
                err_msg = err_msg.decode()
            except ValueError:
                exception_name = b'SubprocessError'
                hex_errno = b'0'
                err_msg = 'Bad exception data from child: {!r}'.format(
                              bytes(errpipe_data))
            child_exception_type = getattr(
                    builtins, exception_name.decode('ascii'),
                    SubprocessError)
            if issubclass(child_exception_type, OSError) and hex_errno:
                errno_num = int(hex_errno, 16)
                child_exec_never_called = (err_msg == "noexec")
                if child_exec_never_called:
                    err_msg = ""
                    # The error must be from chdir(cwd).
                    err_filename = cwd
                else:
                    err_filename = orig_executable
                if errno_num != 0:
                    err_msg = os.strerror(errno_num)
>               raise child_exception_type(errno_num, err_msg, err_filename)
E               OSError: [Errno 26] Text file busy: '/home/circleci/.cache/selenium/chromedriver/linux64/123.0.6312.122/chromedriver'

../.pyenv/versions/3.9.19/lib/python3.9/subprocess.py:1837: OSError

------------------------------ Captured log setup ------------------------------
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:75 Selenium Manager binary found at: /home/circleci/project/venv/lib/python3.9/site-packages/selenium/webdriver/common/linux/selenium-manager
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:134 Executing process: /home/circleci/project/venv/lib/python3.9/site-packages/selenium/webdriver/common/linux/selenium-manager --browser chrome --debug --language-binding python --output json
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 chromedriver not found in PATH
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 chrome detected at /usr/bin/google-chrome
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Running command: /usr/bin/google-chrome --version
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Output: "Google Chrome 123.0.6312.58 "
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Detected browser: chrome 123.0.6312.58
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Discovering versions from https://googlechromelabs.github.io/chrome-for-testing/known-good-versions-with-downloads.json
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Required driver: chromedriver 123.0.6312.122
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Downloading chromedriver 123.0.6312.122 from https://storage.googleapis.com/chrome-for-testing-public/123.0.6312.122/linux64/chromedriver-linux64.zip
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Driver path: /home/circleci/.cache/selenium/chromedriver/linux64/123.0.6312.122/chromedriver
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:151 Browser path: /usr/bin/google-chrome
DEBUG    selenium.webdriver.common.selenium_manager:selenium_manager.py:110 Using driver at: /home/circleci/.cache/selenium/chromedriver/linux64/123.0.6312.122/chromedriver
p0deje commented 6 months ago

@igorpek Do you only see this on CircleCI or locally as well?

igorpek commented 6 months ago

@p0deje it seems I cannot reproduce it locally atm.

p0deje commented 6 months ago

As discussed with @bonigarcia, he'll try implementing a file-locking mechanism and then we'll share the preview build here so you can test it. // cc @titusfortner

mgrissa94 commented 2 months ago

Hello, I am facing the same issue using Robot Framework with Selenium on Gitlab pipeline, The issue exist when I run tests in Parallel using Pabot, the setup of opening the browser is failling Parent suite setup failed: OSError: [Errno 26] Text file busy: '/root/.cache/selenium/chromedriver/linux64/124.0.6367.207/chromedriver'

Create Chrome browser [Arguments] ${BROWSER} ${URL}= readProperty.get_main_url ${ENVIRONMENT} ${chrome_options}= Evaluate sys.modules['selenium.webdriver'].ChromeOptions() sys, selenium.webdriver Call Method ${chrome_options} add_argument --no-sandbox Call Method ${chrome_options} add_argument --headless Call Method ${chrome_options} add_argument --disable-gpu Call Method ${chrome_options} add_argument --disable-infobars Call Method ${chrome_options} add_argument --disable-extensions Call Method ${chrome_options} add_argument --disable-dev-shm-usage Open Browser ${URL} browser=${BROWSER} options=${chrome_options} Set Global Variable ${URL} RETURN ${URL} Any idea how to fix this issue ?

ronaldiansana commented 2 months ago

Same as @mgrissa94, facing this issue right now in Robot Framework when running test in Parallel using Pabot. robotframework==6.1.1 robotframework-seleniumlibrary==6.3.0 selenium==4.15.2