omkarcloud / google-maps-scraper

👋 HOLA 👋 HOLA 👋 HOLA ! ENJOY OUR GOOGLE MAPS SCRAPER 🚀 TO EFFORTLESSLY EXTRACT DATA SUCH AS NAMES, ADDRESSES, PHONE NUMBERS, REVIEWS, WEBSITES, AND RATINGS FROM GOOGLE MAPS WITH EASE! 🤖
https://www.omkar.cloud/
MIT License
1.11k stars 266 forks

OSError: [Errno 24] Too many open files #28

Closed: L0CKZ0R closed this issue 1 year ago

L0CKZ0R commented 1 year ago

With a large number of queries, the program eventually exceeds the limit on open files and crashes. No all.csv was generated. It would probably be best to save and close the files as soon as each one is done.

Traceback (most recent call last):
  File "/Users//Library/Python/3.10/lib/python/site-packages/bose/base_task.py", line 185, in run_task
    result = self.run(driver, data)
  File "/Users//google-maps-scraper/src/scrape_google_maps_links_task.py", line 261, in run
    result = self.parallel(
  File "/Users//Library/Python/3.10/lib/python/site-packages/bose/base_task.py", line 90, in parallel
    result = (Parallel(n_jobs=n, backend="threading")(delayed(run)(l) for l in data_list))
  File "/Users//Library/Python/3.10/lib/python/site-packages/joblib/parallel.py", line 1098, in __call__
    self.retrieve()
  File "/Users//Library/Python/3.10/lib/python/site-packages/joblib/parallel.py", line 975, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/pool.py", line 771, in get
    raise self._value
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/Users//Library/Python/3.10/lib/python/site-packages/joblib/_parallel_backends.py", line 620, in __call__
    return self.func(*args, **kwargs)
  File "/Users//Library/Python/3.10/lib/python/site-packages/joblib/parallel.py", line 288, in __call__
    return [func(*args, **kwargs)
  File "/Users//Library/Python/3.10/lib/python/site-packages/joblib/parallel.py", line 288, in <listcomp>
    return [func(*args, **kwargs)
  File "/Users//Library/Python/3.10/lib/python/site-packages/bose/base_task.py", line 81, in run
    driver = self.create_driver(config)
  File "/Users//Library/Python/3.10/lib/python/site-packages/bose/base_task.py", line 73, in create_driver
    driver = create_driver(config)
  File "/Users//Library/Python/3.10/lib/python/site-packages/bose/create_driver.py", line 283, in create_driver
    driver = retry_if_is_error(
  File "/Users//Library/Python/3.10/lib/python/site-packages/bose/utils.py", line 88, in retry_if_is_error
    raise e
  File "/Users//Library/Python/3.10/lib/python/site-packages/bose/utils.py", line 81, in retry_if_is_error
    created_result = func()
  File "/Users//Library/Python/3.10/lib/python/site-packages/bose/create_driver.py", line 272, in run
    driver = BoseDriver(
  File "/Users//Library/Python/3.10/lib/python/site-packages/selenium/webdriver/chrome/webdriver.py", line 69, in __init__
    super().__init__(DesiredCapabilities.CHROME['browserName'], "goog",
  File "/Users//Library/Python/3.10/lib/python/site-packages/selenium/webdriver/chromium/webdriver.py", line 89, in __init__
    self.service.start()
  File "/Users//Library/Python/3.10/lib/python/site-packages/selenium/webdriver/common/service.py", line 71, in start
    self.process = subprocess.Popen(cmd, env=self.env,
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/subprocess.py", line 832, in __init__
    errread, errwrite) = self._get_handles(stdin, stdout, stderr)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/subprocess.py", line 1594, in _get_handles
    p2cread, p2cwrite = os.pipe()
OSError: [Errno 24] Too many open files
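The traceback ends in os.pipe() inside subprocess.Popen, so the process had already used up its per-process file descriptor limit when it tried to start another browser driver. A possible stopgap on Unix-like systems is to raise the soft limit at startup with the standard-library resource module. This is only a sketch: the helper name raise_open_file_limit and the target of 4096 are assumptions for illustration, not part of the scraper, and a higher limit postpones the crash rather than fixing a leak.

```python
import resource


def raise_open_file_limit(target=4096):
    """Try to raise the soft RLIMIT_NOFILE to `target`; return the soft limit in effect.

    Hypothetical helper for illustration. Unix-only (the `resource` module
    is not available on Windows).
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    # Never ask for more than the hard limit allows.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)

    # Nothing to do if the soft limit is unlimited or already high enough.
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft

    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    except (ValueError, OSError):
        # Some systems reject the request; keep the existing limit.
        return soft
    return target


if __name__ == "__main__":
    print("soft limit now:", raise_open_file_limit())
```

The same limit can be inspected from a shell with ulimit -n before launching the scraper.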

Chetan11-dev commented 1 year ago

Hello Lock, thank you for contacting me. Regrettably, due to my tight schedule, I'm unable to assist with that.

However, I encourage you to hire a freelancer on Upwork who specializes in web scraping, Selenium, or Python to fix your problem.

Thank you for your understanding.

Chetan11-dev commented 1 year ago

How can I reproduce it, Lock?

treefy commented 9 months ago

I have the same problem. When the queries array contains more than a few queries, the script crashes with: OSError: [Errno 24] Too many open files

I guess the files opened for each query request should be closed. It seems that they are left open.

Thank you!
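The cleanup suggested above can be sketched as follows: each worker acquires its resource inside a context manager so it is released even when a query fails, and the thread pool is bounded so only a few resources are open at once. FakeDriver is a hypothetical stand-in for a Selenium driver so that the example runs without a browser; managed_driver, scrape, and scrape_all are likewise illustrative names, not the scraper's actual code.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from contextlib import contextmanager


class FakeDriver:
    """Hypothetical stand-in for a browser driver; counts open instances."""

    open_count = 0
    _lock = threading.Lock()

    def __init__(self):
        with FakeDriver._lock:
            FakeDriver.open_count += 1

    def get(self, query):
        if "bad" in query:
            raise RuntimeError(f"failed to load {query}")
        return f"results for {query}"

    def quit(self):
        with FakeDriver._lock:
            FakeDriver.open_count -= 1


@contextmanager
def managed_driver():
    """Create a driver and guarantee quit() runs, even on errors."""
    driver = FakeDriver()
    try:
        yield driver
    finally:
        driver.quit()


def scrape(query):
    """Run one query; return None instead of raising if it fails."""
    with managed_driver() as driver:
        try:
            return driver.get(query)
        except RuntimeError:
            return None


def scrape_all(queries, max_workers=4):
    """Process queries with at most `max_workers` drivers open at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(scrape, queries))


if __name__ == "__main__":
    print(scrape_all(["cafes in Madrid", "bad query", "bars in Lisbon"]))
    print("drivers still open:", FakeDriver.open_count)
```

With a real driver, the finally branch is where driver.quit() would release the browser process and its pipes.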

Chetan11-dev commented 9 months ago

Please share the error_log folder.

treefy commented 9 months ago

Exception in thread Thread-4496 (_worker):
joblib.externals.loky.process_executor.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/util/ssl.py", line 402, in ssl_wrap_socket
OSError: [Errno 24] Too many open files

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 700, in urlopen
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 383, in _make_request
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 1017, in _validateconn
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 411, in connect
  File "/usr/lib/python3/dist-packages/urllib3/util/ssl.py", line 404, in ssl_wrap_socket
urllib3.exceptions.SSLError: [Errno 24] Too many open files

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 439, in send
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 756, in urlopen
  File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 574, in increment
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.google.com', port=443): Max retries exceeded with url: /maps/place/**%C3%A1+-+**/data=!!1s0x12a04***3d40.415526!!16s%2Fg%2F11xj79_70!19sChIJh7B_QbxyitjHgc?authuser=0&hl=es&rclk=1 (Caused by SSLError(OSError(24, 'Too many open files')))

During handling of the above exception, another exception occurred.

I've added some **** here to mask the URLs.
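To check whether descriptors really accumulate from one query to the next, the number of open descriptors can be logged between queries. The following is a rough diagnostic sketch, assuming a Linux or macOS system that exposes /proc/self/fd or /dev/fd; count_open_fds is a hypothetical helper, and the count includes the descriptor briefly used to list the directory itself.

```python
import os


def count_open_fds():
    """Return the approximate number of open descriptors, or None if unknown.

    Hypothetical diagnostic helper. Relies on /proc/self/fd (Linux) or
    /dev/fd (macOS, BSD); returns None where neither directory exists.
    """
    for path in ("/proc/self/fd", "/dev/fd"):
        if os.path.isdir(path):
            return len(os.listdir(path))
    return None


if __name__ == "__main__":
    before = count_open_fds()
    leaked = open(os.devnull)  # deliberately left open to simulate a leak
    print("before:", before, "after:", count_open_fds())
    leaked.close()
    print("after close:", count_open_fds())
```

A count that keeps climbing as queries complete would support the suspicion that sockets, files, or driver pipes are not being released.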