umihico / docker-selenium-lambda

The simplest demo of chrome automation by python and selenium in AWS Lambda
MIT License
493 stars 118 forks

[ERROR] OSError: [Errno 28] No space left on device: '/tmp/tmp_ovipfqh' #219

Open drjimmyjiang opened 4 months ago

drjimmyjiang commented 4 months ago

I'm getting the following error message. It doesn't happen all the time. It's never happened before until today. Any ideas would be much appreciated. Do I need to change anything in main.py?

[ERROR] OSError: [Errno 28] No space left on device: '/tmp/tmp_ovipfqh'
Traceback (most recent call last):
  File "/var/task/bot.py", line 31, in handler
    options.add_argument(f"--user-data-dir={mkdtemp()}")
  File "/var/lang/lib/python3.12/tempfile.py", line 368, in mkdtemp
    _os.mkdir(file, 0o700)

umihico commented 4 months ago

@drjimmyjiang

It seems like you might be invoking this function frequently. If so, previous executions may have filled up the space, causing the current execution to fail. The /tmp directory can indeed be shared between multiple invocations.

One solution is to clear the /tmp directory when your Lambda function starts, using shutil.rmtree("/tmp") for example. However, please be cautious with this approach as it will remove all files in the /tmp directory, which might include files from other processes.

Alternatively, you can ensure that the temporary directory is cleaned up after your Lambda function finishes executing. Here's an example using the tempfile module with a context manager:

import tempfile
from selenium import webdriver

# Create a temporary directory that is deleted when the with block exits
with tempfile.TemporaryDirectory() as temp_dir:
    options = webdriver.ChromeOptions()
    options.add_argument(f"--user-data-dir={temp_dir}")
    # Create and use the driver inside this block, while temp_dir still exists
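
For illustration, here is a minimal sketch of how the context manager scopes the directory's lifetime. The names run_with_scratch_dir, work and example_work are hypothetical and not part of this repository. In a real handler, the callback body is where the Chrome options and driver would be created and used.

```python
import os
import tempfile


def run_with_scratch_dir(work):
    """Call work(path) with a fresh temporary directory, then delete it.

    Hypothetical helper: `work` stands in for the code that passes
    --user-data-dir=path to Chrome and drives the browser.
    """
    # The directory and its contents are removed when the with block exits,
    # whether work() returns normally or raises.
    with tempfile.TemporaryDirectory() as temp_dir:
        return work(temp_dir)


def example_work(path):
    # Stand-in for browser activity: write a file into the profile directory.
    with open(os.path.join(path, "profile.dat"), "w") as f:
        f.write("x" * 1024)
    return path


used_path = run_with_scratch_dir(example_work)
print(os.path.exists(used_path))  # False: the directory was cleaned up
```

Anything that still needs the directory, including the running browser, has to finish before the block exits, so the driver should probably be quit inside the callback.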

I'll keep this issue open for a little while in case someone else encounters the same problem.

drjimmyjiang commented 4 months ago

Thank you so much for your response, umihico. I do run this function very frequently with multiple concurrent invocations. Ideally, I'd like each invocation to be independent of the others and not share any files in the /tmp directory, so that it can handle a high volume of concurrent invocations.

Is it possible to omit the use of tempfile altogether? If so, what are the drawbacks?

Does it really matter whether the /tmp directory is cleared before or after executing the Lambda function if there are shared files from other concurrent invocations? (Wouldn't it cause problems either way?)

If using the tempfile module to create a temporary directory is the best solution for my use case, may I ask how I would modify the existing main.py file and where I would insert the code snippet? Thank you for your help.

umihico commented 4 months ago

+ import shutil

def handler(event=None, context=None):
+    shutil.rmtree("/tmp")
    options = webdriver.ChromeOptions()
    service = webdriver.ChromeService("/opt/chromedriver")

Maybe like this? I hope this works.

drjimmyjiang commented 4 months ago

Thank you so much, I'll give it a try.

alejlatorre commented 3 months ago

I tried clearing /tmp with rmtree and got the following error:

  File "/var/task/src/functions/scraper_engine/scraper.py", line 64, in config_webdriver_with_retry
    return webdriver.Chrome(
           ^^^^^^^^^^^^^^^^^
  File "/var/lang/lib/python3.11/site-packages/selenium/webdriver/chrome/webdriver.py", line 45, in __init__
    super().__init__(
  File "/var/lang/lib/python3.11/site-packages/selenium/webdriver/chromium/webdriver.py", line 61, in __init__
    super().__init__(command_executor=executor, options=options)
  File "/var/lang/lib/python3.11/site-packages/selenium/webdriver/remote/webdriver.py", line 208, in __init__
    self.start_session(capabilities)
  File "/var/lang/lib/python3.11/site-packages/selenium/webdriver/remote/webdriver.py", line 292, in start_session
    response = self.execute(Command.NEW_SESSION, caps)["value"]
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lang/lib/python3.11/site-packages/selenium/webdriver/remote/webdriver.py", line 347, in execute
    self.error_handler.check_response(response)
  File "/var/lang/lib/python3.11/site-packages/selenium/webdriver/remote/errorhandler.py", line 229, in check_response

To remove the contents but keep the folder itself, I created the following function in utils.py:

import os
import shutil

def clear_directory_contents(dir_path):
    for item in os.listdir(dir_path):
        item_path = os.path.join(dir_path, item)
        if os.path.isdir(item_path):
            shutil.rmtree(item_path)
        else:
            os.remove(item_path)
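
A slightly more defensive variant of the same idea, as a sketch only. The name clear_directory_contents_safe is hypothetical. It unlinks symlinks instead of following them (shutil.rmtree refuses to operate on a symlink) and skips entries that disappear while it is running.

```python
import os
import shutil


def clear_directory_contents_safe(dir_path):
    """Delete everything inside dir_path but keep dir_path itself.

    Hypothetical variant of clear_directory_contents: symlinks are removed
    as links rather than followed, and entries that vanish mid-run are skipped.
    """
    # Snapshot the entries first so deletion does not disturb the iteration.
    with os.scandir(dir_path) as it:
        entries = list(it)
    for entry in entries:
        try:
            if entry.is_dir(follow_symlinks=False):
                shutil.rmtree(entry.path)
            else:
                # Regular files and symlinks (including symlinks to directories).
                os.remove(entry.path)
        except FileNotFoundError:
            # Already removed by something else; nothing left to do.
            pass
```

In the handler this would presumably be called as clear_directory_contents_safe("/tmp") before the driver is created. As far as I understand, each Lambda execution environment handles one invocation at a time, so concurrent invocations should not be sharing the same /tmp. The leftovers would come from earlier invocations that ran in the same warm environment.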

Also, @drjimmyjiang, check how large the temporary files and folders that your code generates are. I needed to increase my ephemeral storage through serverless.yml:

ephemeralStorageSize: 2048
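
To see how much space is actually being used before picking a size, something along these lines could be logged at the start and end of the handler. This is a rough sketch: directory_size_bytes and log_tmp_usage are hypothetical helper names, and the default path of /tmp is an assumption based on this thread.

```python
import os
import shutil


def directory_size_bytes(dir_path):
    """Sum the sizes of files under dir_path (symlinks are not followed)."""
    total = 0
    for root, _dirs, files in os.walk(dir_path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except FileNotFoundError:
                # File disappeared between listing and stat; skip it.
                pass
    return total


def log_tmp_usage(path="/tmp"):
    """Print used and free space on the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    print(f"{path}: used={usage.used / 2**20:.1f} MiB, free={usage.free / 2**20:.1f} MiB")
    return usage
```

Calling directory_size_bytes on the Chrome user data directory just before cleanup should show how much a single run writes, which can then be compared with the configured limit (Lambda's default ephemeral storage is 512 MB).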