adieuadieu / serverless-chrome

🌐 Run headless Chrome/Chromium on AWS Lambda
MIT License

chrome crash on aws lambda (running python code) #185

Closed andreabisello closed 3 years ago

andreabisello commented 5 years ago

Hi, thanks for this awesome project. I am moving to Python because I need to use https://github.com/lightbody/browsermob-proxy. In a Python virtual environment I installed selenium locally and copied in chromedriver and your "compiled for Lambda" Chromium binary (headless-chromium). I set up the code to use the chromedriver binary and the headless-chromium binary from the folder the code runs in. Everything works well on Ubuntu 18.04.
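A minimal sketch of the setup described above, assuming both binaries sit next to the handler file. The helper name `find_binaries` and the default file names are illustrative assumptions, not part of serverless-chrome. It only resolves the paths and checks that they exist and are executable, which is a cheap thing to rule out before looking at Chrome itself.

```python
import os


def find_binaries(base_dir, names=("chromedriver", "headless-chromium")):
    """Return {name: absolute path} for binaries expected in base_dir.

    Raises FileNotFoundError or PermissionError naming the offending path,
    so a packaging mistake shows up before Selenium is started.
    The default file names are assumptions; adjust them to your package.
    """
    paths = {}
    for name in names:
        path = os.path.join(os.path.abspath(base_dir), name)
        if not os.path.isfile(path):
            raise FileNotFoundError(path)
        if not os.access(path, os.X_OK):
            raise PermissionError(path + " is not executable")
        paths[name] = path
    return paths
```

In a handler this could be called as `find_binaries(os.path.dirname(os.path.abspath(__file__)))`, and the returned paths passed to Selenium.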

On AWS Lambda, however, it does not work. This is the error:

Message: unknown error: Chrome failed to start: exited abnormally
  (unknown error: DevToolsActivePort file doesn't exist)
  (The process started from chrome location /var/task/chromium-browser is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
  (Driver info: chromedriver=2.45.615279 (12b89733300bd268cff3b78fc76cb8f3a7cc44e5),platform=Linux 4.14.77-70.59.amzn1.x86_64 x86_64)
: WebDriverException
Traceback (most recent call last):
  File "/var/task/scraper.py", line 16, in handle
    driver = webdriver.Chrome(executable_path=os.path.abspath(".")+os.path.sep+'chromedriver', chrome_options=chrome_options)
  File "/var/task/selenium/webdriver/chrome/webdriver.py", line 81, in __init__
    desired_capabilities=desired_capabilities)
  File "/var/task/selenium/webdriver/remote/webdriver.py", line 157, in __init__
    self.start_session(capabilities, browser_profile)
  File "/var/task/selenium/webdriver/remote/webdriver.py", line 252, in start_session
    response = self.execute(Command.NEW_SESSION, parameters)
  File "/var/task/selenium/webdriver/remote/webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "/var/task/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally
  (unknown error: DevToolsActivePort file doesn't exist)
  (The process started from chrome location /var/task/chromium-browser is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
  (Driver info: chromedriver=2.45.615279 (12b89733300bd268cff3b78fc76cb8f3a7cc44e5),platform=Linux 4.14.77-70.59.amzn1.x86_64 x86_64)

Any suggestions?

Thanks.

P.S. I could move everything to https://www.npmjs.com/package/browsermob-proxy, but I would like to use Python (and I think the problem is not Python-related).

juanjosegzl commented 5 years ago

Try adding options.add_argument('--disable-dev-shm-usage')

andreabisello commented 5 years ago

@jjwizeline already present

ghost commented 5 years ago

You need to make sure all the versions are compatible. I've got it working using:

chromedriver 2.41
headless-chromium-68.0.3440.84
selenium==3.141.0
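A small sketch of checking such a combination before deploying. The helper names are hypothetical, and the supported range for chromedriver 2.41 (Chrome 67 to 69) is an assumption taken from memory of the ChromeDriver release notes, so verify it against the notes for the driver you actually ship.

```python
import re


def chrome_major(version_string):
    """Extract the major version from e.g. 'headless-chromium-68.0.3440.84'."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_string)
    if match is None:
        raise ValueError("no Chrome version found in %r" % version_string)
    return int(match.group(1))


def is_supported(chrome_version, supported_range):
    """True if the Chrome major version lies within the inclusive range."""
    low, high = supported_range
    return low <= chrome_major(chrome_version) <= high


# ASSUMPTION: range recalled from the ChromeDriver 2.41 release notes.
# Check the release notes of your own driver version before relying on it.
CHROMEDRIVER_2_41_RANGE = (67, 69)

print(is_supported("headless-chromium-68.0.3440.84", CHROMEDRIVER_2_41_RANGE))
```

Running a check like this in a build step would flag a driver and browser pair that falls outside the range you configured.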

The latest binaries on the release page don't seem to respect the '--disable-dev-shm-usage' flag.

To test before pushing to Lambda, set the permissions on /dev/shm to read-only and run your function locally.

Lastly, if you are having different problems, set a log path on the Selenium driver:

driver = webdriver.Chrome(
    CHROMEDRIVER_PATH,
    chrome_options=options,
    service_log_path='/tmp/chromedriver.log'
)

Then wrap the call in a catch-all try/except and return the log from the Lambda if it fails:

try:
    your_headless_chrome_function()
except Exception:
    # Return the ChromeDriver log as the Lambda response for debugging.
    with open('/tmp/chromedriver.log', 'r') as logfile:
        data = logfile.readlines()
    return data
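The same idea as a self-contained sketch. The names `read_log_tail` and `run_with_log` are hypothetical helpers, and returning only the last lines is my own choice to keep the response small, since a verbose driver log can grow large.

```python
import os


def read_log_tail(path="/tmp/chromedriver.log", max_lines=50):
    """Return the last max_lines lines of the log, or [] if it is missing."""
    if not os.path.exists(path):
        return []
    with open(path, "r", errors="replace") as logfile:
        return logfile.readlines()[-max_lines:]


def run_with_log(func, log_path="/tmp/chromedriver.log"):
    """Call func(); on any failure return the error plus the driver log tail."""
    try:
        return {"ok": True, "result": func()}
    except Exception as exc:
        return {"ok": False, "error": str(exc), "log": read_log_tail(log_path)}
```

A handler could then return `run_with_log(your_headless_chrome_function)` directly, so a crash yields the driver log instead of only a stack trace.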

kremen commented 5 years ago

@x00x70, thanks a lot! I use Ruby, and my requests to Chrome timed out on AWS Lambda ({"message": "Endpoint request timed out"}) when I was using the latest binaries from the release page (v1.0.0-55).

It really does work with: chromedriver 2.41, headless-chromium-68.0.3440.84, selenium==3.141.0

Cool tip to set the permissions on /dev/shm to read-only and test locally!

oncleguigs commented 5 years ago

Thank you, working great with these versions!

HassyMasa commented 5 years ago

"errorMessage": "2019-mm-ddThh:mm:ss.xxxZ XXXX-XXXX-XXXX-XXXX-XXXX Task timed out after xx.xx seconds

kremen commented 5 years ago

@HassyMasa, does it work locally (with readonly permissions on /dev/shm)?

HassyMasa commented 5 years ago

I use this option: options.add_argument('--disable-dev-shm-usage')

oncleguigs commented 5 years ago

@HassyMasa, I had the same problem; I temporarily fixed it by increasing the memory and the timeout. Also, in my case I use these options:

chrome_options.add_argument('--headless')
chrome_options.add_argument('--disable-dev-shm-usage')
chrome_options.add_argument('--disable-extensions')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--no-cache')
chrome_options.add_argument('--disable-gpu')
chrome_options.add_argument('--window-size=1024x768')
chrome_options.add_argument('--user-data-dir=/tmp/user-data')
chrome_options.add_argument('--hide-scrollbars')
chrome_options.add_argument('--enable-logging')
chrome_options.add_argument('--log-level=0')
chrome_options.add_argument('--v=99')
chrome_options.add_argument('--single-process')
chrome_options.add_argument('--data-path=/tmp/data-path')
chrome_options.add_argument('--ignore-certificate-errors')
chrome_options.add_argument('--homedir=/tmp')
chrome_options.add_argument('--disk-cache-dir=/tmp/cache-dir')
chrome_options.add_argument('user-agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36')
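One way to keep such a flag list manageable is to build it in one place. This is a sketch using a subset of the flags listed above; the names `build_chrome_args` and `make_options` are hypothetical, and creating the /tmp directories up front is a precaution of mine rather than something the thread says is required.

```python
import os

# On Lambda, /tmp is the writable scratch directory, so Chrome state goes there.
TMP_DIRS = {
    "--user-data-dir": "/tmp/user-data",
    "--data-path": "/tmp/data-path",
    "--disk-cache-dir": "/tmp/cache-dir",
}

BASE_FLAGS = [
    "--headless",
    "--disable-dev-shm-usage",
    "--no-sandbox",
    "--disable-gpu",
    "--single-process",
    "--homedir=/tmp",
]


def build_chrome_args(tmp_dirs=TMP_DIRS, create_dirs=True):
    """Return the flag list; optionally create the target directories first."""
    args = list(BASE_FLAGS)
    for flag, path in tmp_dirs.items():
        if create_dirs:
            os.makedirs(path, exist_ok=True)
        args.append("%s=%s" % (flag, path))
    return args


def make_options(binary_path):
    """Build Selenium ChromeOptions from the flag list.

    selenium is imported lazily so the flag list can be built without it.
    """
    from selenium import webdriver  # third-party dependency

    options = webdriver.ChromeOptions()
    options.binary_location = binary_path
    for arg in build_chrome_args():
        options.add_argument(arg)
    return options
```

The resulting object can be passed as `chrome_options=` to `webdriver.Chrome`, as in the snippets earlier in the thread.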

ghost commented 5 years ago

@HassyMasa Yes, but sometimes --disable-dev-shm-usage does not work, depending on the version of the binary you are using. To test whether the binary respects this flag, run sudo chmod 500 /dev/shm and then test your function. If the binary does not respect --disable-dev-shm-usage, your function will crash and you will need to use a different binary. Remember to reset the /dev/shm permissions after you're done.
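Before running that local test, it can help to confirm that /dev/shm really is unwritable for the user running Chrome. A minimal sketch, with `can_write` as a hypothetical helper name:

```python
import os


def can_write(path="/dev/shm"):
    """True if the current user may create files in path.

    Note: root bypasses permission bits, so run the test as a normal user.
    """
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)


if __name__ == "__main__":
    # After `sudo chmod 500 /dev/shm` this should report False for a
    # non-root user; the usual default to restore is `sudo chmod 1777 /dev/shm`.
    print("/dev/shm writable:", can_write())
```

If this still reports a writable directory after the chmod, the test is probably running as root and will not tell you whether the binary respects the flag.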

HassyMasa commented 5 years ago

The situation has not changed. That's a shame... Thank you.

HassyMasa commented 5 years ago

Timeout 10 s -> 30 s: the situation has not changed.
Timeout 30 s -> 180 s and memory 128 MB -> 256 MB: no error, but it takes too long:

Duration: 141092.24 ms Billed Duration: 141100 ms Memory Size: 256 MB Max Memory Used: 207 MB

And I cannot get the response data:

driver.get('_sample site_')
return {'code': 200,'body':driver.title}

{"code": 200, "body": ""}
Duration: 9954.62 ms    Billed Duration: 10000 ms   Memory Size: 256 MB Max Memory Used: 245 MB

driver.get('https://www.google.com')
return {'code': 200,'body':driver.title}

{"code": 200, "body": "Google"}
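The runs above came close to the function timeout and, for one site, returned an empty title. The thread does not establish the cause, so the following is only a diagnostic sketch: it bounds the page load by the time Lambda has left and flags an empty title explicitly. `page_load_timeout_s`, `fetch_title`, and the `empty_title` field are hypothetical; `get_remaining_time_in_millis()` is the Lambda context method and `set_page_load_timeout()` is the Selenium driver method.

```python
def page_load_timeout_s(context, reserve_s=5.0, floor_s=1.0):
    """Seconds to allow for driver.get(), derived from the Lambda time left.

    reserve_s is kept back for building the response; floor_s avoids
    passing a zero or negative timeout to the driver.
    """
    remaining_s = context.get_remaining_time_in_millis() / 1000.0
    return max(floor_s, remaining_s - reserve_s)


def fetch_title(driver, url, context):
    """Load url with a bounded wait and flag an empty title in the result."""
    driver.set_page_load_timeout(page_load_timeout_s(context))
    driver.get(url)  # raises a Selenium timeout error if the budget is exceeded
    title = driver.title
    return {"code": 200, "body": title, "empty_title": not title}
```

With this in place, a slow page fails inside the handler with a Selenium exception rather than as an opaque "Task timed out" from Lambda.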

SmileSydney commented 5 years ago

I'm still getting an error on the lambci/lambda Docker image:

selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally
  (unknown error: DevToolsActivePort file doesn't exist)

with the following combination:

Python 3.7
Selenium 3.141
serverless-chrome 1.0.0-55
chromedriver 2.43

simontt commented 2 years ago

The combination chromedriver 2.41, headless-chromium-68.0.3440.84 and selenium==3.141.0 still works!