umihico / docker-selenium-lambda

The simplest demo of chrome automation by python and selenium in AWS Lambda
MIT License

Use extension on lambda #179

Closed danielWagnerr closed 1 year ago

danielWagnerr commented 1 year ago

Currently, Chrome extensions are not working for me on AWS Lambda.

This is the config of my driver:

def start_driver(self) -> None:
        """Starts selenium driver"""

        logging.info("Starting driver")

        options = Options()

        options.binary_location = "/opt/chrome/chrome"
        options.add_argument("--enable-javascript")
        options.add_argument("--headless=new")
        options.add_argument("--window-size=1280x1696")
        options.add_argument("--no-sandbox")
        options.add_argument("--disable-gpu")
        options.add_argument("--single-process")
        options.add_argument("--disable-dev-shm-usage")
        options.add_argument("--disable-dev-tools")
        options.add_argument("--no-zygote")
        options.add_argument("--disable-blink-features=AutomationControlled")
        options.add_argument(f"--user-data-dir={self.tmp_folder}")
        options.add_argument(f"--data-path={self.tmp_folder}")
        options.add_argument(f"--disk-cache-dir={self.tmp_folder}")

        proxy_plugin_path = (
            os.path.dirname(os.path.dirname(os.path.realpath(__file__))) + "/scraper/plugins/plugin_proxy"
        )
        options.add_argument(f"--load-extension={proxy_plugin_path}/")

        caps = DesiredCapabilities.CHROME
        caps["goog:loggingPrefs"] = {"performance": "ALL"}

        self.driver = uc.Chrome(
            browser_executable_path="/opt/chromedriver",
            options=options,
            desired_capabilities=caps,
        )
        self.driver.delete_all_cookies()
        self.driver.execute_cdp_cmd("Network.setUserAgentOverride", {"userAgent": constants.USER_AGENT})
        self.driver.execute_script("Object.defineProperty(navigator, 'webdriver', {get: () => undefined})")

        logging.info("Started driver")

The driver itself starts fine both on AWS Lambda and on my local machine (the only difference is the driver location). The problem is the extension:

I'm trying to use an authenticated proxy, which is why I need this extension. On my local machine it works fine, but when I test it on AWS Lambda the plugin doesn't work, so the browser never connects to the proxy.

It seems the plugin is not being loaded. Is there a known fix for that?
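For context, the contents of plugin_proxy are not shown in this thread. A common pattern for authenticated proxies is a small Manifest V2 extension that sets the proxy through chrome.proxy and answers the auth challenge through chrome.webRequest.onAuthRequired. The sketch below assumes that pattern and writes such an extension to a temporary directory, which on Lambda would normally be under /tmp, the writable part of the filesystem. The function name write_proxy_extension and the host, port, and credential values are hypothetical, and newer Chrome builds may restrict Manifest V2 extensions.

```python
import json
import tempfile
from pathlib import Path


def write_proxy_extension(host, port, username, password, base_dir=None):
    """Write a minimal proxy-auth extension and return its directory.

    Hypothetical helper: the real plugin_proxy from this thread is not shown,
    so this only illustrates the commonly used Manifest V2 pattern.
    """
    # mkdtemp uses the default temp dir when base_dir is None.
    ext_dir = Path(tempfile.mkdtemp(prefix="plugin_proxy_", dir=base_dir))

    manifest = {
        "version": "1.0.0",
        "manifest_version": 2,
        "name": "Proxy Auth (sketch)",
        "permissions": [
            "proxy",
            "webRequest",
            "webRequestBlocking",
            "<all_urls>",
        ],
        "background": {"scripts": ["background.js"]},
    }

    proxy_config = {
        "mode": "fixed_servers",
        "rules": {
            "singleProxy": {"scheme": "http", "host": host, "port": int(port)},
            "bypassList": ["localhost"],
        },
    }
    credentials = {"username": username, "password": password}

    # json.dumps keeps the embedded values valid JavaScript literals.
    background_js = (
        "var config = " + json.dumps(proxy_config) + ";\n"
        "chrome.proxy.settings.set({value: config, scope: 'regular'}, function() {});\n"
        "chrome.webRequest.onAuthRequired.addListener(\n"
        "  function(details) {\n"
        "    return {authCredentials: " + json.dumps(credentials) + "};\n"
        "  },\n"
        "  {urls: ['<all_urls>']},\n"
        "  ['blocking']\n"
        ");\n"
    )

    (ext_dir / "manifest.json").write_text(json.dumps(manifest, indent=2))
    (ext_dir / "background.js").write_text(background_js)
    return ext_dir
```

The returned directory could then be passed to Chrome in the same way as above, for example options.add_argument(f"--load-extension={ext_dir}").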

umihico commented 1 year ago

on my local machine

Is that a Docker image?

danielWagnerr commented 1 year ago

No, I'm just running it normally in the terminal...

umihico commented 1 year ago

Then please try running it in Docker locally first; AWS Lambda may not be the cause. My guess is that you need to add the Chrome extension to the image in the Dockerfile. Please reopen this issue or post a new one once you find a specific problem.
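One way to follow this suggestion is to check, inside the running container, whether the extension directory actually made it into the image before Chrome is started. The sketch below is only a diagnostic aid under that assumption; check_extension_dir is a hypothetical helper, not part of this repository or of Selenium.

```python
import json
import os
from pathlib import Path


def check_extension_dir(path):
    """Return a list of human-readable problems found with an unpacked extension."""
    ext_dir = Path(path)
    if not ext_dir.is_dir():
        # Typical symptom when the folder was never copied into the image.
        return [f"{ext_dir} is not a directory"]

    problems = []
    if not os.access(ext_dir, os.R_OK | os.X_OK):
        problems.append(f"{ext_dir} is not readable by the current user")

    manifest = ext_dir / "manifest.json"
    if not manifest.is_file():
        problems.append("manifest.json is missing")
    else:
        try:
            json.loads(manifest.read_text())
        except (OSError, ValueError) as exc:
            problems.append(f"manifest.json could not be parsed: {exc}")
    return problems
```

Logging the result of check_extension_dir(proxy_plugin_path) at the start of start_driver, both in the local container and on Lambda, would show whether the path passed to --load-extension exists in each environment.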