schickling / chromeless

🖥 Chrome automation made simple. Runs locally or headless on AWS Lambda.
https://chromeless.netlify.com
MIT License

Use-Case: Serverless Chrome In Local Mode #176

Open neekolas opened 7 years ago

neekolas commented 7 years ago

Not sure if this is a case you guys have considered, but I don't see any reference to it in the open Issues or Readme. I work with a medium-sized (few million pages a day) crawler system that runs almost entirely in AWS Lambda. I would love to use Chromeless for its great APIs on top of Serverless Chrome, but I don't need all the fancy WebSocket stuff of proxy mode since everything is already running inside a Lambda function. Creating an RPC from one Lambda to another seems like overkill.

Is it possible to use Chromeless in "local" mode with serverless-chrome acting as the local Chrome? Are there downsides or limitations to this approach? Will this be supported moving forward?

adieuadieu commented 7 years ago

Hi @neekolas yes of course! You're right that we haven't documented this very well, but Chromeless can be used within a Lambda function. The Chromeless Proxy uses the serverless-plugin-chrome package for Serverless. However, you can almost as easily go "vanilla" with the @serverless-chrome/lambda package.

For example (this is untested code I just cobbled together, but should convey the idea):

const launchChrome = require('@serverless-chrome/lambda')
const { Chromeless } = require('chromeless') // Chromeless is a named export

module.exports.handler = function handler (event, context, callback) {
  launchChrome({
    flags: ['--window-size=1280x1696', '--hide-scrollbars'],
  })
    .then((chrome) => {
      // Chrome is now running on localhost:9222

      const chromeless = new Chromeless({
        launchChrome: false,
      })

      chromeless
        .goto('https://www.google.com')
        .type('chromeless', 'input[name="q"]')
        .press(13)
        .wait('#resultStats')
        .evaluate(() => {
          // this will be executed in headless chrome
          const links = [].map.call(document.querySelectorAll('.g h3 a'), a => ({
            title: a.innerText,
            href: a.href,
          }))
          return JSON.stringify(links)
        })
        .then((urls) => {
          chromeless
            .close()
            .then(chrome.kill) // https://github.com/adieuadieu/serverless-chrome/issues/41#issuecomment-317989508
            .then(() => {
              callback(null, urls)
            })
        })
        .catch(callback)
    })
    .catch((error) => {
      // Chrome didn't launch correctly
      callback(error)
    })
}

ryancat commented 7 years ago

Just ran into the same problem of running Chrome headless locally. The solution by @adieuadieu works. Instead of using @serverless-chrome/lambda, I used chrome-launcher, which is suggested by the Google Chrome team. Here is my code sample:

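  const chromeLauncher = require('chrome-launcher');
  const Chromeless = require('chromeless').Chromeless;

  chromeLauncher.launch({
    // With launchChrome: false, Chromeless connects to port 9222 by default,
    // so pin the port here instead of letting chrome-launcher pick a random one.
    port: 9222,
    chromeFlags: [
      '--window-size=1200,800',
      '--disable-gpu',
      '--headless'
    ]
  })
  .then(function (chrome) {
    console.log('Chrome debuggable on port: ' + chrome.port);
    const chromeless = new Chromeless({
      launchChrome: false
    });
    var url = '[SOME URL FOR TESTING]'
    return chromeless.goto(url)
    .then(function () {
      // Test runner script
    })
    .then(function () {
      // Close the Chromeless session, then stop the Chrome process.
      return chromeless.end();
    })
    .then(function () {
      return chrome.kill();
    });
  })
  .catch(function (error) {
    console.error(error);
  });

neekolas commented 7 years ago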

Interesting...Sure beats the wrangling and shoehorning I had to do to get nightmare running smoothly in our Docker Cluster and CircleCI. Thanks @adieuadieu @ryancat!

I'll do a bit of poking around on https://github.com/adieuadieu/serverless-chrome/issues/41. I'm more familiar than I'd like to be with the innards of the Lambda execution environment.

neekolas commented 7 years ago

The best workaround for adieuadieu/serverless-chrome#41 I've come up with so far is: https://github.com/neekolas/chromeless-testbed/pull/1. Still reliably fails on the 5th invocation, but it at least gives you 4 invocations before you have to recreate. Will keep digging.

neekolas commented 7 years ago

Persisting a Chrome instance for more than 5 invocations is still giving me trouble... but I was able to get Chromeless working in Alpine Linux without any special docker run flags. The image weighs in at a totally reasonable 350 MB uncompressed. https://github.com/neekolas/chromeless-testbed/blob/feature/docker/Dockerfile

mexin commented 6 years ago

I tried @adieuadieu's snippet, but when deploying with Serverless it tries to upload a service .zip that is 546 MB, which fails due to the size restriction on Lambda. Does anyone have a tutorial or anything else to overcome this issue?

Thanks!

adieuadieu commented 6 years ago

@mexin Make sure you only .zip relevant dependencies. E.g. are you shipping a huge node_modules folder? (with devDependencies?)

saikiranvadlakonda commented 6 years ago

  const chromeLauncher = require('chrome-launcher');
  const Chromeless = require('chromeless').Chromeless;

  chromeLauncher.launch({
    // port: 9222, // Uncomment to force a specific port of your choice.
    chromeFlags: [
      '--window-size=1200,800',
      '--disable-gpu',
      '--headless'
    ]
  })
  .then(function (chrome) {
    console.log('Chrome debuggable on port: ' + chrome.port);
    var port = chrome.port;
    console.log(port);

    const chromeless = new Chromeless({
       //cdp:{host: 'localhost', port: port, secure: false, closeTab: true}, 
      launchChrome: false

    });
    var url = 'https://xyz.com';

    chromeless.goto(url)
    .then(function () {
      // Test runner script

      console.log("opened");
      chromeless.end();
    });
  });

After running the code above, I get the error below. chrome-launcher launched Chrome on port 45417 (a random port), and I created the Chromeless object with the option launchChrome: false. Why am I getting this error? Could anyone help me out? Thank you.

(node:16697) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 1): Error: connect ECONNREFUSED 127.0.0.1:9222
(node:16697) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
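
The ECONNREFUSED on 127.0.0.1:9222 suggests that Chromeless is still trying port 9222 while chrome-launcher started Chrome on a random port. One way around this is probably to hand the launched port to Chromeless through the `cdp` option that is commented out in the snippet above. The following is a minimal sketch under that assumption, not a confirmed fix; `chromelessOptionsFor` and `openPage` are hypothetical helper names.

```javascript
// Build Chromeless options that point at an already-running Chrome.
// Assumption: `cdp.port` overrides the port Chromeless connects to (9222 otherwise).
function chromelessOptionsFor(chrome) {
  return {
    launchChrome: false,
    cdp: { host: 'localhost', port: chrome.port },
  };
}

async function openPage(url) {
  // Required lazily so the helper above can be used without these packages.
  const chromeLauncher = require('chrome-launcher');
  const { Chromeless } = require('chromeless');

  const chrome = await chromeLauncher.launch({
    chromeFlags: ['--window-size=1200,800', '--disable-gpu', '--headless'],
  });
  console.log('Chrome debuggable on port: ' + chrome.port);

  const chromeless = new Chromeless(chromelessOptionsFor(chrome));
  try {
    await chromeless.goto(url);
    console.log('opened');
  } finally {
    try {
      // Close the Chromeless session first...
      await chromeless.end();
    } finally {
      // ...and always stop the Chrome process, even if closing failed.
      await chrome.kill();
    }
  }
}

// Example call (needs chrome-launcher, chromeless, and a local Chrome):
// openPage('https://xyz.com').catch(console.error);
```

The other option is to uncomment `port: 9222` in the `chromeLauncher.launch` call so that both sides agree on the port. Either way, attaching a `.catch` to the promise chain gets rid of the unhandled rejection warning.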

onassar commented 6 years ago

@adieuadieu jumping in here because I didn't want to open a new issue (since it's not really an issue), and seems related (a "what next" kind of question).

I followed the #setup without a problem, but tbh, I'm lost as to what the next step is.

I've set up Lambda functions in the past whereby I route requests through API Gateway to them, but now that it's installed, how do I actually use the service? Any docs you can point me to?
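
Assuming "#setup" here means the Chromeless Proxy deployment, the usual next step is to point a Chromeless client at the deployed endpoint through the `remote` option, using the endpoint URL and API key printed at the end of the deploy. A minimal sketch follows; the environment variable names `PROXY_ENDPOINT_URL` and `PROXY_API_KEY` and the helpers `remoteOptions` and `screenshot` are placeholders made up for this example.

```javascript
// Build the Chromeless `remote` options from the deployment output.
// The variable names read from `env` are hypothetical placeholders.
function remoteOptions(env) {
  if (!env.PROXY_ENDPOINT_URL || !env.PROXY_API_KEY) {
    throw new Error('Set PROXY_ENDPOINT_URL and PROXY_API_KEY first');
  }
  return {
    remote: {
      endpointUrl: env.PROXY_ENDPOINT_URL,
      apiKey: env.PROXY_API_KEY,
    },
  };
}

async function screenshot(url) {
  // Required lazily so remoteOptions can be used without the package installed.
  const { Chromeless } = require('chromeless');
  const chromeless = new Chromeless(remoteOptions(process.env));
  try {
    // The commands run in Chrome on Lambda; the result comes back to this process.
    return await chromeless.goto(url).screenshot();
  } finally {
    await chromeless.end();
  }
}

// Example call (needs a deployed proxy and both variables set):
// screenshot('https://www.google.com').then(console.log).catch(console.error);
```

The script itself runs wherever you like (a laptop, CI, another service); only Chrome runs in Lambda, so there is no need to route your own requests through API Gateway by hand.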