Carr1005 / lighthouse-batch-parallel

A module that helps collect websites' Lighthouse audit data in batches. Get the report data stream in CSV, JS object, or JSON format. A CLI tool is also provided to generate the report file directly in CSV or JSON format.
Apache License 2.0
29 stars · 9 forks

Impossible to use lighthouse-batch-parallel #9

Open JulienHeiduk opened 4 years ago

JulienHeiduk commented 4 years ago


I tried to use the package, but I ran into an issue. Have you seen this kind of issue before? I am looking for a way to fix it.

bryzgaloff commented 3 years ago

Is this fix published? I still have the same issue. Steps to reproduce:

  1. Create a Dockerfile:

    FROM justinribeiro/lighthouse
    ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=1
    USER root
    RUN npm i lighthouse-batch-parallel -g
    USER chrome

    Base image is this one: https://hub.docker.com/r/justinribeiro/lighthouse — it works perfectly for a single page.

  2. docker build -t lighthouse-batch-parallel .

  3. docker run --rm -it --security-opt seccomp=$HOME/seccomp-chrome.json lighthouse-batch-parallel bash

  4. echo -e 'Device,URL\ndesktop,https://tproger.ru\nmobile,https://tproger.ru' > input.csv

  5. lighthouse-batch-parallel input.csv

And nothing works then:

(node:24) UnhandledPromiseRejectionWarning: TypeError: Cannot read property 'disconnect' of undefined
    at /usr/lib/node_modules/lighthouse-batch-parallel/worker.js:81:15
    at processTicksAndRejections (internal/process/task_queues.js:97:5)
(node:24) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 1)
(node:24) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

UPD: Same issue when trying to install from GitHub. Lighthouse itself runs ok using this command lighthouse --chrome-flags="--headless --disable-gpu" https://tproger.ru.

Carr1005 commented 3 years ago

Hi @bryzgaloff, sorry that I haven't merged the fix commit, because it didn't really solve the root problem. My first thought is to try it again without skipping the Chromium installation:

~~ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=1~~

You can see that I use puppeteer.executablePath() to launch the puppeteer instance: https://github.com/Carr1005/lighthouse-batch-parallel/blob/b3a913f6ebb40ceab310a32f28a40d6899bcf667/worker.js#L39-L41 and from the puppeteer documentation:

returns: A path where Puppeteer expects to find the bundled browser. The browser binary might not be there if the download was skipped with PUPPETEER_SKIP_CHROMIUM_DOWNLOAD.
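A quick way to check this (not from the thread, just a hedged diagnostic sketch) is to ask puppeteer where it expects the bundled browser and whether the binary is actually there:

```shell
# Hypothetical diagnostic: print where puppeteer expects its bundled
# Chromium and whether that binary actually exists on disk. If puppeteer
# itself cannot be resolved, report that instead of crashing.
node -e '
try {
  const fs = require("fs");
  const p = require("puppeteer").executablePath();
  console.log(p, fs.existsSync(p) ? "(found)" : "(missing)");
} catch (e) {
  console.log("puppeteer not resolvable: " + (e.code || e.message));
}
'
```

If it prints "(missing)", the download was skipped and launching the browser fails, which would explain the `Cannot read property 'disconnect' of undefined` error above.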

In @JulienHeiduk's case, he used this module in a GCP environment, and after he manually ran:

sudo apt-get install chromium

the problem was solved. I don't know whether he had some preset on GCP that caused the Chromium installation to be omitted, or whether he also ran it inside a Docker container.

Sorry that I didn't test this module on GCP or with Docker before; I would appreciate any further feedback from you.


Also, I am not sure about the reason and context for your use of that base image; this module installs the lighthouse module itself: https://github.com/Carr1005/lighthouse-batch-parallel/blob/b3a913f6ebb40ceab310a32f28a40d6899bcf667/package.json#L20-L24

Regarding your UPD: as I remember, the Lighthouse CLI uses chrome-launcher to find an existing Chrome installation in your OS. This module, via puppeteer, downloads its own Chromium anyway (unless you set an environment variable to skip it). When I designed it, I wanted to handle the situation where users don't have any existing Chromium on their device, so it's a bit complicated for me to reason about your scenario. If there is already a working Chromium in your base image, you could find it and set it in PUPPETEER_EXECUTABLE_PATH.
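A sketch of that workaround (the binary names tried here are common defaults, not taken from this thread, and may differ in your image):

```shell
# Hypothetical: reuse a Chrome binary that already ships with the base
# image instead of puppeteer's bundled Chromium. Try a few common binary
# names and export the first one found.
CHROME_BIN="$(command -v google-chrome-stable || command -v chromium || command -v chromium-browser || true)"
if [ -n "$CHROME_BIN" ]; then
  export PUPPETEER_EXECUTABLE_PATH="$CHROME_BIN"
  echo "Using existing Chrome at: $CHROME_BIN"
else
  echo "No existing Chrome found; let puppeteer download its own Chromium" >&2
fi
```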

bryzgaloff commented 3 years ago

@Carr1005 thank you for your reply. I have decided to go with another solution using Lighthouse for CI purposes.

But I did try to use your tool within Docker, with no success. Here are the two attempts I made:

Both fail with the following error:

RUN npm install -g lighthouse-batch-parallel
 ---> Running in 56ae25bc16f2
npm WARN deprecated mkdirp@0.5.1: Legacy versions of mkdirp are no longer supported. Please update to mkdirp 1.x. (Note that the API surface has changed to use Promises in 1.x.)
npm WARN deprecated request@2.88.2: request has been deprecated, see https://github.com/request/request/issues/3142
npm WARN deprecated har-validator@5.1.5: this library is no longer supported
/usr/local/bin/lighthouse-batch-parallel -> /usr/local/lib/node_modules/lighthouse-batch-parallel/lighthouse-batch-parallel.js

> puppeteer@1.20.0 install /usr/local/lib/node_modules/lighthouse-batch-parallel/node_modules/puppeteer
> node install.js

ERROR: Failed to download Chromium r686378! Set "PUPPETEER_SKIP_CHROMIUM_DOWNLOAD" env variable to skip download.
Error: EACCES: permission denied, mkdir '/usr/local/lib/node_modules/lighthouse-batch-parallel/node_modules/puppeteer/.local-chromium'
  -- ASYNC --
    at BrowserFetcher.<anonymous> (/usr/local/lib/node_modules/lighthouse-batch-parallel/node_modules/puppeteer/lib/helper.js:111:15)
    at Object.<anonymous> (/usr/local/lib/node_modules/lighthouse-batch-parallel/node_modules/puppeteer/install.js:64:16)
    at Module._compile (internal/modules/cjs/loader.js:999:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1027:10)
    at Module.load (internal/modules/cjs/loader.js:863:32)
    at Function.Module._load (internal/modules/cjs/loader.js:708:14)
    at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:60:12)
    at internal/main/run_main_module.js:17:47 {
  errno: -13,
  code: 'EACCES',
  syscall: 'mkdir',
  path: '/usr/local/lib/node_modules/lighthouse-batch-parallel/node_modules/puppeteer/.local-chromium'
}
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! puppeteer@1.20.0 install: `node install.js`
npm ERR! Exit status 1
npm ERR! 
npm ERR! Failed at the puppeteer@1.20.0 install script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

npm ERR! A complete log of this run can be found in:
npm ERR!     /root/.npm/_logs/2021-03-01T08_22_37_919Z-debug.log
The command '/bin/sh -c npm install -g lighthouse-batch-parallel' returned a non-zero code: 1

This error can be avoided if PUPPETEER_SKIP_CHROMIUM_DOWNLOAD is set. So I kindly ask you to investigate this, as running your tool within Docker is a useful case for CI.

Carr1005 commented 3 years ago

@bryzgaloff, thanks for your feedback! It seems that good integration with Docker is worth doing; I'll try it ASAP. If you are willing to share your Dockerfile or related scripts via a gist, that would be a big help :)

Also, if you would still like to give it a last try, I am thinking of adding --unsafe-perm=true --allow-root to the npm install in your second case: echo -e 'FROM node:12\nRUN npm install -g lighthouse-batch-parallel --unsafe-perm=true --allow-root' | docker build -t lighthouse-batch-parallel -

The first line of the error message just suggests skipping puppeteer's Chromium download to avoid the permission problem when creating the download destination directory. But we DO want to download it.

ERROR: Failed to download Chromium r686378! Set "PUPPETEER_SKIP_CHROMIUM_DOWNLOAD" env variable to skip download.

Error: EACCES: permission denied, mkdir '/usr/local/lib/node_modules/lighthouse-batch-parallel/node_modules/puppeteer/.local-chromium'

There was a thread discussing the same error messages you got while installing puppeteer. To make a long story short, I think it's about how npm install behaves when run as different users. If you are interested, here is a good article explaining it.

Carr1005 commented 3 years ago

@bryzgaloff, just for your information, I looked into the Dockerfile of justinribeiro/lighthouse, and you can see that it downloads a Chrome application for Linux directly:

https://github.com/justinribeiro/dockerfiles/blob/8e60d235551a04b2095d41509d88668850ebccfa/lighthouse/Dockerfile#L28-L38

Like I said before, the Lighthouse CLI uses chrome-launcher to search for an existing Chrome in the file system, so that image can work without the problem we have here. I am still not sure what you tried to achieve with this:

$ curl https://raw.githubusercontent.com/justinribeiro/dockerfiles/master/lighthouse/Dockerfile | sed 's/npm install -g lighthouse/npm install -g lighthouse-batch-parallel/g' | docker build -t lighthouse-batch-parallel

What I can suggest is that you follow the pattern of that Dockerfile: download Chrome directly, set PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=1 to skip puppeteer's Chromium download during the installation of this tool, find the path to the Chromium binary, and set it in PUPPETEER_EXECUTABLE_PATH; then puppeteer can find and launch it via puppeteer.executablePath().
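A sketch of that pattern as a Dockerfile (untested here; the Chrome install steps and the binary path are assumptions modeled on images like justinribeiro/lighthouse, not taken from this thread):

```dockerfile
# Hypothetical sketch: install Chrome from Google's apt repository,
# skip puppeteer's own Chromium download, and point puppeteer at the
# installed binary.
FROM node:12

RUN apt-get update && apt-get install -y wget gnupg \
    && wget -q -O - https://dl.google.com/linux/linux_signing_key.pub | apt-key add - \
    && echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" \
       > /etc/apt/sources.list.d/google.list \
    && apt-get update && apt-get install -y google-chrome-stable \
    && rm -rf /var/lib/apt/lists/*

# Skip the bundled Chromium and use the system Chrome instead.
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=1
ENV PUPPETEER_EXECUTABLE_PATH=/usr/bin/google-chrome-stable

# --unsafe-perm keeps puppeteer's install script running as root.
RUN npm install -g lighthouse-batch-parallel --unsafe-perm=true
```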

bryzgaloff commented 3 years ago

Hi @Carr1005, thank you for the extra details. As I mentioned, I have decided to go with another solution, but I still tried yours to help you :)

Unfortunately, installing your tool on top of the pure node:12 image fails in the same way:

$ echo -e 'FROM node:12\nRUN npm install -g lighthouse-batch-parallel --unsafe-perm=true --allow-root' | docker build -t lighthouse-batch-parallel -
Sending build context to Docker daemon  2.048kB
Step 1/2 : FROM node:12
 ---> e0782a1551ac
Step 2/2 : RUN npm install -g lighthouse-batch-parallel --unsafe-perm=true --allow-root
 ---> Using cache
 ---> d1404b0b76a1
Successfully built d1404b0b76a1
Successfully tagged lighthouse-batch-parallel:latest

$ docker run --rm -it lighthouse-batch-parallel:latest bash
root@---:/# echo -e 'Device,URL\ndesktop,https://tproger.ru/' > input.csv
root@---:/# lighthouse-batch-parallel input.csv 
(node:19) UnhandledPromiseRejectionWarning: TypeError: Cannot read property 'disconnect' of undefined
    at /usr/local/lib/node_modules/lighthouse-batch-parallel/worker.js:81:15
    at processTicksAndRejections (internal/process/task_queues.js:97:5)
(node:19) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 1)
(node:19) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

The first line of the error message just suggests avoiding the installation of puppeteer to avoid the permission problem for making a directory as a destination for download. … … I am still not sure about what you tried to achieve by this: …

My idea was to use a ready-to-use image with Lighthouse (justinribeiro/lighthouse), assuming it has all dependencies properly installed, and to install your tool on top of it as a simple wrapper. However, it looks like that does not work for some reason.

In general, I highly recommend you put some effort into making your awesome tool work with Docker. Thank you!

Carr1005 commented 3 years ago

@bryzgaloff, thank you so much for trying! I am sorry that it took your time without helping you. I'll make this tool work with Docker ASAP. Again, thanks for pointing out the problems, which give me the chance to make this tool better.