sitespeedio / plugin-lighthouse

Lighthouse plugin for sitespeed.io
MIT License
30 stars 19 forks source link

Add support for --preScript #13

Closed gidztech closed 5 years ago

gidztech commented 5 years ago

Browsertime supports a preScript flag, which gives you the opportunity to use Selenium to login to your application before running an audit. However, the flag doesn't get used by the Lighthouse plugin.

Side note: Prior to finding this out, I was struggling to debug what was happening with the Lighthouse audit. I was just getting null back for most things. I logged an https://github.com/sitespeedio/plugin-lighthouse/issues/12 separately for improving debugging of the plugin.

soulgalore commented 5 years ago

Hi @gidztech at the moment we use Lighthouse as a standalone application, so that Lighthouse opens an own Chrome instance (after Browsertime is finished). To make it happen, we need to run the Lighthouse tests with the same browser instance as Browsertime ... I think maybe you partly can do that but not 100% sure (I don't know exactly how the Lighthouse tests runs).

Best Peter

gidztech commented 5 years ago

@soulgalore We're actually currently using chrome-launcher to run an instance of Chrome, and then connecting Lighthouse to it via the port exposed by the launcher.

I think it should (hypothetically) be possible to re-use the Browsertime instance. The idea is we keep Browsertime open until Coach, Lighthouse etc. have finished executing. I don't know the technicals behind this, but maybe each plugin tells SitespeedIO when it's finished, and then as soon as all pending plugins are done, it tells Browsertime to close it's session.

The Lighthouse plugin would just pass the port exposed by Browsertime to Lighthouse.

There could be another option that I'm not currently aware of. But being able to use the pre-script for the Lighthouse audit is an important feature to have.

soulgalore commented 5 years ago

Hi @gidztech it will not possible to add to the current instance that Browsertime is using, because then we need to integrate Lighthouse directly into Browsertime and will not happen, I'll will explain later on why.

But first technically how it works now: Browsertime runs standalone and keep track of track of Chrome, so it starts/stops Chrome through the webdriver (that then use Chromedriver). Then you can feed Browsertime JavaScript that runs when the page has loaded (that's how the Coach adds it JavaScript). So when Browsertime has finished the tests, it closes the browser. I think Lighthouse do multiple runs through the same browser (open/closing the instance right) to try out things like no connectivity etc.

The way to make it work in the future is that Lighthouse adds support for something like scripting (like we have today with WebPageTest) so it's totally separate, that will make it cleaner too.

Personally I don't think integrating Lighthouse into Browsertime is a good idea, since Lighthouse/Chrome/Google is the opposite of what we try do with the sitespeed.io: Google supporting Trump, making money out of users in shady ways etc. I understand some devs don't care but for me its important. But feel free to fork and integrate it.

Best Peter

gidztech commented 5 years ago

@soulgalore I'm not seeing why you'd need to bring Lighthouse directly into Browsertime to achieve what I'm asking, but maybe I'm missing something. The basic concept I had was that Browsertime just needs to be told that there are some extra steps to wait for, not just Coach.

Browsertime doesn't need to know about the internals of Lighthouse or require any package dependency. Instead it triggers each Browsertime plugin (Coach, Lighthouse, Foo, Bar, etc.) to execute in sequence, and then when all have finished executing, then it can close the browser. You just need to pass disableStorageReset: true to Lighthouse (via the Lighthouse plugin) to make sure Lighthouse doesn't clear the cookies/storage you've set prior to running the audit.

There was some work done a while ago in Lighthouse to pass in `extraHeaders. This allows you to pass in cookies, but you need to actually create the cookies in the first place, which involves some script (even if it's not via a browser). My understanding is people currently use Puppeteer or similar to run a login flow before Lighthouse audits.

The problem with separating the two is it would mean having to have two separate CI flows. The first for SitespeedIO using Coach and Browsertime, and the other for Puppeteer and Lighthouse. Being able to see both sets of scores in one report, with graphs and historic stats would be the ultimate goal.

Let me know what you think.

soulgalore commented 5 years ago

We been discussing adding plugins in browsertime (like we have in sitespeed.io), it could be useful for a lot of things (taking care of the result, sending metrics etc) but then kept Browsertime raw and doing that in sitespeed.io.

However maybe there's other ways of doing it within the plugin? Before we had prescripts users did hacks like https://github.com/edx/edx-sitespeed to login the user and adding a header to the tests. Another way in the future could be to just use Puppeteer as in https://github.com/GoogleChrome/lighthouse/issues/3837 when it works, then there could be a param to add your own pre script.

gidztech commented 5 years ago

I came up with a solution if we wanted to support a pre-script independently for the Lighthouse plugin. It's a little bit hacky though, so there might be some improvements we can make.

We alter the Lighthouse plugin to support a new flag like --lighthousePreScript, which takes a path to a Node script. Using Puppeteer, we launch Chrome and pass the WebSocket endpoint to that script.

// puppeteer launch above
if (lighthousePreScript) {
    await exec(`node ${lighthousePreScript} --wsEndpoint ${browserWSEndpoint}`);
}
// lighthouse launch below

In my own script, I can connect to Puppeteer, run my login steps, and then disconnect from it.

  1. Install the npm dependencies needed to run the automation. We need to create a temp node package to install the dependencies to. I found I needed to use yarn because of NPM permissions with root user I think (there's probably something we can do to fix that).
await exec('npm install -g yarn && npm init --yes && yarn add minimist puppeteer', { cwd: TMP });
  1. Grab the wsEndpoint argument that was passed down to the script, and connect Puppeteer
const minimist = require('minimist');
const { wsEndpoint } = minimist(process.argv.slice(2));
const browser = await puppeteer.connect({
    browserWSEndpoint: args.wsEndpoint
});
  1. Run my login steps (e.g. page.goto, page.click, etc. commands)
 await loginSteps(browser);
  1. Disconnect Puppeteer
    browser.disconnect();

At this point, the node script will exit and Lighthouse will run. Since the cookie has been set for the domain, when Lighthouse tries to access the URL, it will be logged in already, much like the Selenium version with Browsertime.

As you can see, I managed to get the audit to work with this:

lighthouse

Clearly there's some improvements needed to the app I ran it on! Looks like the app I'm running the audit on is not good performance-wise on a throttled connection. PS. The PWS score doesn't seem to be rounded.

soulgalore commented 5 years ago

Hi @gidztech ah cool, this works better for me :) If you have time, do a PR and then we can try to fine tune it together? What would help me would also be that we add a test in Travis so we test the pre-script functionality too, so we know we don't break it.

Best Peter

softwareklinic commented 2 years ago

Does this approach work? - using lighthouse prescript -- with sitespeed