Better docs on writing custom audits

kahunacohen commented 6 years ago

Feature request summary: I'd like better docs explaining how to write custom audits. Yes, I've seen the docs, but 1) your custom audit example gives an error in the gatherer, and 2) I'd like high-level instructions on how to write arbitrary audits. Are there API docs for this?

I am willing to contribute these docs if I can figure out how to write my own gatherers and audits. I've read the source code and I still find it very confusing. Axe audits vs. regular audits? Am I supposed to use the Chrome debugging API? What about just simple querying of the DOM?

My organization would like to adopt Lighthouse, but we need to write our own tests.

Others could use clearer docs without having to delve into the source code.

kahunacohen commented 6 years ago

This is the error when running the custom audit as per the docs: Caught exception: Required TimeToSearchable gatherer encountered an error: Unable to find load metrics in page

This is the command I entered: $ lighthouse --chrome-flags="--no-sandbox --headless" --config-path=./custom-config.js https://www.ncbi.nlm.nih.gov

Again I would be happy to contribute better docs for custom runs...

patrickhulce commented 6 years ago

Thanks for reporting @kahunacohen we'd be happy to have help improving the docs!

The example, though, is just showing you how to create a gatherer for a fictional searchable component. It assumes that you already have some similar metric in the page that you care about and want to collect. window.myLoadMetrics is not a real thing; it's a stand-in for whatever load metrics you want to collect. If there's nothing about the page you want to measure on your own, then a custom gatherer isn't necessary and you can probably just create custom audits with the existing artifacts :)
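To make the reported error concrete, here is a minimal sketch of what a gatherer along the lines of the example does when the page never defines the stand-in global. The `searchableTime` field and the `fakeDriver` object are assumptions for illustration; a real gatherer would extend the `Gatherer` class from the `lighthouse` package and receive a real driver.

```javascript
// Validates the value the page was expected to expose as window.myLoadMetrics.
function validateLoadMetrics(loadMetrics) {
  if (!loadMetrics || loadMetrics.searchableTime === undefined) {
    // Pages that never set the stand-in global end up here.
    throw new Error('Unable to find load metrics in page');
  }
  return loadMetrics;
}

// Plain class for illustration; a real gatherer would extend Lighthouse's Gatherer.
class TimeToSearchable {
  afterPass(options) {
    return options.driver
      .evaluateAsync('window.myLoadMetrics')
      .then(validateLoadMetrics);
  }
}

// Hypothetical driver simulating a page that never defined window.myLoadMetrics.
const fakeDriver = {evaluateAsync: () => Promise.resolve(undefined)};

new TimeToSearchable()
  .afterPass({driver: fakeDriver})
  .catch(err => console.log(err.message)); // Unable to find load metrics in page
```

On a site that does not set `window.myLoadMetrics`, the rejection above is the expected outcome rather than a bug in the gatherer.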

I think there's a good opportunity to document what artifacts we already have available and provide some examples of custom audits one could write without needing a new gatherer. Are there any other ideas you have or particular parts you're stuck on that need improving?

kahunacohen commented 6 years ago

Thanks for the response @patrickhulce . This is great!

We have had a somewhat similar home-grown framework for our organization for years. I just started rewriting it using headless Chrome/Puppeteer to replace our PhantomJS version and came across Lighthouse. I'd rather join forces.

I'd like to create a self-contained document for someone wanting to write custom tests. I did successfully write a few custom tests. Here are two custom audits written at NCBI (National Center for Biotechnology Information). The ncbi_app audit tests for a meta tag that our logging framework requires. The links-in-buttons audit catches markup that can cause rendering issues in some browsers.

We have many other tests I'd like to write that are not covered by LH but are probably too specific to our sites. First, I'm going to link to the gists. Perhaps you could provide a code review, which would help me better understand some things:

ncbi-config.js: https://gist.github.com/kahunacohen/4f2bbfd78aa2d7dc1d9f71a8237645b3
ncbi_app-gatherer.js: https://gist.github.com/kahunacohen/a623d4597f2d60a7ce905d7cc506b93d
ncbi_app-audit.js: https://gist.github.com/kahunacohen/8dc17c549c65f22284bfd7dce6164560
links-in-buttons-gatherer: https://gist.github.com/kahunacohen/5e30a95d41a33b1b365bfaaefb9c94df
links-in-buttons-audit: https://gist.github.com/kahunacohen/9bc12124e29ee660e3925393eb94c1db

General feedback appreciated. Some specific questions:

So are you saying gatherers are not always required? It does seem odd to me to always separate the two. Audits are often left with just metadata, but I do understand the separation of concerns. Reasons why you separated the two?
Why only send strings to driver.evaluateAsync? It seems odd to have to define code as a string only to have it evaluated. Why not pass a function to evaluateAsync?
What artifacts are available to the audits by default? Are they sent without providing a gatherer? Are there docs on this? E.g., let's say I want to write an audit with info from the DOM? How do I do this other than the way I did? What about another accessibility test? How would I do that?
What's up with the axe audits? Did you guys just take over their project?

Thanks for any feedback. Also, it might be useful to have auto-generated API docs from the Lighthouse code. I would be willing to do that too...

patrickhulce commented 6 years ago

That's awesome! There's a lot to go through, so I'll try to respond inline. I also left some comments on those gists :)

So are you saying gatherers are not always required?

Correct, if you have audits that look at images, or network requests, or console messages (errors, deprecations, warnings, console.log, etc) for example, all of that is already available in the default set of artifacts so there's no need to try to collect those yourself.

Reasons why you separated the two?

There are a few reasons. Separation of concerns is certainly one of them. The clean break between gathering (aka "talking to the browser about the page") and auditing (aka "passing judgement on what we gathered about the page") also means less repetitive audits doing duplicate work to get network information and fewer round trips communicating with the browser. You can also do cool things like save the information about the page in a particular state to a server somewhere and then run audits much later, even with different versions of the code! Cleanliness, performance, functionality.
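A rough sketch of that split, not Lighthouse's actual implementation: gathering produces plain serializable data once, and audits are pure functions that can run later against the saved copy. The artifact names (`ConsoleErrors`, `ImageCount`), the audit ids, and the temp-file path are all hypothetical.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Gathering phase: in Lighthouse this talks to the browser; here the result is hard-coded.
function gather() {
  return {
    ConsoleErrors: ['Uncaught TypeError: x is not a function'],
    ImageCount: 12,
  };
}

// Auditing phase: pure functions over the saved artifacts, no browser needed.
const audits = {
  'no-console-errors': artifacts => artifacts.ConsoleErrors.length === 0,
  'few-images': artifacts => artifacts.ImageCount <= 20,
};

function runAudits(artifacts) {
  const results = {};
  for (const [id, audit] of Object.entries(audits)) {
    results[id] = audit(artifacts) ? 'pass' : 'fail';
  }
  return results;
}

// Save the artifacts once...
const artifactsPath = path.join(os.tmpdir(), 'example-artifacts.json');
fs.writeFileSync(artifactsPath, JSON.stringify(gather()));

// ...and audit them later, possibly with a newer version of the audits.
const savedArtifacts = JSON.parse(fs.readFileSync(artifactsPath, 'utf8'));
console.log(runAudits(savedArtifacts));
```

Because the audits only ever see the saved JSON, both of them share one gathering pass, and changing an audit does not require reloading the page. The Lighthouse CLI's gather mode (`-G`) and audit mode (`-A`) flags offer the real version of this workflow.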

Why only send strings to driver.evaluateAsync?

Whatever gets sent over evaluateAsync needs to be serialized and sent over a websocket as a string, so passing in an arbitrary function can be misleading: the arbitrary state/closures you might expect to be there won't be sent along with it. As long as you use only locals/window, you can definitely stringify your functions at the very last step, like below:

const OUTSIDE_VAR = 1;

function myCoolFunction() {
  // This is executed in the browser.
  // `document` works, but you can't use OUTSIDE_VAR!
  const text = document.body.textContent;
  return text.trim().replace(/whatever/, 'you-want');
}

// This is executed in node. Wrapping the stringified function in parentheses
// keeps it a valid call expression wherever it ends up being evaluated.
driver.evaluateAsync(`(${myCoolFunction.toString()})()`);
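The closure problem can be reproduced without a browser. The sketch below uses Node's `vm` module as a stand-in for the page, which is an assumption for illustration only (the real call goes over the debugging protocol). `evaluateInFakePage`, `usesClosure`, and `usesArgument` are hypothetical names.

```javascript
const vm = require('vm');

const OUTSIDE_VAR = 1;

// Relies on a variable from the surrounding Node scope.
function usesClosure() {
  return OUTSIDE_VAR + 1;
}

// Receives everything it needs as an argument instead.
function usesArgument(value) {
  return value + 1;
}

// Stand-in for the page: a fresh context that cannot see Node-side variables.
function evaluateInFakePage(expression) {
  return vm.runInNewContext(expression, {});
}

let closureError = null;
try {
  evaluateInFakePage(`(${usesClosure.toString()})()`);
} catch (err) {
  // Only the function's source text was sent, so OUTSIDE_VAR does not exist there.
  closureError = err.name;
}

// Values can still be sent along by serializing them into the expression.
const result = evaluateInFakePage(
  `(${usesArgument.toString()})(${JSON.stringify(OUTSIDE_VAR)})`
);

console.log(closureError); // ReferenceError
console.log(result); // 2
```

Serializing arguments with `JSON.stringify` works for plain data only; functions, DOM nodes, and class instances will not survive the trip.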

What artifacts are available to the audits by default? Are they sent without providing a gatherer?

Here are the artifacts that are available by default. You can add any of those names to the requiredArtifacts array in your audit and they'll be there for you to use: https://github.com/GoogleChrome/lighthouse/blob/36ba984589de39372cb9caf15d8fc31198c04dce/typings/artifacts.d.ts#L48-L122
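As a sketch of a gatherer-free audit, assuming the `URL` artifact with a `finalUrl` field is among the defaults in your Lighthouse version: the class below is a plain class so it runs on its own, whereas a real audit would extend `Audit` from the `lighthouse` package. The audit id is hypothetical, and the shape of the returned result is an assumption, since the expected fields differ between Lighthouse versions.

```javascript
// Plain class for illustration; a real audit would extend Audit from the lighthouse package.
class HttpsFinalUrlAudit {
  static get meta() {
    return {
      id: 'https-final-url', // hypothetical audit id
      title: 'Final URL is served over HTTPS',
      failureTitle: 'Final URL is not served over HTTPS',
      description: 'Checks the protocol of the URL the page ended up on.',
      // Default artifact requested by name; no custom gatherer involved.
      requiredArtifacts: ['URL'],
    };
  }

  static audit(artifacts) {
    const isHttps = new URL(artifacts.URL.finalUrl).protocol === 'https:';
    // Result shape is an assumption; check the docs for your Lighthouse version.
    return {score: isHttps ? 1 : 0};
  }
}

// Hand-written artifacts standing in for what Lighthouse would pass in.
console.log(
  HttpsFinalUrlAudit.audit({URL: {finalUrl: 'https://www.ncbi.nlm.nih.gov/'}})
);
```

The audit never touches the browser: it only reads the artifact it declared in `requiredArtifacts`.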

let's say I want to write an audit with info from the DOM? How do I do this other than the way I did?

Because there's so much DOM information, if you want to check something pretty specific about the DOM, you'll probably need to write a gatherer yourself. That said, after looking through your specific audits, there's probably a good opportunity for us to make some of our gatherers more generic to solve your use case. For example, we already look at several meta tags and most links, so we could probably expose generic artifacts representing that data, and you wouldn't need gatherers.
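For a DOM-specific check such as a required meta tag, a gatherer and audit pair might look roughly like this. It is a sketch under assumptions: the classes are plain so the example runs without Lighthouse, the meta tag name `ncbi_app` and its value are guesses based on the audit's name, and `makeFakeDriver` is a hypothetical stand-in for the real driver that uses Node's `vm` module and a fake `document`.

```javascript
const vm = require('vm');

// Runs in the page: returns the content of a meta tag, or null if it is missing.
function getMetaContent(name) {
  const el = document.querySelector(`meta[name="${name}"]`);
  return el ? el.getAttribute('content') : null;
}

// Plain class for illustration; a real gatherer would extend Lighthouse's Gatherer.
class MetaTagGatherer {
  afterPass(options) {
    const expression =
      `(${getMetaContent.toString()})(${JSON.stringify('ncbi_app')})`;
    return options.driver.evaluateAsync(expression);
  }
}

// The matching audit only judges the artifact; it never touches the page.
function auditMetaTag(content) {
  return {score: content ? 1 : 0};
}

// Hypothetical driver: evaluates the expression against a fake document.
function makeFakeDriver(metaTags) {
  const document = {
    querySelector: selector => {
      const match = /meta\[name="(.+)"\]/.exec(selector);
      const name = match && match[1];
      return name in metaTags ? {getAttribute: () => metaTags[name]} : null;
    },
  };
  return {
    evaluateAsync: expr => Promise.resolve(vm.runInNewContext(expr, {document})),
  };
}

new MetaTagGatherer()
  .afterPass({driver: makeFakeDriver({ncbi_app: 'pubmed'})})
  .then(content => console.log(auditMetaTag(content)));
```

Keeping the page function free of outside references is what makes it safe to stringify, and returning `null` for a missing tag lets the audit fail cleanly instead of throwing.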

What's up with the axe audits? Did you guys just take over their project?

We haven't taken over anyone's project, but we really appreciate their work for sure! We bundle axe-core with Lighthouse and created audits for most of their rules. We haven't made any LH-specific modifications to the library; we just run it as-is from node_modules and report what it found. That's why you'll see us refer any accessibility bugs to their repo. If you wanted to enhance the accessibility audits, adding a rule to their repo would have a huge impact on the entire community for sure!

Hope I got to everything, good questions!

AymenLoukil commented 6 years ago

Hello,

I just wrote a tutorial on how to create a custom audit. It measures how quickly a page's hero image loads, using the User Timing API. I would love to hear your comments: https://www.aymen-loukil.com/en/blog-en/google-lighthouse-custom-audits/ @patrickhulce @kahunacohen

patrickhulce commented 6 years ago

Nice @AymenLoukil! You could probably make a few tweaks to the code from the stock recipe we have to make it a little more obvious what's possible :)

For example, in your custom gatherer, do .evaluateAsync("window.performance.getEntriesByType('mark')[0].startTime;") instead of the myCustomMetric business. Then explain that it could be any JavaScript evaluation, so you don't need to make that many changes to your production code. LH can do some work too!

AymenLoukil commented 6 years ago

Thank you @patrickhulce! Good point! I wanted to give a very simple example for the tutorial. I added that the gatherer is able to do much more ;) I tried to do it but got an exception with .evaluateAsync("window.performance.getEntriesByType('mark')[0].startTime;")
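The thread does not say what the exception was. One possible cause, assumed here, is a page with no user timing marks at evaluation time, in which case `[0]` is undefined and reading `startTime` throws. A defensive version of the page expression could look like this sketch, where `evaluateInFakePage` is a hypothetical stand-in for `driver.evaluateAsync` built on Node's `vm` module, and the mark name and time are made up.

```javascript
const vm = require('vm');

// Runs in the page: returns the first user timing mark's start time, or null if none exist.
function getFirstMarkStartTime() {
  const marks = window.performance.getEntriesByType('mark');
  return marks.length > 0 ? marks[0].startTime : null;
}

const expression = `(${getFirstMarkStartTime.toString()})()`;

// Hypothetical stand-in for driver.evaluateAsync, backed by a fake window object.
function evaluateInFakePage(expr, marks) {
  const window = {performance: {getEntriesByType: () => marks}};
  return Promise.resolve(vm.runInNewContext(expr, {window}));
}

evaluateInFakePage(expression, [])
  .then(value => console.log(value)); // null
evaluateInFakePage(expression, [{name: 'hero', startTime: 812.5}])
  .then(value => console.log(value)); // 812.5
```

Returning `null` lets the gatherer decide whether a missing mark is an error or simply "not measured" for that page.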

GoogleChrome / lighthouse

Better docs on writing custom audits #6392