GoogleChrome / lighthouse

Automated auditing, performance metrics, and best practices for the web.
https://developer.chrome.com/docs/lighthouse/overview/
Apache License 2.0

Convert Accessibility Gatherer to streaming results #11028

Open mikedijkstra opened 4 years ago

mikedijkstra commented 4 years ago

When I run Lighthouse (on a URL I can't share, as it is private and requires cookie authentication), it throws a PROTOCOL_TIMEOUT error during the Accessibility Gatherer. Ultimately, the call to driver.evaluateAsync with the axe expression doesn't return in time and the protocol timeout is triggered.

This is a really hard issue to debug, as it only happens when I run in Docker (not on my local Mac), so I can't easily see what's happening in the browser or why it's hanging. This isn't the first time I've come across an issue like this, and we've been able to continue testing a site by skipping these audits.

Rather than throwing a PROTOCOL_TIMEOUT and cancelling the whole run when the accessibility gatherer fails like this, would you be open to a PR that handles this more gracefully? That would allow the Lighthouse run to finish without any accessibility score or audits, as it would if the "No axe-core results returned" error were thrown.
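A minimal sketch of the kind of graceful handling proposed above, not Lighthouse's actual implementation. The names `withTimeout`, `gatherAccessibility`, and `runAxe` are hypothetical; `runAxe` stands in for the in-page axe evaluation. The approach only helps when the evaluation hangs while the browser itself stays responsive.

```javascript
// Unique marker so a legitimate result is never mistaken for a timeout.
const TIMED_OUT = Symbol('timed out');

// Hypothetical helper: resolves with TIMED_OUT if `promise` has not settled within `ms`.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise(resolve => {
    timer = setTimeout(() => resolve(TIMED_OUT), ms);
  });
  // Clear the timer either way so it cannot keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical gatherer step: returns an artifact with an error message
// instead of throwing, so the rest of the run can continue.
async function gatherAccessibility(runAxe, timeoutMs) {
  try {
    const result = await withTimeout(runAxe(), timeoutMs);
    if (result === TIMED_OUT) {
      return {error: 'Accessibility gathering timed out', violations: null};
    }
    return {error: null, violations: result.violations};
  } catch (err) {
    return {error: err.message, violations: null};
  }
}
```

With this shape, a caller can check `artifact.error` and report the accessibility category as unavailable rather than aborting the whole run.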

Environment Information

patrickhulce commented 4 years ago

Thanks for filing @mikedijkstra! Part of the problem here is that if axe hangs all of Chrome, we can't really recover very gracefully. axe isn't the last thing we do so if Chrome stops responding we kind of have to give up.

I agree we can try to handle this more gracefully though.

Rough Idea:

benschwarz commented 4 years ago

> axe isn't the last thing we do so if Chrome stops responding we kind of have to give up.

Can it become the last operation of the Lighthouse suite?

> Part of the problem here is that if axe hangs all of Chrome

Do we have a clear idea on how regularly it hangs the entire browser?

axe seems to have regular compatibility issues and breakages that jeopardize the ability for users to have successful tests on certain sites. It'd be great to make Lighthouse more resilient to those possible failures.

patrickhulce commented 4 years ago

> Can it become the last operation of the Lighthouse suite?

Not really. There are passes that must happen after the first pass and axe must run on the fully loaded page. Even if we could make it the last gatherer, there would always be general cleanup.

> Do we have a clear idea on how regularly it hangs the entire browser?

I would not expect it to be irrecoverable very often. I'd expect killing the isolated context to get the job done in most cases.

connorjclark commented 4 years ago

We talked before about running axe in chunks, so that a bad axe check doesn't kill the entire artifact. axe does some shared pre-work on each run, so the trick is breaking it up without increasing total time (a little bit would be OK).

Even better, it'd be cool to introduce a streaming API within axe itself.

patrickhulce commented 4 years ago

Yeah, I think just splitting out color-contrast has huge potential. IIRC that's frequently the source of the hang-forever problems, and we could still get 95% of the a11y category done.
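The chunking idea discussed above might look roughly like the sketch below. It is illustrative only: `runAxeInChunks` and `runRules` are hypothetical names, and `runRules(ruleIds)` stands in for an in-page call that restricts axe to the given rules (for example via axe-core's `runOnly` option with `type: 'rule'`). It does not address the shared pre-work cost mentioned earlier in the thread.

```javascript
// Hypothetical: runs each chunk of rule IDs separately so that one failing
// chunk (e.g. color-contrast) does not discard the results of the others.
async function runAxeInChunks(ruleChunks, runRules) {
  const violations = [];
  const failedChunks = [];
  for (const ruleIds of ruleChunks) {
    try {
      const result = await runRules(ruleIds);
      violations.push(...result.violations);
    } catch (err) {
      // Record the failure and keep going with the remaining chunks.
      failedChunks.push({ruleIds, message: err.message});
    }
  }
  return {violations, failedChunks};
}

// Example chunking: color-contrast isolated in its own chunk.
const exampleChunks = [['image-alt', 'label'], ['color-contrast']];
```

Isolating `color-contrast` means a failure there costs only that one rule, at the price of a second axe run.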

patrickhulce commented 3 years ago

Two parts to this issue:

connorjclark commented 2 years ago

Not sure if streaming is the best approach here, but I wrote some notes on how we might do it in the Axe gatherer:

in page: function*() { runs axe; sometimes yields a single axe check result }
get a RemoteObject ref to the generator
  or: keep it around via page state (window.___generator) if we can't use the RemoteObject for some reason?
in a timeout:
  Gathering land: call a page function that calls the generator; keep calling .evaluate until there are no more yields
  if a protocol timeout happens, we stop
return artifact

We should be able to do all of that without changing the gathering phase API. But if we wanted to, we could change collectPhaseArtifacts to handle generators...
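The notes above could be sketched roughly as follows, in plain JavaScript with the protocol layer stubbed out. All names here are hypothetical: `axeResultGenerator` plays the role of the in-page generator, and `evaluateNext` stands in for a protocol call that invokes `generator.next()` in the page and returns its `{done, value}` result. Nothing here reflects how Lighthouse or axe actually implement this.

```javascript
// In-page side (hypothetical): yields one rule result at a time.
function* axeResultGenerator(ruleIds, runRule) {
  for (const ruleId of ruleIds) {
    yield runRule(ruleId);
  }
}

// Gathering side (hypothetical): keeps pulling results until the generator
// is done or a single call exceeds the timeout; partial results are kept.
async function collectStreamedResults(evaluateNext, perCallTimeoutMs) {
  const results = [];
  let timedOut = false;
  while (true) {
    let timer;
    const timeout = new Promise(resolve => {
      timer = setTimeout(() => resolve({timedOut: true}), perCallTimeoutMs);
    });
    let step;
    try {
      step = await Promise.race([evaluateNext(), timeout]);
    } finally {
      clearTimeout(timer);
    }
    if (step.timedOut) {
      timedOut = true;
      break;
    }
    if (step.done) break;
    results.push(step.value);
  }
  return {results, timedOut};
}
```

The artifact would then carry whatever results arrived before the timeout, plus a flag indicating that the list may be incomplete.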