cityssm / lighthouse-scans

A script to scan the City's websites for performance, accessibility, and best practice issues.
https://cityssm.github.io/lighthouse-scans/
MIT License

Crawl vs List #240

Open mgifford opened 2 years ago

mgifford commented 2 years ago

1stly - Cool!

SSM rocks.

Think this is the first time I've said this.

So does this hit a single page, crawl a whole site or just run through a list?

It isn't clear to me how much needs to be defined in your Lighthouse scan.

I'm also unclear how multi-page content is aggregated.

Sorry for not RTFMing this.

dangowans commented 2 years ago

Thanks for your interest, @mgifford. I don't think there's a manual to read, so you're forgiven. 😄

Each website builds its own list of pages to crawl using a buildConfig file, located in the sites folder. Here's the one for SaultSteMarie.ca.

import { writeConfig } from "../../utils.js";

(async () => {
  await writeConfig(
    // Pages that may not appear when crawling the website
    [
      "https://saultstemarie.ca/",
      "https://saultstemarie.ca/Search.aspx?searchtext=parks",
      "https://saultstemarie.ca/webapps/meetingMinutes.asp?type=council",
      "https://saultstemarie.ca/webapps/corporateCalendar.asp?e=true",
      "https://saultstemarie.ca/webapps/parabusCalendar.asp",
      "https://saultstemarie.ca/webapps/parksAndPlaygrounds.asp"
    ],
    // Pages that should be crawled
    [
      "https://saultstemarie.ca/"
    ],
    "saultstemarie"
  );
})();

So there are two lists of URLs. The first contains pages that may not appear when crawling the website. The second contains pages that should be crawled.

The depth of the crawl is defined in the global config file.
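To give a rough idea of what a depth-limited crawl looks like, here is a minimal sketch. It is not the project's actual crawler. The names crawl, maxDepth, and getLinks are made up for illustration, and getLinks stands in for whatever fetches a page and returns the URLs it links to.

```javascript
// Hypothetical sketch: breadth-first crawl, limited to maxDepth link hops
// from the start URLs. getLinks is an assumed async callback that returns
// the URLs linked from a given page.
async function crawl(startUrls, maxDepth, getLinks) {
  const seen = new Set(startUrls);
  let frontier = [...startUrls];

  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next = [];
    for (const url of frontier) {
      for (const link of await getLinks(url)) {
        // Only queue pages that have not been visited yet
        if (!seen.has(link)) {
          seen.add(link);
          next.push(link);
        }
      }
    }
    frontier = next;
  }

  // Start URLs first, then pages in the order they were discovered
  return [...seen];
}
```

With a depth of 1, only the start pages and the pages they link to directly are returned. A larger depth finds more pages, at the cost of a longer crawl.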

In the end, the crawled URLs are combined with the list in the build file, and a random selection of URLs is picked, up to the limit set in the config file. This ensures that the GitHub Action can complete before its time limit. For example, the City website has hundreds of pages, and scanning them all takes too long.
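The combine-and-sample step could be sketched roughly as follows. This is an illustration only, not the code in utils.js. The function name pickUrls and its parameters are assumptions, and it simply removes duplicates, shuffles, and keeps the first limit entries.

```javascript
// Hypothetical sketch: merge the listed and crawled URLs, then keep a
// random selection of at most `limit` of them.
function pickUrls(listedUrls, crawledUrls, limit) {
  // A Set drops URLs that appear in both lists
  const combined = [...new Set([...listedUrls, ...crawledUrls])];

  // Fisher-Yates shuffle, so every URL is equally likely to be kept
  for (let i = combined.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [combined[i], combined[j]] = [combined[j], combined[i]];
  }

  return combined.slice(0, limit);
}
```

Because the selection is random, a different subset of pages may be scanned on each run, so a large site gets covered gradually over many runs.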

So to do a scan on SaultSteMarie.ca, after installing the project, two scripts are run.

npm run build:website:saultstemarie
npm run test:website:saultstemarie

The build script uses the config files to build a fresh lighthouserc.json file. The test script runs Lighthouse tests on that lighthouserc.json file using lighthouse-ci.
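For anyone unfamiliar with lighthouse-ci, a generated lighthouserc.json might look something like the output of the sketch below. The function name buildLighthouseRc, the chosen categories, and the 0.9 threshold are assumptions for illustration. The collect and assert keys are standard Lighthouse CI configuration, but the project's real file may differ.

```javascript
// Hypothetical sketch: build a Lighthouse CI config object for a list of
// URLs, failing any page that scores below minScore in a category.
function buildLighthouseRc(urls, minScore) {
  return {
    ci: {
      collect: {
        url: urls,
        numberOfRuns: 1
      },
      assert: {
        assertions: {
          "categories:performance": ["error", { minScore }],
          "categories:accessibility": ["error", { minScore }],
          "categories:best-practices": ["error", { minScore }]
        }
      }
    }
  };
}

// Example: the JSON a build script could write to lighthouserc.json
const exampleRc = buildLighthouseRc(["https://saultstemarie.ca/"], 0.9);
console.log(JSON.stringify(exampleRc, null, 2));
```

With assertions at the "error" level, lighthouse-ci exits with a failure when a page falls below a threshold.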

A GitHub Action runs for each website daily. It builds and tests.

If any page in the lighthouserc.json file doesn't meet the thresholds, the GitHub Action is marked as failed. I use badges from shields.io to show the results of the last run.

Does all that make sense?