Connecting this to https://github.com/w3c/sustyweb/issues/11, as an evaluation methodology could be used within the framework of testing a potential test suite, and its testability implications could be especially useful for the auditing process.
I’ve made a fair bit of progress on this and have reported my findings in https://github.com/w3c/sustyweb/issues/29#issuecomment-1793346202; there is a lot in there about the potential for automation.
As far as website audits, there are a few interesting things to explore further, which I covered a bit in #29. I’ll add more of my perspective on two areas I think are particularly relevant for websites.
From my classification, only 107 of 232 SCs are testable without internal knowledge. I think this is potentially very problematic, as it means audits done with and without internal knowledge will be very hard to compare quantitatively. And even for the SCs I rated as "consulting tasks", the kind of knowledge they require is easily lost, as it can span multiple organizations (internal teams in a large company, or contractors).
I suspect the WSG would be much easier to adopt for websites if it could be audited retroactively after the site has gone live, which the above two aspects make very hard (auditors need internal knowledge, and the organization needs good knowledge management practices).
Speaking from my personal experience, it feels like two very different jobs: auditing a site based on publicly available information, and doing knowledge-management consultancy to audit the SCs that require organizational knowledge.
Definitely, we do have a guideline (#5.9) regarding mandatory disclosures and reporting (this covers Policies And Practices as well as Impact Reporting and Reduction) which aims to increase business transparency, reduce greenwashing, and hopefully make the information required to perform an audit publicly available - but if businesses fail to declare, it can be problematic. It's something we hope to tackle with this document.
I posted most of this to the W3C Slack and @AlexDawsonUK forwarded me here (thanks again Alex).
It is important to capture a selection of known pages (landing pages, home pages, search pages, forms), as well as a selection of random pages. This script takes large sitemaps and produces a smaller one from a random sample: https://github.com/mgifford/purple-a11y/blob/master/sitemap-tools/sitemap-randomizer.py
I am planning to use this for accessibility scanning, but it also makes a lot of sense for sustainability scanning. Unlighthouse or Sitespeed.io are good for crawling sites, but neither of them gives you a random sampling from a larger collection. It really isn’t that statistically useful to assess just the first 1000 URLs that a crawler comes across. The other thing is that when evaluating for sustainability, we want to be able to limit the computational power and bandwidth we are using to assess a site, while still providing a statistically meaningful set of representative pages.
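For illustration, here is a minimal TypeScript sketch of the same idea as the sitemap-randomizer script (assuming Node 18+ for built-in fetch; the sitemap URL and sample size are placeholders, and a real tool should use a proper XML parser and follow sitemap indexes):

```ts
// sample-sitemap.ts — rough sketch of random sitemap sampling (Node 18+).
// The sitemap URL and sample size below are placeholders, not recommendations.

async function sampleSitemap(sitemapUrl: string, sampleSize: number): Promise<string[]> {
  const xml = await (await fetch(sitemapUrl)).text();

  // Naive <loc> extraction; a real tool should use an XML parser
  // and recurse into <sitemapindex> entries.
  const urls = [...xml.matchAll(/<loc>\s*(.*?)\s*<\/loc>/g)].map((m) => m[1]);

  // Fisher–Yates shuffle, then take the first N entries.
  for (let i = urls.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [urls[i], urls[j]] = [urls[j], urls[i]];
  }
  return urls.slice(0, sampleSize);
}

sampleSitemap("https://example.com/sitemap.xml", 50).then((sample) =>
  console.log(sample.join("\n"))
);
```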
This should work for getting Google Lighthouse scores:
https://unlighthouse.dev/guide/guides/url-discovery
I'd like to see Sitespeed.io's use of CO2.js here too, but I can't see how to just use a custom Sitemap.xml file:
https://www.thegreenwebfoundation.org/news/sitespeed-io-using-and-contributing-to-co2-js/
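As far as I can tell, sitespeed.io can be pointed at a plain text file of URLs (one per line) instead of crawling, so one workaround might be to convert the sampled sitemap URLs into such a file and feed that in. A small sketch; the `--sustainable.enable` flag name is from memory and worth checking against the sitespeed.io/CO2.js docs:

```ts
import { writeFileSync } from "node:fs";

// Write the sampled URLs to a plain text file, one per line.
// sitespeed.io can (as far as I know) be run against such a file, e.g.:
//   sitespeed.io urls.txt --sustainable.enable
// (flag name from memory — please verify against the sitespeed.io docs)
function writeUrlList(urls: string[], path = "urls.txt"): void {
  writeFileSync(path, urls.join("\n") + "\n", "utf8");
}
```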
Note: the WAI offers a tool to produce reports; something we could similarly offer in the future?
Note: Work is ongoing for a supplementary document containing human and machine guidance. Similar in scope to WCAG-EM but with WCAG Techniques and a Test Suite, this will offer practical advice to both implementors and tool makers.
As it is in development (a living document), expect constant evolution (additions, changes, removals); updates will be provided once it has reached a stable enough state to be considered "settled" and thus ready for inclusion and release.
@mgifford I am running a number of scans at Sitefig for various reasons, including accessibility, and would be interested in helping out on creating auditing tools. However, I am not a fan of the WCAG sampling method as it is very biased: it relies on client references and knowledge of the website.
When the website is less than 1000 pages, which would cover most websites, it doesn't make sense to sample it, as there is a high probability of missing important pages. Connecting to analytics would be much better, as that shows active pages.
Or, as we do at Sitefig, build a complete map for the first setup, then sample important pages based on size, colors, classes, forms, JavaScript complexity, etc., then match with analytics.
This sample provides a list of URLs that more or less matches the templates in the backend.
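To make that concrete, here is a rough, hypothetical sketch of grouping URLs by a crude "template fingerprint" (form count plus the set of class names) and keeping one representative per group. This is not Sitefig's actual method, just an illustration (Node 18+ for built-in fetch):

```ts
// Hypothetical sketch: group pages by a crude template fingerprint
// and keep one URL per group. Not Sitefig's actual method.

async function fingerprint(url: string): Promise<string> {
  const html = await (await fetch(url)).text();
  const classes = new Set(
    [...html.matchAll(/class="([^"]+)"/g)].flatMap((m) => m[1].split(/\s+/))
  );
  const forms = (html.match(/<form\b/g) ?? []).length;
  // Sorted class names + form count roughly approximate the underlying template.
  return `${forms}|${[...classes].sort().join(",")}`;
}

async function representativesPerTemplate(urls: string[]): Promise<string[]> {
  const seen = new Map<string, string>();
  for (const url of urls) {
    const fp = await fingerprint(url);
    if (!seen.has(fp)) seen.set(fp, url); // first URL seen for this template
  }
  return [...seen.values()];
}
```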
The other issue with the sampling method is that a website is managed and built by different departments in a company, and those departments have different agendas and velocities of making changes to the website. Marketing updates pages based on the business agenda, while IT might have a two-month scrum cycle. Both types of updates potentially have an impact.
However, I do feel that scanning each and every page would not be compatible with the guidelines.
Additionally, I would like to look into conflicts with other areas such as WCAG, performance, and SEO.
For example, SEOs care about performance, and the DPO cares about privacy.
When the WSG recommends using a CDN, the SEO might argue against it because of Core Web Vitals, and so might the DPO, as there might be trackers attached to those CDNs.
I am sure there are more conflicts in this document. Discussing these potential conflicts might help to understand where these guidelines might create friction in web dev teams.
@astuanax Noted. I'll be sure to take your ideas into account in the draft. Hopefully we'll come up with something less clunky, as I'm planning to re-write the WCAG-EM into something which has that broad appeal and is more in line with the WSGs. I'll be sure to include a section on "pinch points", that's a great idea!
I love the idea. I used to do something like that regarding performance metrics back in the day. A good approach to metrics should help advise how to improve development. We should add this to STAG, but with the specification that, as a methodology, it covers only the impact on the user side and not on the server side, where frameworks such as the Impact Framework by the Green Software Foundation (still in alpha) can be adopted.
My point is that having a framework such as IF, or the idea proposed by @astuanax, could help to identify relevant technical indicators (3.1) and keep them monitored to build a metric to reduce the impact of the web application on both the frontend and the backend (i.e. rewriting an API which uses a lot of energy on the server, or forcing its cache in the browser to reduce the number of requests, are two faces of the same coin). In the past, I used to have performance KPIs such as:
These tools should not always be active (4.5) but turned on/off on structural changes in the code or during the QA process.
We have 2.27, 2.28, 2.29 (Incorporate * Testing Into Each Release-Cycle); should we add something similar for web development too? E.g. Incorporate Performance Testing Into Each Major Release Cycle.
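One possible shape for such a check, as a sketch only: a release-cycle test that compares the current measurements against a stored baseline and fails the build on regressions. The metric names, file names, and the 10% threshold are placeholders, not WSG requirements:

```ts
import { readFileSync } from "node:fs";

// Hypothetical release-cycle gate: compare current metrics against a stored baseline
// and fail (exit non-zero) if any metric regressed by more than 10%.
// Metric names, file names, and threshold are placeholders, not WSG values.
type Metrics = Record<string, number>; // e.g. { transferBytes: 512000, requests: 42 }

const baseline: Metrics = JSON.parse(readFileSync("baseline.json", "utf8"));
const current: Metrics = JSON.parse(readFileSync("current.json", "utf8"));

const regressions = Object.keys(baseline).filter(
  (key) => current[key] > baseline[key] * 1.1
);

if (regressions.length > 0) {
  console.error(`Regressed metrics: ${regressions.join(", ")}`);
  process.exit(1);
}
```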
@fullo STAG is now STAR, so correcting the link: https://w3c.github.io/sustyweb/drafts/star.html
It would be interesting to have a combined tool which could be used to assess multiple criteria and help developers act on them. I am looking at tools like Unlighthouse.dev and Sitespeed.io that are designed to hit every page of a website for many metrics.
This can help, but it is still hard to find actionable items in them.
For accessibility, I'm enjoying Purple A11y (previously Purple Hats)
https://github.com/GovTechSG/purple-a11y-desktop
https://github.com/GovTechSG/purple-a11y
It does build on https://crawlee.dev
I'm not sure how hard it would be to incorporate a CO2.js scan with it.
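For what it's worth, the CO2.js API itself is small, so a crawler that already records transfer sizes could probably feed them in with something like the sketch below (using the `@tgwf/co2` package's per-byte estimate); how to hook it into Purple A11y / Crawlee is the open question:

```ts
import { co2 } from "@tgwf/co2";

// Estimate grams of CO2e for a page from its transferred bytes,
// using CO2.js's Sustainable Web Design model.
const estimator = new co2({ model: "swd" });

function estimateEmissions(transferredBytes: number, greenHost = false): number {
  return estimator.perByte(transferredBytes, greenHost);
}

// e.g. a 2 MB page on a non-green host (illustrative numbers only)
console.log(`${estimateEmissions(2 * 1024 * 1024).toFixed(3)} g CO2e`);
```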
@astuanax I don't think that we need to agree on a single set of URLs.
Maybe you do an annual scan with 100% of the URLs (if you can).
Perhaps you only hit the home pages, landing pages, and form elements with your CI/CD tests.
Biweekly scans (to align with your scrum process) might want to involve a set of hand-picked URLs and a consistent selection of links which have been randomly generated at the beginning of the epic (or maybe beginning of the year).
How you generate that list of URLs is something that you may want to optimize for speed, comparability, and statistical significance.
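One way to keep that randomly generated selection stable between scans is to derive it from a fixed seed recorded at the start of the epic (or year), so every biweekly run picks the same "random" URLs and results stay comparable. A small sketch using a mulberry32-style PRNG; the seed value is just a placeholder:

```ts
// Deterministic sampling: the same seed always yields the same "random" selection,
// so recurring scans stay comparable. The seed value is only a placeholder.
function mulberry32(seed: number): () => number {
  return () => {
    seed = (seed + 0x6d2b79f5) | 0;
    let t = Math.imul(seed ^ (seed >>> 15), 1 | seed);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

function seededSample(urls: string[], size: number, seed: number): string[] {
  const rand = mulberry32(seed);
  const copy = [...urls];
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    [copy[i], copy[j]] = [copy[j], copy[i]];
  }
  return copy.slice(0, size);
}

// e.g. seededSample(allUrls, 25, 20240101) returns the same 25 URLs on every run.
```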
At some point we may develop a Sustainability Maturity Model, much like the W3C's Accessibility Maturity Model.
People at all stages will need approaches to testing.
> @astuanax I don't think that we need to agree on a single set of URLs.

No, probably not, as each site is different.

> Maybe you do an annual scan with 100% of the URLs (if you can).

Yes, Sitefig uses Lighthouse to scan for GDPR, Accessibility, SEO and does a brute-force scan + selection.
> Perhaps you only hit the home pages, landing pages, and form elements with your CI/CD tests.

> Biweekly scans (to align with your scrum process) might want to involve a set of hand-picked URLs and a consistent selection of links which have been randomly generated at the beginning of the epic (or maybe beginning of the year).

Biweekly might be aligned with scrum, but that might not align with the agenda of the marketing department. Websites are a joint effort across different departments. The DPO might select a CMS that does not play by the rules. Considering the website an IT thing is, IMO, what should be avoided, as everyone who uses it in the organisation should take at least some responsibility. Definitions of success criteria that only IT can grasp should be avoided at all costs.
This is one of the issues with WCAG. Most large organisations hand off accessibility to the IT department, while design and content are usually handled by marketing, who most of the time think it is not their responsibility.
> How you generate that list of URLs is something that you may want to optimize for speed, comparability, and statistical significance.

Yes, I agree on this, but that means taking into account changes made by all departments and their agenda/schedule. This is usually not the scrum schedule.
> At some point we may develop a Sustainability Maturity Model, much like the W3C's Accessibility Maturity Model.
> People at all stages will need approaches to testing.

Exactly, that is why it is difficult to let IT select URLs and test that in the CI pipeline. I think that CI pipelines are not the place to audit.
Marketing or Legal changes might have as much impact as other changes made by developers, and those will be missed completely by a CI pipeline because they happen when the software is used, long after the CI pipeline ran.
In a recent call with a customer, he pointed me to EcoCode, a plugin for analyzing quality impact in SonarQube. This plugin gets its tests from an open book (French/English) named Les 115 bonnes pratiques. I didn't know about EcoCode, but given its integration with a mainstream code-quality tool, I got curious.
I think we should also build the sustyweb checklist (for the technical subset, where possible) as a compatible plugin for software like SonarQube or sitespeed.io.
> Marketing or Legal changes might have as much impact as other changes made by developers and those will be missed completely by a CI pipeline because they happen when the software is used, long after the CI pipeline ran.

It is true, but we should at least remind developers that they have the responsibility to run (sometimes/automagically/on a fixed frequency) a scan of the pages to check their impact (in a CI/CD process or not) and to bring those results to an open table inside the company.
@fullo Our guidelines do integrate with that open book in many instances (we cite them within the references section).
Regarding integration, we already have primary plans for a Lighthouse plugin (#54), as integration within browser DevTools is likely to be the most useful way to reach developers, given that it is available at no cost and built into browsers. However, once STAR is complete and our testing methodology and EM are available, it shouldn't be too difficult for developers of products like those you mentioned to create their own WSG-compliant plugins.
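Until such a plugin exists, even the stock Lighthouse JSON output can be checked against sustainability-flavoured budgets. A minimal sketch: the `total-byte-weight` audit id is a standard Lighthouse audit, but the 1 MB budget is an arbitrary placeholder, not a WSG threshold:

```ts
import { readFileSync } from "node:fs";

// Minimal sketch: read a Lighthouse JSON report
// (lighthouse <url> --output=json --output-path=report.json)
// and flag pages over an arbitrary page-weight budget.
const report = JSON.parse(readFileSync("report.json", "utf8"));

const BUDGET_BYTES = 1024 * 1024; // placeholder budget, not a WSG value
const pageWeight: number = report.audits["total-byte-weight"].numericValue;

if (pageWeight > BUDGET_BYTES) {
  console.error(
    `Page weight ${(pageWeight / 1024).toFixed(0)} KiB exceeds ` +
      `${(BUDGET_BYTES / 1024).toFixed(0)} KiB budget`
  );
  process.exit(1);
}
```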
A couple of days ago the Green Software Foundation released a video about the Impact Framework; it is fostering the idea of Sustainability as Code, and I love it. This goes in the direction of this discussion.
It's taken a while, but we now have a draft evaluation methodology which takes into account WCAG-EM, along with many of the comments and criticisms of its methodology mentioned above (plus orienting it towards a digital sustainability standpoint). I've tried to align it towards full-site analysis rather than sampling, as there is legitimate criticism regarding potentially missed details and its relevance for small sites. Other aspects of the EM focus primarily on appealing to success criteria rather than targeting conformance, as the WSGs don't have a specific grading scheme like WCAG (we're more about progress than perfection).
With this in mind, it will be evolved over time and hopefully it will prove useful. It will be published with STAR in this month's scheduled release on Earth Day (22nd April) - it seemed appropriate for such a large feature release.
Create a supplement document regarding methodological considerations (guidance, walkthrough, etc.) of doing an audit. Similar to WCAG-EM, except focused on implementation & techniques rather than conformance.
Credit: @thibaudcolas