w3c / sustyweb

Sustainable Web Design Community Group
https://www.w3.org/community/sustyweb/

Manual VS Automated Testing #29

Closed AlexDawsonUK closed 7 months ago

AlexDawsonUK commented 1 year ago

Linked to https://github.com/w3c/sustyweb/issues/11: provide text indicators (within the WSGs) for which guidelines can be tested automatically using a testing tool, and which guidelines require manual testing because they are outside the scope or capability of automation.

Credit: @thibaudcolas

marvil07-adapt commented 1 year ago

First, thanks for the WSG draft, it is shaping up to be a great overall guide for sustainability on the web!

It would be great to have tooling around some of the guidelines, and it seems like the first step is to identify what can be automated.

Below is a first pass at identifying whether automated testing can be used to indicate compliance with each of the WSGs. Naturally, once metrics are associated with them, as expected from #11, there may be changes to make.

A yes is indicated when it is clear that some metrics may be available to automate the evaluation of the success criteria. A no is indicated when no automated testing may be applicable. A partial is indicated when only a subset of the success criteria could be evaluated in an automated way.

The hints column uses a few tags that provide extra context for the automated testing value chosen.
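
As an aside, to make the ratings more concrete: a fully automatable "yes" could translate into a check along the lines of the sketch below, here using 4.3 Compress Your Files as the target. This is only a rough, hypothetical sketch; the page list is a placeholder and the pass/fail logic is my own guess, not anything defined by the WSGs.

```python
# Hypothetical sketch of an automatable check for WSG 4.3 "Compress Your Files":
# request each page and flag responses served without a compressed encoding.
# The page list is a placeholder; real tooling would also cover CSS/JS/SVG assets.
import requests

PAGES = ["https://example.com/"]  # pages under test (placeholder)

for url in PAGES:
    response = requests.get(url)  # requests advertises gzip/deflate support by default
    encoding = response.headers.get("Content-Encoding", "identity")
    if encoding in ("gzip", "br", "deflate", "zstd"):
        print(f"PASS {url}: Content-Encoding={encoding}")
    else:
        print(f"FAIL {url}: served without compression ({len(response.content)} bytes)")
```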

| automated testing | WSG | hints |
| --- | --- | --- |
| no | 2.1 Undertake Systemic Impacts Mapping | internal |
| no | 2.2 Assess And Research Visitor Needs | internal |
| no | 2.3 Research Non-visitor's Needs | internal |
| no | 2.4 Consider Sustainability In Early Ideation | internal |
| no | 2.5 Account For Stakeholder Issues | internal |
| partial | 2.6 Create a Frictionless Lightweight Experience By Default | |
| partial | 2.7 Avoid Unnecessary Or An Overabundance Of Assets | editorial |
| partial | 2.8 Ensure Navigation And Way-finding Is Well-structured | |
| partial | 2.9 Respect The Visitor's Attention | |
| no | 2.10 Use Recognized Design Patterns | |
| partial | 2.11 Avoid Manipulative Patterns | |
| no | 2.12 Document And Share Project Outputs | internal |
| no | 2.13 Use A Design System To Prioritize Interface Consistency | internal |
| partial | 2.14 Write With Purpose, In An Accessible, Easy To Understand Format | |
| partial | 2.15 Take a More Sustainable Approach To Image Assets | editorial |
| partial | 2.16 Take a More Sustainable Approach To Media Assets | editorial |
| no | 2.17 Take a More Sustainable Approach To Animation | editorial |
| yes | 2.18 Take a More Sustainable Approach To Typefaces | |
| partial | 2.19 Provide Suitable Alternatives To Web Assets | |
| no | 2.20 Provide Accessible, Usable, Minimal Web Forms | |
| partial | 2.21 Support Non-Graphic Ways To Interact With Content | |
| no | 2.22 Give Useful Notifications To Improve The Visitor's Journey | |
| partial | 2.23 Reduce The Impact Of Downloadable Or Physical Documents | |
| no | 2.24 Create A Stakeholder-focused Testing & Prototyping Policy | internal |
| no | 2.25 Conduct Regular Audits, Regression, And Non-regression Tests | internal |
| no | 2.26 Analyze The Performance Of The Visitor Journey | internal |
| no | 2.27 Incorporate Value Testing Into Each Major Release-cycle | internal |
| no | 2.28 Incorporate Usability Testing Into Each Minor Release-cycle | internal |
| partial | 2.29 Incorporate Compatibility Testing Into Each Release-cycle | |
| no | 3.1 Identify Relevant Technical Indicators | editorial |
| yes | 3.2 Minify Your HTML, CSS, And JavaScript | |
| partial | 3.3 Use Code-splitting Within Projects | internal |
| no | 3.4 Apply Tree Shaking To Code | internal |
| partial | 3.5 Ensure Your Solutions Are Accessible | |
| no | 3.6 Avoid Code Duplication | internal |
| no | 3.7 Rigorously Assess Third-party Services | |
| partial | 3.8 Use HTML Elements Correctly | |
| yes | 3.9 Resolve Render Blocking Content | editorial |
| partial | 3.10 Provide Code-based Way-finding Mechanisms | semantics |
| partial | 3.11 Validate Form Errors And External Input | |
| partial | 3.12 Use Metadata Correctly | editorial, semantics |
| partial | 3.13 Adapt to User Preferences | editorial |
| partial | 3.14 Develop A Mobile-first Layout | |
| no | 3.15 Use Beneficial JavaScript And Its API's | |
| partial | 3.16 Ensure Your Scripts Are Secure | internal |
| no | 3.17 Manage Dependencies Appropriately | internal |
| yes | 3.18 Include Files That Are Automatically Expected | |
| yes | 3.19 Use Plaintext Formats When Appropriate | |
| no | 3.20 Avoid Using Deprecated Or Proprietary Code | internal |
| no | 3.21 Align Technical Requirements With Sustainability Goals | |
| yes | 3.22 Use The Latest Stable Language Version | internal |
| no | 3.23 Take Advantage Of Native Features | internal |
| no | 3.24 Run Fewer, Simpler Queries As Possible | internal |
| partial | 4.1 Choose A Sustainable Hosting Provider | |
| partial | 4.2 Optimize Browser Caching | editorial |
| yes | 4.3 Compress Your Files | |
| no | 4.4 Use Error Pages And Redirects Carefully | editorial |
| no | 4.5 Limit Usage Of Additional Environments | internal |
| no | 4.6 Automate To Fit The Needs | internal |
| no | 4.7 Maintain a Relevant Refresh Frequency | editorial |
| no | 4.8 Be Mindful Of Duplicate Data | internal, editorial |
| no | 4.9 Enable Asynchronous Processing And Communication | editorial |
| yes | 4.10 Use Edge Computing | |
| no | 4.11 Use The Lowest Infrastructure Tier Meeting Business Requirements | internal |
| no | 4.12 Store Data According To Visitor Needs | internal, editorial |
| no | 5.1 Have An Ethical And Sustainability Product Strategy | internal |
| no | 5.2 Assign A Sustainability Representative | internal |
| no | 5.3 Raise Awareness And Inform | internal |
| no | 5.4 Communicate The Ecological Impact Of User Choices | internal |
| partial | 5.5 Estimate A Product Or Service's Environmental Impact | internal |
| no | 5.6 Define Clear Organizational Sustainability Goals And Metrics | internal |
| no | 5.7 Verify Your Efforts Using Established Third-party Business Certifications | internal |
| no | 5.8 Implement Sustainability Onboarding Guidelines | internal |
| no | 5.9 Support Mandatory Disclosures And Reporting | internal |
| no | 5.10 Create One Or More Impact Business Models | internal |
| no | 5.11 Follow A Product Management And Maintenance Strategy | internal |
| no | 5.12 Implement Continuous Improvement Procedures | internal |
| no | 5.13 Document Future Updates And Evolutions | internal |
| no | 5.14 Establish If A Digital Product Or Service Is Necessary | internal |
| no | 5.15 Determine The Functional Unit | internal |
| no | 5.16 Create A Supplier Standards Of Practice | internal |
| no | 5.17 Share Economic Benefits | internal |
| no | 5.18 Share Decision-making Power With Appropriate Stakeholders | internal |
| no | 5.19 Use Justice, Equity, Diversity, Inclusion (JEDI) Practices | internal |
| no | 5.20 Promote Responsible Data Practices | internal |
| no | 5.21 Implement Appropriate Data Management Procedures | internal |
| no | 5.22 Promote Responsible Emerging Technology Practices | internal |
| no | 5.23 Include Responsible Financial Policies | internal |
| no | 5.24 Include Organizational Philanthropy Policies | internal |
| no | 5.25 Plan For A Digital Product Or Service's Care And End-Of-Life | internal |
| no | 5.26 Include E-waste, Right-to-repair, And Recycling Policies | internal |
| partial | 5.27 Define Performance And Environmental Budgets | internal |
| no | 5.28 Use Open Source Tools | internal |

The meaning of each hint used in the table follows.

AlexDawsonUK commented 1 year ago

Thanks @marvil07-adapt for making a first attempt to identify which guidelines can be tested against. We will definitely be adapting this into the version that ends up in the specification.

As you mentioned, tooling support would be incredibly useful and it's on our roadmap for inclusion. Aside from this issue (labelling) and #11 (which you mentioned, covering a test suite for implementations & techniques), we will also be producing auditing guidance (#28) and implementation guidance for tooling & user-agents (#22). Hopefully this will arrive as soon as possible, but it may take a draft or two as we're attempting to connect it all together (for cohesiveness).

thibaudcolas commented 1 year ago

I’ve done a very similar assessment :) Since I get to go second, I thought I’d share mine along with a comparison table. I chose to do this at the level of Success Criteria, which makes for a very long table, so I put the tables in a gist:

My classification

Here is how I personally rated the SCs (and how many SCs I found for each rating):

  1. Static analysis (6): Potential to write automated code checks that would run in CI / developer IDEs. Example: jsx-a11y/alt-text (see the sketch after this list)
  2. Automated (35): "Runtime" analysis – potential to inspect the product with automated browsing or equivalent and detect issues. Example: Axe
  3. Manual, quantitative (16): Likely manual auditing but with potential to follow a set scoring algorithm. Possible to create semi-automated tools to help with auditing. Example: Tab stops testing
  4. Manual, qualitative (50): Manual auditing with an element of interpretation. Can be done with publicly available information, reproducibility of findings a possible concern. Example: WCAG SC 3.2.4 Consistent Identification
  5. Consulting (125): Manual auditing requiring interpretation and internal knowledge of the project / organisation. Cannot be audited without behind-the-scenes access.
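
To illustrate the first category, here is a minimal sketch of a static check in the spirit of the jsx-a11y/alt-text example: scanning committed HTML files for `<img>` elements without an alt attribute. The `src/` layout is an assumption about the project being checked, and this is not how any of the tools named above actually work.

```python
# Hypothetical static-analysis check: report <img> elements missing an alt attribute
# in committed HTML files. Intended as a CI-style lint, analogous to jsx-a11y/alt-text.
from html.parser import HTMLParser
from pathlib import Path

class MissingAltChecker(HTMLParser):
    def __init__(self) -> None:
        super().__init__()
        self.violations = 0

    def handle_starttag(self, tag, attrs):
        if tag == "img" and "alt" not in dict(attrs):
            self.violations += 1
            print(f"  <img> missing alt near line {self.getpos()[0]}")

total = 0
for path in Path("src").rglob("*.html"):  # "src" is a placeholder source directory
    print(path)
    checker = MissingAltChecker()
    checker.feed(path.read_text(encoding="utf-8"))
    total += checker.violations

print(f"{total} violation(s) found")
```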

Comparison with @marvil07-adapt

After converting this classification to the one by @marvil07-adapt, here is how we differ in the comparison table:

Mapping from my classification to yes/no/partial (internal). I also used "partial" if a guideline was a mixture of "yes", "no", or "partial".

Differences

| @marvil07-adapt | @thibaudcolas | Guideline |
| --- | --- | --- |
| no – internal | yes | 3.4 Apply Tree Shaking To Code |
| no – internal | yes | 3.17 Manage Dependencies Appropriately |
| no – internal | yes | 3.20 Avoid Using Deprecated Or Proprietary Code |

Here are my thoughts on the three guidelines where we differ:

Partial matches

Note that when I reviewed the potential for automation, my focus was primarily on auditing a website or app with no internal knowledge. So I rated most SCs that require internal knowledge as "consulting" / "no – internal", even if there could be automation. There are a few exceptions, such as 3.22 Use The Latest Stable Language Version.

I didn’t assess our differences here in much detail. At a high level:

| @marvil07-adapt | @thibaudcolas | Guideline |
| --- | --- | --- |
| partial – editorial | no | 2.7 Avoid Unnecessary Or An Overabundance Of Assets |
| partial | no | 2.8 Ensure Navigation And Way-finding Is Well-structured |
| partial | no – internal | 2.9 Respect The Visitor's Attention |
| partial | no | 2.11 Avoid Manipulative Patterns |
| partial | no | 2.21 Support Non-Graphic Ways To Interact With Content |
| partial – internal | yes | 3.3 Use Code-splitting Within Projects |
| partial | yes | 3.8 Use HTML Elements Correctly |
| yes – editorial | partial | 3.9 Resolve Render Blocking Content |
| partial – editorial, semantics | yes | 3.12 Use Metadata Correctly |
| partial – editorial | yes | 3.13 Adapt to User Preferences |
| partial | no | 3.14 Develop A Mobile-first Layout |
| partial – internal | yes | 3.16 Ensure Your Scripts Are Secure |
| partial – editorial | yes | 4.2 Optimize Browser Caching |
| no – editorial | partial | 4.4 Use Error Pages And Redirects Carefully |
| no – editorial | partial | 4.7 Maintain a Relevant Refresh Frequency |
| no – editorial | partial | 4.9 Enable Asynchronous Processing And Communication |
| yes | partial | 4.10 Use Edge Computing |
| partial – internal | no – internal | 5.5 Estimate A Product Or Service's Environmental Impact |
| no – internal | partial | 5.19 Use Justice, Equity, Diversity, Inclusion (JEDI) Practices |
| no – internal | partial | 5.22 Promote Responsible Emerging Technology Practices |
| partial – internal | no – internal | 5.27 Define Performance And Environmental Budgets |

AlexDawsonUK commented 1 year ago

This is great stuff, thanks for putting in all the hard work! It will certainly help guide us alongside the other testability criteria we are producing to help make the specification more robust.

airbr commented 1 year ago

Just a comment to say thanks for all the good work @thibaudcolas - it immediately raised an important general question for me: "Which of these guidelines need special access to work on?" - or, as you describe it, internal knowledge. It is a good contextual question: is this for people with special access or internal knowledge, or potentially for anyone?

This is an important consideration as the guidelines move into more use cases and wider adoption. Thanks for bringing it to the forefront.

AlexDawsonUK commented 10 months ago

Once STAG reaches a settled state (and the testability component is verifiable), this issue will progress with #11.

mgifford commented 9 months ago

Just updating the link from STAG to STAR.

AlexDawsonUK commented 8 months ago

I've taken the above information and fed it into a spreadsheet, also utilizing data from EcoGrader (who are likewise seeking to machine-test the WSGs within their product suite), and finally used my own interpretation of the spec (which I'll be feeding into STAR Techniques and the upcoming test suite) to come up with some potential ways of testing for compliance. A rationale is provided where an SC cannot be tested, and where testing is possible, an example is given for criteria purposes.

Note: Because testability is linked to Techniques, I've assumed (for simplicity) that internal access will be available; where testing is possible under that assumption, testability can be considered true in those cases, making it at least a partial pass. As has been mentioned, this will not always be the case, so where such access is required it will be noted within the relevant tests and can be marked as such (with a caution note).

Source: Testability. Feedback is useful, as is further conversation on this topic. There will be more than one way of interpreting a Success Criterion as machine testable (as there is with WCAG Techniques), so consider this a starting point.
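
As an example of the kind of machine test meant here, below is a rough sketch for 3.9 Resolve Render Blocking Content: fetch a page and flag external scripts in the `<head>` that are neither `defer` nor `async`, plus stylesheets loaded without a non-blocking media query. The URL is a placeholder and the heuristics are my own simplification, not the wording of the STAR technique.

```python
# Illustrative sketch only: heuristically flag render-blocking resources in a page's <head>.
# Real STAR techniques / audit tooling may define "render blocking" differently.
import requests
from bs4 import BeautifulSoup

def render_blocking_resources(url: str) -> list[str]:
    soup = BeautifulSoup(requests.get(url).text, "html.parser")
    head = soup.head or soup
    findings = []
    # External scripts in <head> without defer/async block HTML parsing.
    for script in head.find_all("script", src=True):
        if not (script.has_attr("defer") or script.has_attr("async")):
            findings.append(f"blocking script: {script['src']}")
    # Stylesheets without a print/conditional media query block first render.
    for link in head.find_all("link", rel="stylesheet"):
        if link.get("media", "all") in ("all", "screen"):
            findings.append(f"blocking stylesheet: {link.get('href')}")
    return findings

print(render_blocking_resources("https://example.com/") or "No render-blocking resources found")
```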

AlexDawsonUK commented 7 months ago

It's taken a lot of work, but we now have Manual vs Automated testing available within the specification. It currently exists within the living draft and will be published in this month's scheduled release.

The testability content noted above was derived from multiple sources, including contributors here and my spreadsheet, and it has all been compiled into a set of machine-testable techniques (now available in STAR).

With this in mind, the techniques that could be built (i.e. that are testable) have been cross-linked into the main specification and serve as evidence of one possible way toolmakers and others could approach them. As this task is now considered complete (though more cases can be added to enhance STAR in the future), I'll close this issue as a completed feature.

Note: In this example, you can see a mixture of testable and non-testable criteria (the Success Criteria indicate which is which). Where criteria can be tested, links to STAR techniques are provided as citations.