Closed rviscomi closed 3 years ago
@scottdavis99 @ericwbailey @oluoluoxenfree @alextait1 @schachin Moved scheduling this chat to an email I just sent. Faster to go back and forth with times there.
@alextait1 can you edit the top comment and put @schachin in the proper role?
@schachin Can you shoot an email to david@davidjfox.com so I can add you to the email thread I just sent?
Sorry I missed this -- will do what you requested now :)
More Info & Publications
Contact Info
Helping You Make It Better by Making It Work. Recommendations and References available on LinkedIn or by request. Client information is generally protected by NDA and not typically available on public sites.
Hi! Question. (not sure if you are the right one to send this to) Is there a way to also be a reviewer on the SEO Section -- there is so much bad info out there right now my fear is it will get repeated in the SEO section.
To note, my background:
- 16 years of SEO experience
- 20 years of web experience (design and dev)
- Written over 125 articles on Search and Google
- Spoken at over 80 conferences in the US and internationally, including SXSWi
Our industry just suffers from a lot of people competing to be rockstars right now, so they put out info that is not well researched. I just want to review to make sure we do not hurt site owners by letting that make it into this.
Because in the end SEO done wrong can put a company out of business.
Thank you! Kristine
On Wednesday, May 12, 2021, 2:47:05 PM PDT, Rick Viscomi ***@***.***> wrote:
@schachin oh I think if you're only reading the email thread you don't have the full context. Visit #2147 (comment) to see the whole GitHub issue; the top comment explains how to contribute in more detail. And here's the post that kicks off the project. For context I got your info from the 2021 Web Almanac interest form where you indicated that you're a SME in accessibility and open to authoring or reviewing. You also mentioned SEO, but that chapter has more than enough contributors at the moment!
@schachin That'd be a question for the SEO team lead Patrick Stox. Here's their GitHub tracking issue where you can ask: https://github.com/HTTPArchive/almanac.httparchive.org/issues/2146
Looks like they may be full with reviewers (13 signed up already). But can always reach out and ask 😃
Oh cool, I know Patrick so will ask him directly. At least I feel better knowing he is running that.
@alextait1 @scottdavis99 @oluoluoxenfree @ericwbailey @clottman @shantsis @digitala11ies
Hey everyone, looks like there's been a lot of progress on creating the chapter outline which is awesome!
We should be in great shape to complete it by June 15 so we have enough time to update our crawler with any additional metrics you need this year.
Also a reminder that the team has a channel on Slack (#web-almanac-a11y), so feel free to join the discussion there as well: https://join.slack.com/t/httparchive/shared_invite/zt-45sgwmnb-eDEatOhqssqNAKxxOSLAaA
If you have any other questions don't hesitate to reach out :)
Hi @alextait1 and team,
Just wanted to let you know that we added detection for various accessibility overlay type solutions to Wappalyzer last year (https://github.com/AliasIO/wappalyzer/issues/3228). If you know of any more that you want reported, you should extend the Wappalyzer detection logic.
Cheers, Rockey
@rockeynebhwani thanks so much we do plan to talk about overlays so I'll check it out!
@alextait1 @scottdavis99 @oluoluoxenfree @ericwbailey @clottman @shantsis @digitala11ies the high level overview is up and ready for review and additions from authors. I'm hoping to start going a level deeper towards the metrics this coming week so take a peek if you can. Authors especially, I want to be sure you've weighed in 🙏🏼
@alextait1 - FYI.. I briefly covered this in eCommerce chapter last year - https://almanac.httparchive.org/en/2020/ecommerce#accessibility-solutions
Yes we touched on it as well but in a more qualitative way, not with metrics. We're going to do a deeper dive this year for sure. Thank you for the support!
@scottdavis99 @oluoluoxenfree @ericwbailey @clottman @shantsis @digitala11ies @obto
Thanks for the comments on the doc, I've made some updates. Authors - I hope you've had a chance to take a look, didn't see any comments from you so hopefully no news is good news 😂
And with that milestone 0 and 1 are checked off!
David - if the metrics requests need clarifying let us know
I was reminded again of the fact that sites which use ARIA often have more accessibility issues than sites that don't. I've never particularly liked that stat, as correlation does not imply causation. If you are a really complex site and use a Bootstrap component with one ARIA label (even if that's a valid use), but have lots of other errors due to the fact it's a complex site, then suddenly you're dragging down the stats - but more because you're a complex site than because of the use of ARIA.
I think this originally came from the WebAIM survey which does clarify it more:
ARIA correlated to higher detectable errors. The more ARIA attributes that were present, the more detectable accessibility errors could be expected. This does not necessarily mean that ARIA introduced these errors (these pages are more complex), but pages typically had more errors when ARIA was present.
The second part of that tends to be forgotten (it doesn't help that the first part is in bold!). While "no ARIA is better than bad ARIA", I wonder should we dig into it more?
Should we consider looking at slicing and dicing how different sites/technologies are accessible? I'm thinking we could use the Lighthouse Accessibility score and measure how this changes depending on:
- .gov in the URL, as many government sites have higher accessibility standards (USA and UK for one I think, but you might know better than me).
I know Lighthouse isn't perfect and a 100 Accessibility score doesn't imply a perfectly accessible site, but < 100 strongly implies a less than accessible site, and as a simple measure of "how accessible a website is" I think it's a decent enough metric to report on the above to give a high level summary.
I don't think we'd need any new custom metrics for this as we have all of the above info. Just a matter of writing some more queries based on the available data (you're welcome @obto! 😁) . And if the data show nothing of interest and no correlation then we just drop it.
Interested to hear your thoughts on whether this is worthwhile and if you have any other stats to add to above?
While I am personally super curious about some of your ideas about how to peer into this data further, I'm unsure if clarification would do what we intend it to. I would hate to inadvertently communicate that it's acceptable to use ARIA not as a last resort and without testing.
I oftentimes find that presenting this kind of info gives someone who has already made up their mind, or who has already committed code a way to back up their decision.
I'm not talking about using this to justify the use of ARIA (though I admit it was that quote that started off my thinking here), but more drilling into what types of pages/sites are more or less accessible. ARIA use is just one measure of that (and a bad one IMHO for the reasons I gave above).
Actually I see some of the ideas have been covered by WebAIM Million report: https://webaim.org/projects/million/#technologies
Would definitely be interested in seeing different rates of accessibility in top 1000, 10k, 100k, million and all websites.
And the great thing about the Web Almanac is we don't just present the stats but have experts giving their interpretation of what that means. So can hopefully at least somewhat address your concerns with that.
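As a rough illustration of the popularity slicing suggested above, here is a minimal sketch that assigns a site to the narrowest rank group that contains it. The `rankBucket` helper, the bucket names, and the choice of non-overlapping buckets are all assumptions made for illustration. Cumulative groups (where the top 10k includes the top 1k) would be an equally reasonable reading.

```javascript
// Hypothetical bucket boundaries, following the groupings mentioned above.
const BUCKETS = [
  { name: 'top1k', max: 1000 },
  { name: 'top10k', max: 10000 },
  { name: 'top100k', max: 100000 },
  { name: 'top1m', max: 1000000 },
];

// Returns the narrowest bucket containing the rank, 'all' for anything
// ranked beyond the last boundary, and 'unranked' when no usable rank exists.
function rankBucket(rank) {
  if (!Number.isFinite(rank) || rank < 1) return 'unranked';
  const bucket = BUCKETS.find((b) => rank <= b.max);
  return bucket ? bucket.name : 'all';
}
```

With non-overlapping buckets each site is counted once, so the group sizes can be summed without double counting.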
OK curiosity got the better of me so I ran some stats:
category | percentile | all_sites | uses_aria | accessibe | top1k | top10k | top100k | top1m |
---|---|---|---|---|---|---|---|---|
num_sites | | 7,150,239 | 5,134,088 | 417 | 863 | 7,768 | 79,014 | 782,451 |
num_sites_pct | | 100% | 72% | 0.006% | 0.012% | 0.109% | 1.105% | 10.943% |
percentile | 10 | 0.61 | 0.7 | 0.7 | 0.6 | 0.6 | 0.61 | 0.61 |
percentile | 25 | 0.73 | 0.79 | 0.77 | 0.72 | 0.71 | 0.72 | 0.73 |
percentile | 50 | 0.83 | 0.86 | 0.84 | 0.82 | 0.81 | 0.82 | 0.83 |
percentile | 75 | 0.91 | 0.92 | 0.91 | 0.91 | 0.9 | 0.9 | 0.9 |
percentile | 90 | 0.96 | 0.96 | 0.96 | 0.97 | 0.96 | 0.96 | 0.96 |
percentile | 95 | 0.97 | 0.98 | 0.97 | 0.98 | 0.98 | 0.98 | 0.98 |
percentile | 99 | 1 | 1 | 0.98 | 1 | 1 | 1 | 1 |
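Here's some takeaways I see from this:
- I don't see the same findings as WebAIM that accessibility is worse when ARIA is used - sites using ARIA tend to be slightly more accessible as far as the Lighthouse score measures this. And it IS used a lot - by 72% of our sites! This may be due to the limited audits that Lighthouse, and the underlying axe library, perform (I find they tend to use only audits that are less susceptible to noise).
- There is no "Accessibility Overlays" category in Wappalyzer but looking at accessiBe as a well-known one, it does seem to improve the Accessibility score for sites that use it at the lower percentiles (the easy wins?). As well as the other criticism the Accessibility community has for these, I think there's a genuine question of whether these are worth it if it only increases the 50th percentile by a single Lighthouse point. It should be noted though that at 417 sites it's a VERY small sample size, so not sure how much we can really read into that. Another interesting point though is that even at the 99th percentile they don't hit the top Lighthouse Accessibility score (which I honestly think is quite an achievable score!)
- Disappointingly there seems to be no correlation between site popularity and Lighthouse Accessibility score 😢 I had hoped that more popular sites, presumably with more resources to look after their website, would have better scores. If anything the opposite appears to be true! Personally I think that in and of itself is an interesting stat!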
Of course it should be remembered that Lighthouse Accessibility checks are limited and a high score does not indicate a site is accessible – though I do usually find the opposite is true (i.e. a low score indicates a site is usually at least partially inaccessible), so think there is still value in looking at this as a broad indicator of how accessible/inaccessible a website is when dealing at the scale we deal with.
Anyway, satisfied my own curiosity and so will bow out again now and leave the chapter team to decide if they want to include any of this type of info in the chapter.
SQL Query below. It uses 3TB at a cost of $15 and takes a good 15 mins to run! - I'm sure it can be improved but just something I knocked together to see if this was worth exploring further.
#standardSQL
CREATE TEMPORARY FUNCTION usesAriaAttributes(payload STRING)
RETURNS BOOL LANGUAGE js AS '''
try {
const almanac = JSON.parse(payload);
const containsAria = (element) => element.includes('aria') === true;
return Object.keys(almanac.attributes_used_on_elements).some(containsAria)
} catch (e) {
return false
}
''';
WITH lighthouse_scores AS (
SELECT url,
CAST(JSON_EXTRACT(report, '$.categories.accessibility.score') AS NUMERIC) AS accessibility
FROM
#`httparchive.sample_data.lighthouse_mobile_10k`
`httparchive.lighthouse.2021_05_01_mobile`
WHERE JSON_EXTRACT(report, '$.categories.accessibility.score') IS NOT NULL
),
all_sites AS (
SELECT
COUNT(0) AS all_sites_num_sites,
percentile,
APPROX_QUANTILES(accessibility, 1000)[OFFSET(percentile * 10)] AS all_sites_score
FROM (
SELECT
accessibility
FROM
lighthouse_scores),
UNNEST([10, 25, 50, 75, 90, 95, 99]) AS percentile
GROUP BY
percentile
),
uses_aria AS (
SELECT
COUNT(0) AS uses_aria_num_sites,
percentile,
APPROX_QUANTILES(accessibility, 1000)[OFFSET(percentile * 10)] AS uses_aria_score
FROM (
SELECT
accessibility
FROM
lighthouse_scores
JOIN
#`httparchive.sample_data.pages_mobile_10k`
`httparchive.pages.2021_05_01_mobile`
USING (url)
WHERE
usesAriaAttributes(JSON_EXTRACT_SCALAR(payload, '$._almanac'))
),
UNNEST([10, 25, 50, 75, 90, 95, 99]) AS percentile
GROUP BY
percentile
),
uses_accessibe AS (
SELECT
COUNT(0) AS uses_accessibe_num_sites,
percentile,
APPROX_QUANTILES(accessibility, 1000)[OFFSET(percentile * 10)] AS uses_accessibe_score
FROM (
SELECT
accessibility
FROM
lighthouse_scores
JOIN
#`httparchive.sample_data.technologies_mobile_10k`
`httparchive.technologies.2021_05_01_mobile`
USING (url)
WHERE
category = 'Accessibility' AND
APP = 'AccessiBe'
),
UNNEST([10, 25, 50, 75, 90, 95, 99]) AS percentile
GROUP BY
percentile
),
ranking AS (
SELECT DISTINCT
origin || '/' AS url,
experimental.popularity.rank AS rank
FROM
`chrome-ux-report.all.202105`
),
top1k AS (
SELECT
COUNT(0) AS top1k_num_sites,
percentile,
APPROX_QUANTILES(accessibility, 1000)[OFFSET(percentile * 10)] AS top1k_score
FROM (
SELECT
accessibility
FROM
lighthouse_scores
JOIN
ranking
USING (url)
WHERE
rank <= 1000
),
UNNEST([10, 25, 50, 75, 90, 95, 99]) AS percentile
GROUP BY
percentile
),
top10k AS (
SELECT
COUNT(0) AS top10k_num_sites,
percentile,
APPROX_QUANTILES(accessibility, 1000)[OFFSET(percentile * 10)] AS top10k_score
FROM (
SELECT
accessibility
FROM
lighthouse_scores
JOIN
ranking
USING (url)
WHERE
rank > 1000 AND
rank <= 10000
),
UNNEST([10, 25, 50, 75, 90, 95, 99]) AS percentile
GROUP BY
percentile
),
top100k AS (
SELECT
COUNT(0) AS top100k_num_sites,
percentile,
APPROX_QUANTILES(accessibility, 1000)[OFFSET(percentile * 10)] AS top100k_score
FROM (
SELECT
accessibility
FROM
lighthouse_scores
JOIN
ranking
USING (url)
WHERE
rank > 10000 AND
rank <= 100000
),
UNNEST([10, 25, 50, 75, 90, 95, 99]) AS percentile
GROUP BY
percentile
),
top1m AS (
SELECT
COUNT(0) AS top1m_num_sites,
percentile,
APPROX_QUANTILES(accessibility, 1000)[OFFSET(percentile * 10)] AS top1m_score
FROM (
SELECT
accessibility
FROM
lighthouse_scores
JOIN
ranking
USING (url)
WHERE
rank > 100000 AND
rank <= 1000000
),
UNNEST([10, 25, 50, 75, 90, 95, 99]) AS percentile
GROUP BY
percentile
),
results AS (
SELECT
all_sites_num_sites,
uses_aria_num_sites,
uses_accessibe_num_sites,
top1k_num_sites,
top10k_num_sites,
top100k_num_sites,
top1m_num_sites,
percentile,
all_sites_score,
uses_aria_score,
uses_accessibe_score,
top1k_score,
top10k_score,
top100k_score,
top1m_score
FROM
all_sites
JOIN
uses_aria
USING
(percentile)
JOIN
uses_accessibe
USING
(percentile)
JOIN
top1k
USING
(percentile)
JOIN
top10k
USING
(percentile)
JOIN
top100k
USING
(percentile)
JOIN
top1m
USING
(percentile)
)
SELECT
'num_sites' as category,
NULL as percentile,
MAX(all_sites_num_sites) AS all_sites,
MAX(uses_aria_num_sites) AS uses_aria,
MAX(uses_accessibe_num_sites) AS accessibe,
MAX(top1k_num_sites) AS top1k,
MAX(top10k_num_sites) AS top10k,
MAX(top100k_num_sites) AS top100k,
MAX(top1m_num_sites) AS top1m
FROM
results
UNION ALL
SELECT
'num_sites_pct' as category,
NULL as percentile,
MAX(all_sites_num_sites)/MAX(all_sites_num_sites),
MAX(uses_aria_num_sites)/MAX(all_sites_num_sites),
MAX(uses_accessibe_num_sites)/MAX(all_sites_num_sites),
MAX(top1k_num_sites)/MAX(all_sites_num_sites),
MAX(top10k_num_sites)/MAX(all_sites_num_sites),
MAX(top100k_num_sites)/MAX(all_sites_num_sites),
MAX(top1m_num_sites)/MAX(all_sites_num_sites)
FROM
results
UNION ALL
SELECT
'percentile' as category,
percentile,
all_sites_score,
uses_aria_score,
uses_accessibe_score,
top1k_score,
top10k_score,
top100k_score,
top1m_score
FROM
results
ORDER BY
category,
percentile
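One detail of the `usesAriaAttributes` function in the query above may be worth a second look: `includes('aria')` matches the substring anywhere in a key, so a name such as `data-variant` also counts as ARIA. The sketch below compares that check with a stricter prefix test. It assumes the keys of `attributes_used_on_elements` are attribute names. `isAriaAttribute` and the sample names are hypothetical, and whether this would noticeably change the 72% figure is not known.

```javascript
// Same check as the UDF: substring match anywhere in the attribute name.
const containsAria = (name) => name.includes('aria');

// Hypothetical stricter variant: only names that start with the "aria-" prefix.
const isAriaAttribute = (name) => name.toLowerCase().startsWith('aria-');

// Illustrative attribute names, not taken from the dataset.
const attributes = ['aria-label', 'aria-hidden', 'data-variant', 'class'];

const looseMatches = attributes.filter(containsAria);
const strictMatches = attributes.filter(isAriaAttribute);

console.log(looseMatches);  // includes 'data-variant' as well as the aria-* names
console.log(strictMatches); // only the aria-* names
```

If the stricter test were preferred, it could replace `containsAria` inside the UDF body without changing the rest of the query.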
Thanks for taking the time to do all that, Barry!
I'm definitely mega hesitant to associate ARIA usage with being good vs bad. While only an anecdotal experience, I've faced a lot of people using properties like role="presentation", role="application" and aria-hidden="true" in really damaging ways that were only discovered in manual testing, which makes me hesitant to comment on ARIA this way.
Yeah as I say I’m not saying ARIA means it’s good and there’s definitely pitfalls to it and shouldn’t be used unless necessary. I just didn’t like that statement’s implication that it’s bad either (without delving further into why their stats showed that), and thought worth digging into it more to see if we could replicate their findings and dig into why that was the case more.
ARIA is necessary in a lot of cases! But it’s a tool like any other so needs to be used in right way and can be used in wrong way.
But ultimately with these stats I’m not trying to recommend anything - I’m just trying to report on the state of the web and see what it tells us.
Many websites are simple and don’t require ARIA. Many are complicated and so do. And more complicated websites are way more likely to have at least one accessibility issue than simple ones. So in many ways not surprising that usage of ARIA would lead to at least one accessibility issue more often than those sites that don’t use it.
But still I think it’s interesting that 72% of websites have at least one ARIA attribute (way more than I thought would!) and also that, by Lighthouse Accessibility score at least, those websites do tend to look to be more accessible.
I'm going to think on this more but I tend to agree with @digitala11ies. I think it's risky business equating the presence of ARIA with more or less overall accessibility, as it's so often misused and I wouldn't want anyone to take away "use more ARIA" from our report, especially since that's in conflict with the first rule of ARIA (https://www.w3.org/TR/using-aria/#rule1). I do think it's interesting to highlight the high rate of ARIA use; it shows that people are at least considering whether they should use it, or are leveraging libraries with ARIA incorporated.
Yes - thanks for putting that more eloquently than I could manage, Alex! I think there's still value in talking about it --I just want to be careful in the associations we make.
Yup that's fair enough. Understand your concerns.
But what do you think about slicing and dicing the "accessibility score" in other ways? Or do you have similar concerns about generalising there? I thought the lack of correlation in site popularity was interesting for example.
And I reran the stats for *.gov.uk/ URLs, *.gov/ URLs and any URL with (.gov. or .gov/) in them, and it certainly looks like the UK and US government sites are doing a better job than the majority of the web (yeaah!):
category | percentile | all_sites | uk_gov | us_gov | all_gov |
---|---|---|---|---|---|
num_sites | | 7,150,239 | 2,569 | 13,612 | 71,744 |
num_sites_pct | | 100% | 0.036% | 0.190% | 1.003% |
percentile | 10 | 0.61 | 0.81 | 0.7 | 0.6 |
percentile | 25 | 0.73 | 0.88 | 0.81 | 0.73 |
percentile | 50 | 0.83 | 0.96 | 0.89 | 0.83 |
percentile | 75 | 0.91 | 0.99 | 0.95 | 0.91 |
percentile | 90 | 0.96 | 1 | 0.98 | 0.97 |
percentile | 95 | 0.97 | 1 | 1 | 0.98 |
percentile | 99 | 1 | 1 | 1 | 1 |
Depressingly however, more general .gov websites just mirror the whole dataset so are no better than average 😞
Anyway, if you think there's any merit in this approach, then have a think if there's any other sort of slicing and dicing you think we could do here to reveal interesting insights. And, as always, your expertise in adding colour to what any of the stats show is important here.
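For anyone wanting to reproduce the three government groupings, the URL patterns described above could be expressed roughly as follows. This is only a sketch: the query actually used for these numbers is not shown in this thread, so the regular expressions and the `classifyGovUrl` helper are assumptions based on the patterns as written.

```javascript
// Hypothetical patterns mirroring the three groupings described above.
const UK_GOV = /\.gov\.uk\/$/; // URLs ending in ".gov.uk/"
const US_GOV = /\.gov\/$/;     // URLs ending in ".gov/"
const ALL_GOV = /\.gov[./]/;   // URLs containing ".gov." or ".gov/"

// Reports which of the three groups a home page URL would fall into.
function classifyGovUrl(url) {
  return {
    ukGov: UK_GOV.test(url),
    usGov: US_GOV.test(url),
    allGov: ALL_GOV.test(url),
  };
}
```

Under these patterns the broad group is a superset of the UK and US groups, and it also picks up other countries' government domains such as `.gov.au`.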
@tunetheweb ooo that's very interesting data about the government sites, I do want to think more on this! Thanks for surfacing these ideas 😎
I was part of the lead team for the GSA back in 2011-2013 -- goal was to add WCAG to all GSA controlled government sites but they had a lot of work to do on just getting them all up to spec first. USA.gov though was the lead site and it was WCAG back then, BEFORE they required it at the Federal level. So the government sites have paid a lot more attention at a Federal level at least.
@alextait1
Data for all but 4 queries (the ones looking into the CSS) are completed and have their data input in our sheet: https://docs.google.com/spreadsheets/d/1WjAM5ZnHjMQt-rKyHvj2eVhU_WdzzFTjpoYWMr_I0Cw/edit#gid=150155313
Visualizations will be added next week along with comments for how to read the data.
The explanations for 90% of the queries are the same as last year's, so please refer to last year's sheet for explanations until I'm able to add them directly to our 2021 sheet.
@alextait1 The spreadsheet has been updated with comments. Let me know if you have any questions!
@obto thanks so much, I'll be taking a look on Friday this week!
@alextait1 @scottdavis99 @oluoluoxenfree @ericwbailey @clottman @shantsis @digitala11ies @obto
🎉 This chapter is fully written, reviewed, edited, and ready to be launched on Wednesday! Thank you to all of the contributors who put in the time and effort to make this a great chapter.
When you get 5 minutes, I'd really appreciate if you could fill out our contributor survey to tell us (the project leads) about your experience. It's super helpful to hear what went well or what could be improved for next time. 🙏
Congratulations and thank you all again. I'm excited for this to launch soon!
Part II Chapter 9: Accessibility
If you're interested in contributing to the Accessibility chapter of the 2021 Web Almanac, please reply to this issue and indicate which role or roles best fit your interest and availability: author, reviewer, analyst, and/or editor.
Content team
Expand for more information about each role
- The **[content team lead](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Content-Team-Leads'-Guide)** is the chapter owner and responsible for setting the scope of the chapter and managing contributors' day-to-day progress.
- **[Authors](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Authors'-Guide)** are subject matter experts and lead the content direction for each chapter. Chapters typically have one or two authors. Authors are responsible for planning the outline of the chapter, analyzing stats and trends, and writing the annual report.
- **[Reviewers](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Reviewers'-Guide)** are also subject matter experts and assist authors with technical reviews during the planning, analyzing, and writing phases.
- **[Analysts](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Analysts'-Guide)** are responsible for researching the stats and trends used throughout the Almanac. Analysts work closely with authors and reviewers during the planning phase to give direction on the types of stats that are possible from the dataset, and during the analyzing/writing phases to ensure that the stats are used correctly.
- **[Editors](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Editors'-Guide)** are technical writers who have a penchant for both technical and non-technical content correctness. Editors have a mastery of the English language and work closely with authors to help wordsmith content and ensure that everything fits together as a cohesive unit.
- The **[section coordinator](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Section-Leads'-Guide)** is the overall owner for all chapters within a section like "User Experience" or "Page Content" and helps to keep each chapter on schedule.
_Note: The time commitment for each role varies by the chapter's scope and complexity as well as the number of contributors._ For an overview of how the roles work together at each phase of the project, see the [Chapter Lifecycle](https://github.com/HTTPArchive/almanac.httparchive.org/wiki/Chapter-Lifecycle) doc.
Milestone checklist
0. Form the content team
1. Plan content
2. Gather data
3. Validate results
4. Draft content
5. Publication
Chapter resources
Refer to these 2021 Accessibility resources throughout the content creation process:
📄 Google Docs for outlining and drafting content
🔍 SQL files for committing the queries used during analysis
📊 Google Sheets for saving the results of queries
📝 Markdown file for publishing content and managing public metadata