Open iansan5653 opened 4 months ago
I had the exact same idea today and am glad to find your thorough writeup of it.
To me the TL;DR is: While a picture is worth a thousand words, there is still value in generating the thousand words as well.
A non-navigable description could also be helpful for use cases where the final diagram is exported as a PNG. In this case the description could start with the most important parts of the diagram and go into more detail later. E.g.:
I will follow this conversation and would like to help implement it.
@iansan5653 thank you for explaining the issue and all the possible solutions. I don't have much experience with accessibility, but I'm genuinely interested in helping wherever possible! I don't have thoughts on the problems or possible solutions yet, but I will add some points we should consider while exploring our options.
If we can decide on an approach and if you are open to external contributions, this might be something I'd be willing to work on if I can find the time.
We'd love to have contributions from the community, especially from an A11Y expert like you!
A non-navigable description could also be helpful for use cases where the final diagram is exported as a PNG.
This is a great point. Is rendering as a png something that Mermaid natively supports? Or are you thinking more about users exporting screenshots of diagrams?
Our system should be easy enough to allow people to add new diagrams without much trouble
:100: I absolutely agree. I think the ideal solution is one that makes all existing and future diagrams accessible without any extra effort or configuration on the consumer's part.
How we handle SVGs that are not generated inside mermaid. (d3-sankey comes to mind)
Thanks for pointing this out! I hadn't realized there are diagrams for which Mermaid doesn't have complete control over the output. Do you know if there are any other diagram types like this?
If we were to decide to go with the "Accessible SVG markup" option, this is definitely an important consideration. We'd probably have to commit to also improving the output of the upstream libraries, which would certainly expand the scope of the work even more.
Is rendering as a png something that Mermaid natively supports?
No, mermaid only supports SVG output (for now). PNGs are generated by tools that render the SVG on a canvas to export them.
Do you know if there are any other diagram types like this?
ZenUML is one. Actually, in Sankey and Pie, although we use the d3-pie & d3-sankey packages, the final SVG is created inside mermaid itself. So we actually have complete control over the output. I just checked the internals.
Hi, I want to kick off a discussion on how we might solve screen reader support in Mermaid diagrams. https://github.com/mermaid-js/mermaid/issues/2395 discusses this a bit for flowcharts, but this affects all diagram types and could use a closer look. I considered opening this as a discussion but it looks like issues tend to be more active in this project.
If we can decide on an approach and if you are open to external contributions, this might be something I'd be willing to work on if I can find the time.
Current state of things
Current support for accessibility is limited to setting the `aria-roledescription` and, if users provide it, the `title` (`accTitle`) and `desc` (`accDescr`) of the `svg` element. The `roledescription` tells screen reader users what diagram type they are interacting with, but it doesn't help them actually understand the diagram (and the `aria-roledescription` attribute usage may actually be problematic itself). The title and description can help users understand the diagram, but they are rarely set and have to be manually built by the consumer, meaning that all information in the diagram has to be duplicated into the description.
The diagram content itself is wholly inaccessible to screen readers: it typically presents as near-nonsense. This is a shame as it means that non-sighted users cannot share in the rich diagramming experience that sighted users have access to. Mermaid diagrams can contain information that is critical to understanding the page they are on, so that information being inaccessible can sometimes make the entire page inaccessible.
Let's look at a simple demo. Here's a basic pie chart example:
Which renders as:
When a screen reader such as VoiceOver encounters this diagram, the accessibility software will iterate through the SVG nodes in DOM order and read out the text content of each one, resulting in the following output:
While all of the visible text is read to the user, it is read without any logical structure and therefore doesn't make any sense. If you handed me that block of text without any context, I don't think I'd ever come to the conclusions I could reach with a simple glance at the pie chart.
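A toy model of that traversal shows why the result loses its structure. This is only a sketch: the `SvgNode` type and the sample tree are made up for illustration and do not reflect Mermaid's actual markup or any screen reader's real algorithm.

```typescript
// Hypothetical, simplified stand-in for an SVG DOM node.
interface SvgNode {
  tag: string;
  text?: string;
  children?: SvgNode[];
}

// Collect text in document order (depth-first, pre-order), which is roughly
// how content without semantic roles gets linearized when read aloud.
function readInDomOrder(node: SvgNode, out: string[] = []): string[] {
  if (node.text !== undefined) out.push(node.text);
  for (const child of node.children ?? []) readInDomOrder(child, out);
  return out;
}

// Made-up markup: slice labels first, then the title, then the legend.
const chart: SvgNode = {
  tag: "svg",
  children: [
    {
      tag: "g",
      children: [
        { tag: "text", text: "60%" },
        { tag: "text", text: "40%" },
      ],
    },
    { tag: "text", text: "Example chart" },
    {
      tag: "g",
      children: [
        { tag: "text", text: "A" },
        { tag: "text", text: "B" },
      ],
    },
  ],
};

console.log(readInDomOrder(chart).join(" "));
// prints "60% 40% Example chart A B"
```

Nothing in the flattened string says which percentage belongs to which label, or that the third item is a title.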
This is a difficult (but exciting!) problem to solve
Unfortunately, this is an inherently difficult problem to solve. While screen reader support for standard structured web content has a well-defined solution, the same cannot be said for SVGs (or diagrams in general). The crux of the problem is that the entire concept of a 'diagram' is visual. Take the definition provided by Wikipedia:
So the question is: how can we adapt visual content for users who cannot perceive visual content?
While I recognize that this is not going to be easy, I also believe that Mermaid is uniquely situated to solve this problem. The entire concept of Mermaid is that we can represent this visual content as plain structured text. Mermaid diagrams aren't just static images generated in Photoshop; they are rendered dynamically from structured data. The renderer understands each part of the diagram and knows how different components are related, independent of their visual representation.
This is a critical concept because it means there's hope. We are already transforming text into visuals — with some careful effort, there's no reason we can't also turn that text into something a screen reader can parse.
And this hope is exciting! If this problem could be solved, we could provide a diagramming solution that has accessibility built in by default. We'd instantly improve the quality of life of millions of users, with (ideally) no extra effort required on the part of the diagram creators.
A few potential solutions
I can think of a few different paths we could take here:
Alternative plain text
We could just treat the diagram like we would a static `img` tag, by providing detailed alternative text for it and preventing screen readers from navigating the individual DOM nodes.
The naivest (and easiest) approach here would be to just provide the raw Mermaid source code as the alt text, since it's designed to be relatively human readable. While that would probably be a better experience than what we currently have, I think it's still not great because that language is still not designed to be read aloud, and the raw data can differ from what's displayed. You can already see in the above pie chart example that the source code has raw numbers and the output diagram has percentages. It's best to present the same content to all users.
Instead, we could build a better description. For example, for the above diagram we could build this textual representation:
This approach would provide a pretty good experience in many cases, though it would get unwieldy fast in more complicated diagrams. Take a large flowchart for example, and imagine someone reading aloud a description of that flowchart. It would be pretty difficult to build up a mental model of the chart, especially when you consider branches and loops.
Another problem with this approach is the presence of interactive nodes in flowcharts. For example, how would a screen reader user click a hyperlink inside a node in this scenario? Fortunately, Mermaid diagrams are usually not interactive so this isn't a major concern, but the hyperlink problem in particular does present a challenge.
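To make the plain-text option a bit more concrete, here is a minimal sketch of generating a description from pie chart data. The `Slice` type, the `describePie` helper, the sentence wording, and the sample data are all assumptions for illustration; none of them are existing Mermaid APIs. Percentages are rounded to whole numbers so the text matches what a sighted user would see rather than the raw source values.

```typescript
// Hypothetical input shape; not Mermaid's internal data model.
interface Slice {
  label: string;
  value: number;
}

// Build a sentence-style description, most significant slice first.
function describePie(title: string, slices: Slice[]): string {
  const total = slices.reduce((sum, s) => sum + s.value, 0);
  if (total <= 0) return `Pie chart titled "${title}" with no data.`;

  const parts = [...slices] // copy so the caller's array is not reordered
    .sort((a, b) => b.value - a.value)
    .map((s) => `${s.label}: ${Math.round((s.value / total) * 100)}%`);

  return `Pie chart titled "${title}" with ${slices.length} slices, largest first. ${parts.join("; ")}.`;
}

console.log(
  describePie("Example chart", [
    { label: "A", value: 30 },
    { label: "B", value: 50 },
    { label: "C", value: 20 },
  ]),
);
// prints: Pie chart titled "Example chart" with 3 slices, largest first. B: 50%; A: 30%; C: 20%.
```

A string like this could serve as the alternative text, but it still has the scaling and interactivity limits described above.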
Alternative HTML markup
The alternative text doesn't have to be plain text. We could hide the diagram from screen readers and instead provide a fully custom HTML representation of the diagram that is visually hidden from sighted users. This could take advantage of the full suite of available semantic HTML elements to represent data in a more accessible shape that depends on the diagram.
For example, the data in the pie chart above could be represented by a table:
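One way such a table could be generated from the chart data is sketched below. The `Slice` type, the `pieToTableHtml` and `escapeHtml` helpers, and the column names are assumptions for illustration, not Mermaid APIs, and the visually-hidden styling the wrapper would need is left out.

```typescript
// Hypothetical input shape; not Mermaid's internal data model.
interface Slice {
  label: string;
  value: number;
}

// Escape user-provided labels before embedding them in markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render the slices as a captioned table with row and column headers,
// so screen readers can announce the header for each cell.
function pieToTableHtml(title: string, slices: Slice[]): string {
  const total = slices.reduce((sum, s) => sum + s.value, 0);
  const rows = slices.map((s) => {
    const percent = total > 0 ? Math.round((s.value / total) * 100) : 0;
    return `    <tr><th scope="row">${escapeHtml(s.label)}</th><td>${percent}%</td></tr>`;
  });
  return [
    "<table>",
    `  <caption>${escapeHtml(title)}</caption>`,
    "  <thead>",
    '    <tr><th scope="col">Category</th><th scope="col">Share</th></tr>',
    "  </thead>",
    "  <tbody>",
    ...rows,
    "  </tbody>",
    "</table>",
  ].join("\n");
}

console.log(
  pieToTableHtml("Example chart", [
    { label: "A & B", value: 3 },
    { label: "C", value: 1 },
  ]),
);
```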
This use of a `table` provides screen reader users with the ability to not only read the data, but also navigate through it step by step. However, this pie chart is an unusually convenient example; most diagrams are not generated from tabular data. Converting a flowchart to semantic HTML, for example, could be much more difficult. Even so, a complex flowchart represented in HTML would still be much easier to navigate than one represented by plain text.
Accessible SVG markup
Finally, we can try to reuse the existing SVG markup by making it accessible. Generally, this is the preferred approach to making web content accessible, because it means that the content that non-sighted users experience is as close as possible to the content that sighted users experience. However, with SVG this is much easier said than done, as SVG elements generally have no semantic meaning by default; they appear in the accessibility tree as plain text nodes.
To make Mermaid's SVG output accessible, each component in the diagram would need some sort of accessible name and description. There is no ARIA specification for diagramming, so the components would need to describe their own roles and relationships. For example, when a user navigates to a flowchart node they would need to be able to determine several things:
This is a lot of information; finding a way to represent it all accessibly could be challenging.
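As a rough illustration of deriving such a description from the diagram's structure, the sketch below builds an accessible label for one flowchart node. Which facts to include (the node's own text, where it comes from, where it leads) is an assumption here, as are the `FlowNode` and `FlowEdge` types and the `describeNode` helper; none of these are Mermaid APIs.

```typescript
// Hypothetical graph shapes; not Mermaid's internal data model.
interface FlowNode {
  id: string;
  label: string;
}
interface FlowEdge {
  from: string;
  to: string;
  label?: string; // e.g. "yes" / "no" on a decision branch
}

// Build a label that names the node and its incoming and outgoing connections.
function describeNode(id: string, nodes: FlowNode[], edges: FlowEdge[]): string {
  const labels = new Map<string, string>();
  for (const n of nodes) labels.set(n.id, n.label);

  const own = labels.get(id);
  if (own === undefined) throw new Error(`Unknown node: ${id}`);

  const name = (nodeId: string): string => labels.get(nodeId) ?? nodeId;
  const incoming = edges.filter((e) => e.to === id).map((e) => name(e.from));
  const outgoing = edges
    .filter((e) => e.from === id)
    .map((e) => (e.label ? `${name(e.to)} (${e.label})` : name(e.to)));
  const list = (items: string[]): string => (items.length > 0 ? items.join(", ") : "none");

  return `${own}. Comes from: ${list(incoming)}. Leads to: ${list(outgoing)}.`;
}

const nodes: FlowNode[] = [
  { id: "start", label: "Start" },
  { id: "check", label: "Validate input" },
  { id: "done", label: "Done" },
  { id: "retry", label: "Retry" },
];
const edges: FlowEdge[] = [
  { from: "start", to: "check" },
  { from: "check", to: "done", label: "yes" },
  { from: "check", to: "retry", label: "no" },
  { from: "retry", to: "check" },
];

console.log(describeNode("check", nodes, edges));
// prints "Validate input. Comes from: Start, Retry. Leads to: Done (yes), Retry (no)."
```

A string like this could be attached to the node's element (for example through `aria-label`), though how verbose it should be is an open question.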
Another important problem to solve here is that screen readers and other accessibility technologies are dependent upon the order of elements in the DOM tree. The pie chart example reads out of order because SVG doesn't really care about element order, so Mermaid thus far hasn't paid much attention to it. This means that this solution would likely require significant changes to the SVG output.
Finally, the most challenging problem here is probably navigating through diagrams. While putting elements in the correct order will help with this, most diagrams won't be best navigated in one particular order. When focused on a flowchart node, for example, there could be two or more logical 'next nodes' or 'previous nodes'. This likely means that each diagram component would need to be focusable via the keyboard, and custom keyboard shortcuts would need to be built for navigating each type of diagram. This opens up a whole can of worms around accessible keyboard shortcuts and discoverability, but if we could get it right it would definitely provide a really great experience.
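To illustrate the navigation problem, here is one possible key mapping for a flowchart, written as a pure function so it is independent of the DOM. The mapping itself (Right and Left follow edges, Up and Down cycle through sibling branches) is purely an assumption for the sake of discussion, as are the `Edge` and `NavKey` types and the `moveFocus` helper.

```typescript
type Edge = { from: string; to: string };
type NavKey = "ArrowRight" | "ArrowLeft" | "ArrowDown" | "ArrowUp";

const successors = (id: string, edges: Edge[]): string[] =>
  edges.filter((e) => e.from === id).map((e) => e.to);
const predecessors = (id: string, edges: Edge[]): string[] =>
  edges.filter((e) => e.to === id).map((e) => e.from);

// Return the id that should receive focus next, or the current id when
// there is nowhere to go in that direction.
function moveFocus(current: string, key: NavKey, edges: Edge[]): string {
  if (key === "ArrowRight") return successors(current, edges)[0] ?? current;
  if (key === "ArrowLeft") return predecessors(current, edges)[0] ?? current;

  // Up/Down: cycle through the branches that share this node's first parent.
  const parent = predecessors(current, edges)[0];
  if (parent === undefined) return current;
  const siblings = successors(parent, edges);
  const index = siblings.indexOf(current);
  const step = key === "ArrowDown" ? 1 : -1;
  return siblings[(index + step + siblings.length) % siblings.length] ?? current;
}

// A small diamond: A branches to B and C, which both lead to D.
const diamond: Edge[] = [
  { from: "A", to: "B" },
  { from: "A", to: "C" },
  { from: "B", to: "D" },
  { from: "C", to: "D" },
];

console.log(moveFocus("A", "ArrowRight", diamond)); // prints "B"
console.log(moveFocus("B", "ArrowDown", diamond)); // prints "C"
console.log(moveFocus("D", "ArrowLeft", diamond)); // prints "B"
```

Even in this tiny example the ambiguity shows up: moving left from D picks B only because its edge happens to be listed first, which is exactly the kind of choice each diagram type would have to make deliberately.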
Because of these problems an example for this is much harder to provide, but here's what a pie chart might look like (disclaimer: this is very simplified and I am by no means an expert):