ARIA Data Visualisation module

w3c / aria

Accessible Rich Internet Applications (WAI-ARIA)

https://w3c.github.io/aria/

ARIA Data Visualisation module #991

Open LJWatson opened 5 years ago

LJWatson commented 5 years ago

The accessible data visualisation use case keeps coming up. I wonder if it's time to revisit the idea of an ARIA Data Visualisation module?

We talked about it in the early days of the ARIA Graphics module, when the SVG A11y TF was still in existence. I don't recall now why it didn't progress, but perhaps @Shepazu, @AmeliaBr, or @Chaals will recall.

It could be argued that much can be accomplished with existing ARIA roles and attributes, but I think there are (at least) two reasons why that isn't the best solution:

Consistency of implementation. This is particularly important across educational materials. The last thing we'd want is for materials from one educational source to use different roles (with aria-roledescription), different navigation etc.
Navigation of data constructs. It's almost certain that we'll need to introduce ways of navigating constructs like Venn diagrams, scatter plots etc, or at least to have screen reader supported ways of navigating (as we do with tables for example).

monfera commented 5 years ago

A main tenet of data visualization is that the retina is anatomically part of the brain with low latency, high bandwidth communication, and there are pre-attentive visual attributes and gestalt principles that we use to, in effect, convey data such that

magnitudes, relations make almost immediate and intuitive sense
while saccadic eye movements are serial, they're also fast and frequent, leading to both a semi-guided (collaborative with the maker) path of experience - we jump to the key thing that is highlighted - and also fairly parallel access to all information

For a serial presentation, the optimizations for the visual channel are invalid, and due to different constraints, the audio-rendered result, ideally, ought to be of significantly or completely different structure, to maximize information intake, retention or whatever the composite goal is, while adhering to channel constraints.

This means that, for example, maybe the audio rendering of a line chart could be listened to better as a 7-panel small multiple chart ensemble of the 7 time series separately (longitudinal view), plus a 5-panel small multiple chart ensemble that gives a cross-sectional view (here the percentages don't add up to 100% due to overlapping screen reader use, but think of one pie chart per time point, though hearing "pie chart" may not add info compared to hearing "bar chart") where the user takes a top-level branch (cross-sectional vs longitudinal view) but can pivot to the other dimension any time.

It might be possible to annotate a multi-trace time series line chart such that the screen reader yields something like the above (or whatever alternative sequencing or navigational structure is deemed preferable by those of you who know).

Accessible data visualization is hindered by the fact that data visualization makers are almost exclusively vision-oriented folks with little to no understanding of how the varieties of screen readers work (like me) not to mention understanding how someone who doesn't see would be best served via a screen reading experience.

Then there are exploratory interactions, hard to serialize charts like Sankey or parcoords, then all the other channels and modalities, braille, maybe tactile screen displays ...

GordonSmith commented 5 years ago

IMO this topic is really the definition of an oxymoron.

As a "visualization developer" it is my job to take an underlying table of data (or tables in the case of a Graph or Hierarchy etc.) and present it visually in such a way that it can be consumed by the viewer without having to think about it.

To date I have avoided adding ARIA support at the visualization level and instead represented the underlying data as an old school html "table", ensuring:

The "table" is hidden to the "eye" but available to the screen reader.
The "visualization" is hidden to the screen reader.

When it comes to the "table" the focus becomes "how to articulate the data" in a way it can be quickly consumed by a listener (akin to an "elevator" pitch for a book or screenplay), which primarily falls into these common "TODOs":

Describing what the data table contains
Transforming the data so it reads better (if needed)
Ensuring it has intuitive interaction (if needed)

At which point you can treat your "table" (from a programming exercise) as just another "visualization", albeit one for your ears (and yes, I have even used d3 to create these tables).

It also means you can trivially show/hide the "ARIA view" of any given visualization / dashboard (hide the visualizations and display the previously hidden tables), which makes testing the given page a lot easier for the developer.

(not sure if that helps, just my 2c)
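As a rough illustration of the hidden-table approach described in this comment, the sketch below builds the markup as strings. It is not the linked ARIATable code: the function names, the `visually-hidden` class name, and the wrapper `<div>` are all assumptions made for the example.

```javascript
// Hypothetical helpers; none of these names come from the thread or from d3.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build a data table intended for screen readers.
// "visually-hidden" is assumed to be a CSS class that clips the table
// off-screen without removing it from the accessibility tree.
function buildAccessibleTable(caption, columns, rows) {
  const head = columns
    .map((c) => `<th scope="col">${escapeHtml(c)}</th>`)
    .join("");
  const body = rows
    .map((row) => `<tr>${row.map((cell) => `<td>${escapeHtml(cell)}</td>`).join("")}</tr>`)
    .join("");
  return (
    `<table class="visually-hidden">` +
    `<caption>${escapeHtml(caption)}</caption>` +
    `<thead><tr>${head}</tr></thead>` +
    `<tbody>${body}</tbody>` +
    `</table>`
  );
}

// Hide the visual chart from assistive technology and place the table after it.
function wrapVisualization(svgMarkup, tableMarkup) {
  return `<div aria-hidden="true">${svgMarkup}</div>${tableMarkup}`;
}
```

Toggling the class on the table and the `aria-hidden` attribute on the wrapper would give the show/hide "ARIA view" mentioned above.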

carmacleod commented 5 years ago

Hi @GordonSmith! Thank you so much for commenting. I have a couple of questions and some notes.

Describing what the data table contains

Is the data you work with typically static or dynamic? If it's static, then describing it is easy, i.e. a human with an understanding of the data can sit down and write some analytical sentences about it.

If the data is dynamic, though, it can be really hard to describe because trends/correlations/inferences, etc. are not known ahead of time... they have to be discovered by looking at the data - and visualizations can really help with that discovery. If you typically describe dynamic data, then do you have, or know of, any Machine Learning/AI algorithms/techniques to analyze the raw data and come up with a decent description that effectively conveys what a sighted person could glean from a visualization? If so, can you share? Also, if you have this capability, I would encourage you to always provide the generated descriptive text visually, above or below the visualization, so that all users can benefit from the analysis.

The "table" is hidden to the "eye" but available to the screen reader.

Please allow the data table to be optionally displayed for the benefit of anyone who is curious about the data, or who wants to interpret the data in their own way.

The "visualization" is hidden to the screen reader.

Just curious if you typically provide hover or mouse click interaction in visualizations, perhaps to allow a user to drill down to see the detailed data in the absence of a visible table. If so, then please be aware that sighted users who can't use a mouse would also like to have that capability using the keyboard. Also, you might want to know that as soon as you have any event handling on the visualization, browsers and screen readers may ignore aria-hidden="true" and expose the visualization.

Transforming the data so it reads better (if needed)
Ensuring it has intuitive interaction (if needed)
...you can treat your "table" (from a programming exercise) as just another "visualization"

That sounds interesting. Do you have an example that you can share?

GordonSmith commented 5 years ago

Yes, it is dynamic - and for the description we do our "best" without actually interpreting the data. Unfortunately the description doesn't "paint the picture". (I really like the AI suggestion though.) About the only trick we might do when transforming the data into the table is to normalize (to percentages) or convert it to ratios and have the absolute min / max in the description (which is what a Pie chart is inherently doing visually).
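The normalization trick described here could look something like the following sketch. The function names and the row shape (`label`, `value`) are assumptions for illustration, and the wording of the generated sentence is only one possibility.

```javascript
// Hypothetical helper: convert absolute values to percentages of the total,
// rounded to one decimal place.
function toPercentages(rows) {
  const total = rows.reduce((sum, r) => sum + r.value, 0);
  return rows.map((r) => ({
    label: r.label,
    value: r.value,
    percent: total === 0 ? 0 : Math.round((r.value / total) * 1000) / 10,
  }));
}

// Hypothetical helper: state the absolute minimum and maximum so the
// description keeps the scale that the percentages discard.
function describeRange(rows) {
  if (rows.length === 0) return "No data.";
  let min = rows[0];
  let max = rows[0];
  for (const r of rows) {
    if (r.value < min.value) min = r;
    if (r.value > max.value) max = r;
  }
  return `${rows.length} values, from ${min.value} (${min.label}) to ${max.value} (${max.label}).`;
}
```

The percentages would go into the table cells, while the range sentence would go into the table description.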

I would encourage you to always provide the generated descriptive text visually

Agreed, and the description does often find its way into the regular visualization (typically as a tooltip).

Please allow the data table to be optionally displayed for the benefit of anyone who is curious about the data, or who wants to interpret the data in their own way.

We do this as well - but it's a different table (no interaction and very much like a raw data "excel" view).

please be aware that sighted users who can't use a mouse would also like to have that capability using the keyboard

This is a helpful reminder as it is an area we could improve in I suspect...

That sounds interesting. Do you have an example that you can share?

Nothing recent. But an early example did make it into a project on GitHub: https://github.com/GordonSmith/Visualization/blob/GH-1940-Aria/src/common/ARIATable.js (which is not a great way to create an HTML table - but the "d3" code style does fit in with the rest of the project)

carmacleod commented 5 years ago

Thanks for the example code, @GordonSmith! I tried to find a running example of the ARIATable, and couldn't. I did discover this great page of sample charts, graphs, and other visualizations, but I don't know where to search for the ARIATable. Can you describe how to find one?

AmeliaBR commented 5 years ago

I don't recall now why it didn't progress

We made the decision to pursue the “bare minimum” set of roles in the initial WAI-ARIA Graphics Module, focusing on roles that could be used as defaults for SVG elements.

But the plan was always to eventually add a more complex level 2! But with people leaving & the task force shutting down, that hasn't happened yet. So, thanks for bringing it up again, Léonie.

On the points brought up by @GordonSmith, my opinion is that it will be a lot easier to get right if we provide a way for authors to annotate the data in their data visualizations, in a way that makes semantic sense, without asking them to worry about how different accessibility tools communicate it. Then, the work to convert the visualization to a tabular or other format can be handled by the accessibility tool, based on a more nuanced understanding of the needs of the individual user.

Of course, authors could still provide alternative tabular representations or data files, because those are useful to all users. But they wouldn't need to make guesses about which parts are useful and which parts are redundant based on assumptions for one class of assistive tech users, that end up applying to all assistive tech users.

brennanyoung commented 3 years ago

A data table as alternative is very helpful, but only goes so far. Tables are typically poor at representing arbitrary relations between parts, such as the following:

supports/corresponds
contradicts/negates
is an outlier
is within/inside (a limit, set or boundary)
is beyond/outside (a limit, set or boundary)
is part of
relates to
includes
is (on) a boundary
matches
connects to
leads to
is correlated with
comes from
refers to
duplicates
is the same as
is opposite to
is similar to
is singled-out (different from "in focus", and also from "selected", because read-only/non-operable)
filters/reduces
generalizes/expands
is congruent with
is complete
is incomplete
is valid/approved
is "dirty"/unsaved
may be obsolete
has new content
is pronounced as
is a synonym of
is an antonym of
suggests/implies
is associated with

...etc. These are the kinds of relationships that communicate important details, and those details are made especially evident in visual form - the visual formatting tells "the real story". The important thing here is that these kinds of relationships are often not reflected in the DOM, so the existing semantic relationship mechanisms are of limited usefulness.

I don't have a great suggestion for handling these issues, apart from adding attributes, which will not be welcome. Perhaps some special tokens for aria-description. Maybe there are some enhancements tabled which address some of these cases.

Especially important is some idiom for browsing the parts. If it's just data (i.e. non-operable), the tab sequence is a poor fit. I suggest that we need some other way which content authors can use to facilitate the browsing according to known or preferred pathways - which may be separate or different from a grid or a tree, even if the data is ultimately structured as a grid or a tree.

I'd especially welcome a formal pattern, guideline or recommendation for how to associate a legend with a diagram or chart, since this is a common feature of a very large number of charts and diagrams. If the labeled datapoints could somehow inherit those individual legend descriptions automatically (via an AT shortcut, perhaps) it would be immensely helpful.

stes-acc commented 3 years ago

In SVG, data in graphs like bar charts can be mapped to focusable listitems with aria-label info browsable with arrow keys in list navigation style. And the chart container gets its role from WAI-ARIA Graphics Module.
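A minimal sketch of the markup pattern described above, generated as a string. The function name, the geometry, and the choice to put only the first bar in the tab sequence are assumptions for illustration; labels are assumed to be trusted plain text and values are assumed to lie between 0 and 100.

```javascript
// Hypothetical generator for an SVG bar chart whose bars are exposed as list items.
function barChartMarkup(title, bars) {
  const items = bars
    .map((bar, i) => {
      // Only the first item is in the tab sequence; the others are meant
      // to be reached with arrow keys handled by the page's own script.
      const tabindex = i === 0 ? 0 : -1;
      return (
        `<g role="listitem" tabindex="${tabindex}" aria-label="${bar.label}: ${bar.value}">` +
        `<rect x="${i * 30}" y="${100 - bar.value}" width="20" height="${bar.value}"/>` +
        `</g>`
      );
    })
    .join("");
  // The container takes its role from the WAI-ARIA Graphics Module.
  return (
    `<svg role="graphics-document" aria-label="${title}" viewBox="0 0 ${bars.length * 30} 100">` +
    `<g role="list">${items}</g>` +
    `</svg>`
  );
}
```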

Currently there is a filed issue with API mapping in the Chromium engine that prevents the use of screen readers with SVG-based charts.

brennanyoung commented 3 years ago

@stes-acc The chromium issue you linked (which I believe is not a bug) raises an important question: Should we use Tab to browse data points in whatever arbitrary arrangement? They are typically non-operable, so Tab seems like the wrong convention to adopt (IMO). Tab or not, we really need to arrive at a standard UI pattern or convention for navigating data points, without being tied to tree and grid, which are often poorly-suited abstractions.

brennanyoung commented 3 years ago

More notes on legend

AT users should not be required to go hunting for the legend "somewhere" on the page. The association should be programmatically determined, so that ATs could provide shortcuts to go from a graphics-document to any associated legend.

Is something like role="legend" an unwelcome idea? What existing role is suitable instead?

HTML has <legend> but uses it exclusively for onscreen labelling of <fieldset> elements. IMO this is more like a heading than a legend. It has no corresponding ARIA role. Therefore the nomenclature of role="legend" is likely to cause confusion.

The accessibility tree for a legend might contain roles "term" and "definition", which are already in ARIA, although the onscreen value of the "term" will typically be a dingbat, icon or non-alphanumerical glyph. Does it need an accessible name beyond the "definition"?

A legend may be associated with more than one graphics-document, so the reference should be made from the graphics-document, rather than from the legend. (I think).

A reference to the id of an element with a "legend" role could be expected in the context of role="graphics-document". Legends might also be used for data tables and image maps.

A legend might also be associated with a graphics-document by making the legend a child or descendant of the graphics-document.

The legend use case also calls for the datapoints to be associated somehow with the appropriate legend text (i.e. the "definition") and again, this association should be programmatically determined.

Would it be appropriate (using today's ARIA) to insert the corresponding legend text into aria-roledescription of the datapoint that references it?

Inserting the legend definition into aria-roledescription is liable to be too 'chatty'. I imagine that a user will browse datapoints, and wish to suppress the announcement of the legend definition until they find a datapoint they are interested in. At this point, the AT might provide a way to announce the legend definition. A programmatic association between datapoint and legend definition would make this possible for authors, although AT vendors would still need to support it.

I'm imagining something like:

<svg role="graphics-document" aria-describedby="mylegend">
  <g role="graphics-object" aria-roledescribedby="idOfLegendItemDefinition" aria-label="datapointvalue">
    <text role="none">&#9733;</text>
  </g>
  ...
</svg>
<div role="legend" id="mylegend">
  <div role="group">
    <span role="term">&#9733;</span>
    <span role="definition" id="idOfLegendItemDefinition">Outlier</span>
  </div>
  ...
</div>

Programmatically associating datapoints with a small set of author-defined definitions (i.e. domain-specific semantic roles) might also handle many of the cases I mentioned in a post above, such as "is an outlier" or "is within a limit/boundary", although this deviates from the traditional notion of a legend as a visual key.

I've been looking into the accessible markup of transcripts and (screen)plays. A timestamp in a transcript, or the character name in a play have many similarities with the legend definition of a datapoint. The handle of a chatroom participant in a chat log is another example. It is "metadata" for each cue or utterance, and when browsing or skimming, the extra chatter will be unwelcome ...except that sometimes you do need it.

We had a case where the whole purpose of the app was to determine the exact time something was mentioned in an audio file. The user should browse the transcript without timestamps until they find the relevant speech, and at that point, they need to get the timestamp. (This last problem is unsolved because transcripts are not operable.) User-suppressible item definitions would handle this too, I think.

So, the main thing that's missing here is the programmatic connection mechanism between a list item/data cell/cue/datapoint and something with role="definition". Not an accessible name, and not an accessible description, but an accessible definition, made by reference or by value. I believe that such a mechanism would be generally useful.

I am not convinced that aria-describedby and aria-details are quite appropriate here, although they may be, in which case, let's put together a small working example.
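One possible starting point for such a working example, using only attributes and roles that exist in ARIA today (aria-details, "term", "definition", "group"), is sketched below as a string generator. The function names, the id scheme, and the use of a labelled group as the legend container are assumptions, and whether any AT would let users request the referenced definition on demand is untested here.

```javascript
// Hypothetical id scheme linking a datapoint to its legend definition.
function legendId(key) {
  return `legend-def-${key}`;
}

// Legend as a labelled group of term/definition pairs.
// Text values are assumed to be trusted plain text.
function legendMarkup(entries) {
  const items = entries
    .map(
      (e) =>
        `<div role="group">` +
        `<span role="term">${e.symbol}</span>` +
        `<span role="definition" id="${legendId(e.key)}">${e.definition}</span>` +
        `</div>`
    )
    .join("");
  return `<div role="group" aria-label="Legend" id="mylegend">${items}</div>`;
}

// Datapoint that references its definition by id via aria-details,
// so the definition is not part of the accessible name.
function datapointMarkup(point) {
  return (
    `<g role="graphics-object" aria-label="${point.value}" aria-details="${legendId(point.legendKey)}">` +
    `<text role="none">${point.symbol}</text>` +
    `</g>`
  );
}
```

Because the reference is by id, several datapoints (and several charts) can point at the same definition.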

cookiecrook commented 2 years ago

I believe Apple's data visualization API is the most fleshed out of any native platform public API, so a Web DataViz API should aim to support some or all of those features.

20m video and cross-links to documentation: WWDC 2021: Bring accessibility to charts in your app https://developer.apple.com/videos/play/wwdc2021/10122/

However, that particular pattern doesn't have a similar precedent in Web API, and some aspects, if ported directly, would conflict with the Web Platform Design Principles. These are some of the reasons we haven't proposed any specific Web DataViz API yet, but it's definitely an area of interest.

cookiecrook commented 2 years ago

AXChartDescriptor documentation https://developer.apple.com/documentation/accessibility/axchartdescriptor

AXLiveAudioGraph documentation (would possibly conflict with WPDP §2.9, but there may be ways around that) https://developer.apple.com/documentation/accessibility/axliveaudiograph

AudioGraph overview https://developer.apple.com/documentation/accessibility/audio_graphs

cookiecrook commented 2 years ago

Suffice it to say, I don't think a few roles and properties are sufficient to make accessible dataviz, but the web design principles pretty much preclude an accessibility-specific JavaScript API. This problem is especially relevant when you consider many web dataviz packages leverage the <canvas> element instead of DOM-based drawing.

monfera commented 2 years ago

This problem is especially relevant when you consider many web dataviz packages leverage the <canvas> element instead of DOM-based drawing. (updated b/c quote lost the 'canvas' context - jc)

True, though not for lack of trying on the developers' part; we wish the DOM were a competitive option.

Slow DOM performance (slower by several orders of magnitude) makes us use Canvas2d, WebGL and, in the future, WebGPU, except for mostly static, low-element-count datavis. But even then, we often choose not to limit headroom by tying products to buggy, sometimes even regressing, browser performance. There are also longstanding DOM rendering bugs, which reduce trust that a somewhat special DOM use case (datavis) that works today will still work in the future.

There's much low hanging fruit in DOM performance, it "only" requires care, folks with domain expertise and work. Glad to talk with browser developers if the opportunity arises.

So <canvas> for datavis is here to stay.

brennanyoung commented 2 years ago

Thanks @monfera that's an important consideration.

TPGi raises some important problems about the HTML5 canvas spec in a useful article updated a couple of years ago. Clarifying the relationship between "fallback content" and the accessibility tree appears to be an important milestone.

Off topic for this forum, as such, but I'd certainly appreciate an ARIA practices example showing how best to provide a structured DOM alternative for a canvas presentation.

pkra commented 2 years ago

Just a general comment. I think this space needs an effort like openUI - a group of content specialists and developers who have extensive experience building such content / tooling and can come up with useful primitives. Complex diagrammatic content (e.g., org charts, floor plans, STEM diagrams) probably share a lot of the same problems.

A deep dive call could provide interested people with an opportunity to connect.

stes-acc commented 2 years ago

@stes-acc The chromium issue you linked (which I believe is not a bug) raises an important question: Should we use Tab to browse data points in whatever arbitrary arrangement? They are typically non-operable, so Tab seems like the wrong convention to adopt (IMO). Tab or not, we really need to arrive at a standard UI pattern or convention for navigating data points, without being tied to tree and grid, which are often poorly-suited abstractions.

Not TAB, arrow keys. And on complex graphs, data points, bars etc. ARE often interactive (popup on activate with details is often implemented).

brennanyoung commented 2 years ago

@stes-acc Yes, although this approach is something one has to fumble towards. Some clear guidance would take a lot of the guesswork away.

I was pleased to discover that the (excellent) Chartability effort offers guidelines about tab stops in datavis.

https://github.com/Chartability/POUR-CAF (Operable failure 5).

They're also trying to establish "DX" as an inclusive replacement for datavis. (Because it's not just about the visual). I approve!

stes-acc commented 2 years ago

Well, using arrow keys will make TAB a skipping key by nature, which is preferable, I think.
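The arrow-key idea can be sketched as a roving tabindex: one data point is in the tab sequence at a time, so TAB enters and leaves the chart in a single step. The function names and the exact key bindings below are assumptions, not an agreed pattern.

```javascript
// Hypothetical helper: compute which data point should receive focus
// after a key press. Focus stops at the ends rather than wrapping.
function nextFocusIndex(current, key, count) {
  if (count <= 0) return -1;
  switch (key) {
    case "ArrowRight":
    case "ArrowDown":
      return Math.min(current + 1, count - 1);
    case "ArrowLeft":
    case "ArrowUp":
      return Math.max(current - 1, 0);
    case "Home":
      return 0;
    case "End":
      return count - 1;
    default:
      // Any other key (including Tab) leaves the focused point unchanged.
      return current;
  }
}

// Hypothetical helper: tabindex values for all points, with only the
// focused point reachable by TAB.
function tabindexValues(focused, count) {
  return Array.from({ length: count }, (_, i) => (i === focused ? 0 : -1));
}
```

A keydown handler on the chart would call `nextFocusIndex`, apply the values from `tabindexValues`, and move focus to the new point.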

brennanyoung commented 1 year ago

Apple just published some charting guidelines which mention several useful semantics which are not (currently) well-covered by ARIA. https://developer.apple.com/design/human-interface-guidelines/components/content/charts

Might we consider some of these for inclusion in ARIA?

Mark (an element which presents / represents a datapoint, subtypes: bar, line, point)
Axis (an element with an orientation, which provides a scale for mark elements to be plotted against)
Axis Value (a value which appears on or along an axis)

Some existing ARIA attributes might be useful in conjunction with these; for example, an axis might usefully have aria-orientation, aria-valuemin and aria-valuemax. A mark is inevitably going to need aria-details in many cases. Relationship attributes might associate marks with one or more axes.

I regard stuff like tick and gridline to be purely presentational, but I may be wrong. Perhaps there is a way to express similar affordances in a presentation-independent way.

ARIA 1.3 looks to introduce "mark" for something else, so another name is called for. Any suggestions?

monfera commented 1 year ago

Thanks @brennanyoung for mentioning this and also, the role of axes and ticks. Here are possible considerations for designing a non-visual data experience.

Multidimensional nature of data presentations: for example, a line chart with multiple lines, one for each category of something. A user may want to understand one line at a time as one unit (eg. via sonification like here) then the next etc. Or alternatively, a user may want to understand the first point of each line in one pass, then the second point etc. So, vertical vs horizontal traversal.

Line vs pointset duality: A line chart is most frequently a scatterplot, with values that increase strictly monotonically along one axis (typically, time), with adjacent points piecewise connected by line sections or other interpolation to show them as linked, or continuous. So the line is both a unit of its own, and also, just a set of data points. Even visually, each point on the line may or may not be reified by a point mark. And conveying connectedness of the points is mostly, navigability or playing sound for that line.

Line continuity and axis ticks: I tried continuous interpolation of audio frequency, and it conveyed continuity, though it did not sound pleasant (random ups and downs in frequency). It may be useful to mark tick/gridlines with sound (minor ticks: just a short beat/click, major ticks: more salient beat, or even, reading the value, again, like in the video), as the line is played through from (typically) left to right

Temporal vs spatial dimensions: a visual chart is understood to have spatiality, projected onto a rectangular Cartesian area, and it may or may not have a temporal direction, such as in the video (yield curve playing through time). When expressed via sound, almost everything gets serialized, so the spatial Cartesian dimensions fold into the temporal direction (as mentioned above, there are multiple legit orderings for that). Similarly, the visual choice of small multiples with a single line in each chart, vs. one chart with multiple lines is irrelevant for sonification, as the user may want to sample values for a specific time bin together anyway. Another example: a 2D heatmap uses colors or luma to convey values (as X and Y are exhausted) but would a 2D heatmap need to sound differently to a multi-series line chart or bar chart? Maybe not.

Serial vs. parallel conveyance of data points: on occasion or upon user choice, it may be useful to parallelize data, to more quickly communicate a population of data points. For example, sounding all data points on a yield curve at the same time loses the temporal information, but it's a quick way to hear the range of the data and have a feel for the distribution, eg. mostly low pitched sounds with 1-2 high pitched ones; also, get a feel for the count of distinct values. As the user has a legit choice to navigate the data in alternative ways (eg. per line, or per point / bin "left to right") what gets parallelized may vary.

Univariate vs. multivariate sounds: I've no idea if there are useful bivariate sound mappings, such as conveying one dimension via pitch and another via something else, such as amplitude, or (for categorical data) a different waveform or even a different musical instrument.

Visual vs. non-visual channels: As many users use some level of vision while using a screen reader, it'd be useful to visually annotate the data points, or point in time as it is playing through. For example, by highlighting an entire line, or conversely, highlighting a vertical bar with all its data points on the different lines.
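To make the pitch and tick ideas above concrete, here is a small sketch that turns a list of values into a playback plan. It is not the approach used in the video mentioned above: the function names, the 220 to 880 Hz range, the logarithmic interpolation, and the tick spacing are all assumptions, and actually producing sound (for example with the Web Audio API) is left out.

```javascript
// Hypothetical mapping from a data value to a pitch in hertz.
// Values outside the domain are clamped to its ends.
function valueToFrequency(value, domainMin, domainMax, minHz = 220, maxHz = 880) {
  if (domainMax === domainMin) return minHz;
  const t = Math.min(1, Math.max(0, (value - domainMin) / (domainMax - domainMin)));
  // Interpolate on a logarithmic scale so equal data steps
  // correspond to equal musical intervals.
  return minHz * Math.pow(maxHz / minHz, t);
}

// Hypothetical playback plan: one tone per data point, played left to right,
// with a more salient cue on every Nth point to stand in for major ticks.
function sonificationPlan(values, majorTickEvery = 5, stepSeconds = 0.25) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  return values.map((v, i) => ({
    time: i * stepSeconds,
    frequency: valueToFrequency(v, min, max),
    cue: i % majorTickEvery === 0 ? "major-tick" : "minor-tick",
  }));
}
```

The same plan could also drive a visual highlight of the point currently being played, for users who combine some vision with audio.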

I'm not sure if it's of any use for establishing ARIA labels; I just wanted to highlight the multidimensional nature of navigability. So the experience is less like the normal document model, which follows a tree structure (scenegraph).
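The two traversal orders for a multi-line chart described above (one line at a time versus one position at a time) can be sketched as two serializations of the same data. The data shape and function names are assumptions for illustration.

```javascript
// Longitudinal order: every point of the first line, then the second, and so on.
function traverseBySeries(series) {
  return series.flatMap((s) =>
    s.points.map((p) => ({ series: s.name, x: p.x, y: p.y }))
  );
}

// Cross-sectional order: the first point of every line, then the second, and so on.
// Lines of different lengths are allowed; missing points are skipped.
function traverseByPosition(series) {
  const longest = Math.max(0, ...series.map((s) => s.points.length));
  const visited = [];
  for (let i = 0; i < longest; i++) {
    for (const s of series) {
      if (i < s.points.length) {
        visited.push({ series: s.name, x: s.points[i].x, y: s.points[i].y });
      }
    }
  }
  return visited;
}
```

Neither order is privileged by the data itself, which is the point being made: a reader may want to switch between them at any time.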

brennanyoung commented 1 year ago

"the multidimensional nature of navigability" - indeed. Ideally it should be possible for authors to offer multiple 'pathways' through the data, and/or to let users decide (not sure how to facilitate this tho).

This is yet another reason why I disapprove of the following line in Understanding 1.3.2: Meaningful Sequence

Only one correct order needs to be provided.

Chartability (aka POUR-CAF) has a failure criterion which addresses this.

Information cannot be navigated according to narrative or structure Chart must provide a way to be navigated according to its data or narrative structure. The title, description, annotations, and then lower level data structures should be navigable and in that order. Chart data that contains sub-grouping (like a stacked bar chart) or nesting (like a treemap or hierarchy) must provide keyboard navigation that can navigate between levels and/or laterally across levels (in a non-linear fashion). Keyboard navigation must be comparable to the data structure (including cases where the data structure is novel) as well as provide linear or tabular navigation (like in a table or list).
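Navigation "between levels and/or laterally across levels" could be modelled as moves on a path through a tree of chart parts. The sketch below is only one reading of the criterion quoted above; the key bindings, the node shape, and the function names are assumptions.

```javascript
// Find the node reached by following child indices from the root.
function nodeAt(root, path) {
  let node = root;
  for (const i of path) {
    if (!node.children || i < 0 || i >= node.children.length) return null;
    node = node.children[i];
  }
  return node;
}

// Hypothetical bindings: ArrowDown drills into the first child, ArrowUp
// returns to the parent, ArrowLeft/ArrowRight move between siblings.
// A move that is not possible returns the path unchanged.
function move(root, path, key) {
  const node = nodeAt(root, path);
  if (!node) return path;
  if (key === "ArrowDown") {
    return node.children && node.children.length > 0 ? [...path, 0] : path;
  }
  if (key === "ArrowUp") {
    return path.length > 0 ? path.slice(0, -1) : path;
  }
  if (key === "ArrowLeft" || key === "ArrowRight") {
    if (path.length === 0) return path; // the root has no siblings
    const parentPath = path.slice(0, -1);
    const parent = nodeAt(root, parentPath);
    const index = path[path.length - 1] + (key === "ArrowRight" ? 1 : -1);
    return index >= 0 && index < parent.children.length ? [...parentPath, index] : path;
  }
  return path;
}
```

For a stacked bar chart, the first level might hold the bars and the second level the segments within each bar.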

brennanyoung commented 1 year ago

I don't think we can get around the need for a formal role for "chart", as distinct from "graphics-document" or "figure".

The spec for "graphics-document" emphasises its visual nature, and that it conveys meaning. (Potential superclass for "chart"?)

The spec for "figure" emphasises its autonomy in relation to the flow of the document.

A "chart" would differ from a "graphics-document" in the sense that it contains (or promises to contain) structured/articulated data.

I think "graphics-document" is rather too long, and (as currently specified) rather too vague to serve the purpose that we need for data representation.

monfera commented 1 year ago

tl;dr this post argues that chart is an overly hazy abstraction. Feel free to disagree or pitch something other than the projection proposed here.

Chart feels ambiguous. The more rooted abstraction might be the projection (mapping a dimension, field, variable, to something we can sense). For example, projections of time, category and a metric or two intersect for a composite projection, and when we pour data into that intersection and render it visually, we get a line chart or bubble chart.

But a chart may arise from superimposing layers of projections that are related or even unrelated (think dual Y axis). For example, this projects a continuous time (line) and a coarser, discretized time to show distribution via beeswarm.

Each projection generally participates in multiple resolutions. The marginal scatterplot shares both the horizontal and vertical projection of the scatterplot proper with the distribution/density plot on each side. Is it one chart or three charts? Small multiples almost always share projections too. Financial and observability visualizations often stack multiple, heterogeneous chart types onto the same time projection "column". Visual and textual may mix. There are more complex examples like the third image here (Key performance indicators for water supply) where colors (categories) and marker sizes are shared, but Cartesian projections can be shared too.

More examples for how the chart abstraction becomes fuzzy are here. And even an otherwise regular HTML document may have multiple charts at random locations that share some projections, axes and legends, including data notebooks and journal articles.

Accessibility on the basis of projections (or bring your favorite abstraction) is not yet mainstream, but it may be useful to see if it's preferable over the familiar yet "leaky" unit such as the chart. Projections directly relate to the multidimensional navigation discussed earlier.

Also, what collides, or not, in the visual domain (eg. single chart with multiple time series vs. small multiples with one line each) doesn't need to automatically do so in the non-visual domain, where the reader may opt for alternative serializations and flythrough. Screenspace chart sizes don't matter, except for emphasis. Datavis design (visual layout) needs to resolve visual constraints, such as overplotting, that don't exist in non-visual spaces, and it feels counterproductive to limit non-visual conveyance based on visual constraints we take for granted.

Orienting accessibility around the concept of the chart might be a hidden, unnecessary bias toward the visual domain, besides its ambiguity problem. Indeed, accessibly conveying data is quite unlike an image or figure.

brennanyoung commented 1 year ago

I'm happy to compromise on the name! I agree that the point is not to remediate something visual, but to offer a first-class, presentation-agnostic alternative to a visualisation. Funny how "chart" does not have quite the same visual baggage for me.

Implicit in your comment, @monfera is that "graphics-document" might not be the most logical superclass to go with.

brennanyoung commented 1 year ago

Suggested role: "series" (nomenclature TBD).

Should be used in relation to a "chart" (or "projection", or whatever the parent data representation role ends up being called).

I imagine this role as an alternative or secondary path (id list?) through a subset of the presented dataset, referring to the data points as they may appear in the DOM, or as plotted along an axis, but with its own distinct order. There may be more than one for a single dataset.

It should permit naming and description. An example name might be "upward trend".

I expect that this role would be applied somewhat "editorially" to the content, to tell a story, or emphasise a particular interpretation. Nonetheless, it should be machine-readable, such that ATs can present the series contiguously. (e.g. a "next in series" shortcut).
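A minimal sketch of the idea, assuming a series is just a named, ordered list of data point ids that is kept separate from DOM order. The `Series` class, the `next_after` method and the ids are hypothetical illustrations, not part of any ARIA specification or AT implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Series:
    """A named, ordered path through a subset of data point ids (hypothetical model)."""
    name: str
    point_ids: List[str]

    def next_after(self, point_id: str) -> Optional[str]:
        """Return the id following point_id in this series, or None at the end.

        Raises ValueError if point_id is not part of the series.
        """
        i = self.point_ids.index(point_id)
        return self.point_ids[i + 1] if i + 1 < len(self.point_ids) else None


# Order in which the points happen to appear in the DOM (assumed for illustration).
dom_order = ["datapoint3", "datapoint14", "datapoint22"]

# Two series over the same dataset, each with its own distinct order.
upward = Series("upward trend", ["datapoint14", "datapoint22", "datapoint3"])
outliers = Series("outliers", ["datapoint22", "datapoint3"])

# What a "next in series" shortcut would resolve to from datapoint14.
print(upward.next_after("datapoint14"))
```

The point of the sketch is that the same data point can sit in several series at once, and that "next" depends on which series the reader is following rather than on document order.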

A similar idiom "cluster" might be useful, distinct only in the sense that the contents are unordered. There is no way for ARIA to distinguish ordered/unordered lists, so maybe this distinction is not necessary.

monfera commented 1 year ago

A pitch for relations: Thanks @brennanyoung for bringing up the need to clarify what constitutes data that we pour into the intersection of projections, for the user to perceive, explore and navigate. Here's my contrib attempt for terms/concepts toward its modeling. I'm now more interested in the conceptual part rather than user-facing role names, which I agree is downstream of concept making:

Series implies a single, specific, full order (to represent things like order/connections in a connected scatterplot), and cluster hints at clusteredness (clumps) while your intent covers arbitrarily scattered data that's unordered, irrespective of the task, or what's observed (clusters, correlation etc.). Also, even unordered data usually have several implicit, total orders, such as the two orthogonal spatial projections in the Cartesian XY chart setting (you can unambiguously tell if a scalar property of a data point is less than, equal to or greater than that of another data point).

Series and cluster/scatter imply finiteness, which may not be the case. It's OK to visualize a continuous function (=infinite points) such as a sine wave, or a rocket's trajectory. In graphical rendering, typically a finite set of data points is the starting point, and an approximative model ([monotone](https://en.wikipedia.org/wiki/Monotone_cubic_interpolation), EWMA, LOESS etc.) provides the continuity. There's also the pixel-based rasterization during rendering, irrelevant for non-sight based communication.
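To make the "finite points plus a model gives continuity" idea concrete, here is a small sketch using plain piecewise-linear interpolation. That is a deliberately simpler stand-in for the monotone cubic, EWMA or LOESS models named above, and the function name and sample data are made up for illustration. It assumes at least two points with strictly increasing x values.

```python
from bisect import bisect_right


def interpolate(points, x):
    """Evaluate a piecewise-linear model through sorted (x, y) points at any x in range."""
    xs = [p[0] for p in points]
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x is outside the sampled range")
    # Index of the right-hand neighbour, clamped so the last sample stays valid.
    i = min(bisect_right(xs, x), len(xs) - 1)
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Three measured samples; the model answers for every x between 0 and 4.
samples = [(0, 0), (2, 4), (4, 2)]
print(interpolate(samples, 1))
print(interpolate(samples, 3))
```

A non-visual rendering could query such a model at whatever resolution suits the reader, instead of being tied to the sampled points or to pixels.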

A descriptive, precise term that'd admit the above concerns would be set, which you instinctively used too as part of dataset.

A set

has a well-defined meaning
is pertinent across lots of domains, not just datavis related work (math, art, CAD, design, content creation) and set operations (union, intersection, subtraction, subsetting, partitioning) are also at disposal
does not imply or prevent orderedness
like the mentioned cluster/scatter, it's unordered
can be ordered, when using an ordered set (poset or total order)
via the poset concept, covers partial orders, and by extension, total orders and complete lack of order (so, both series and cluster/scatter)
multiple orders are admitted (multiple poset/toset with the same elements)

At the risk of prematurely typing the elements of the set, it is often the case that

each element (= data point) has multiple attributes, as most data experiences are multidimensional, even the simplest line (such as value as a function of time), even if some dimensions may be implied and omitted from the physical representation
each element in a given set tends to have the same set of attributes (whether it can be omitted or not is another question)

So each element is a tuple, and these tuples have common attributes. This also gives us the means for implicit orders ("link every point with its temporally adjacent neighboring points" ie. line chart) because you can rely on attribute values (such as time or X) for all elements of the set.

A widespread, well-defined term for a set with homogeneous tuples is the relation.
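As a rough sketch of what that buys us: below, a relation is modelled as an unordered set of homogeneous tuples, and orders are derived from attribute values on demand. The `Point` type, its attribute names and the values are assumptions for illustration only.

```python
from typing import NamedTuple


class Point(NamedTuple):
    """One tuple of the relation; every element has the same attributes."""
    time: int
    value: float
    category: str


# The relation itself is a set: no order is implied or prevented.
relation = {
    Point(time=3, value=7.0, category="b"),
    Point(time=1, value=9.5, category="a"),
    Point(time=2, value=4.0, category="a"),
}

# Several total orders over the same elements, derived from attributes.
by_time = sorted(relation, key=lambda p: p.time)
by_value = sorted(relation, key=lambda p: p.value, reverse=True)

# "Link every point with its temporally adjacent neighbour", i.e. the
# segments a line chart would draw, falls out of the time order.
segments = list(zip(by_time, by_time[1:]))

print([p.time for p in by_time])
print([p.value for p in by_value])
```

Nothing here commits to a single reading order; the maker or the reader can pick whichever attribute they want to traverse.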

The relational model is not immediately general enough: what about trees, networks and graphs? What about rasterized data? Text? Yet in practice, these can generally be modeled as relations too (here we don't talk about physical representation).

Relations also immediately suggest expressive powers that I think are great for accessibility:

Ability to traverse a relation along a dimension of the user's choosing (eg. time), while enumerating or aggregating data along the orthogonal dimensions to get each value for a time bin, or get the average, median, min/max etc. for the data. A sighted user may perceive a beehive plot bin with lots of points and immediately grok its distribution. For non-sighted users, just hearing a lot of values won't convey the same. The power of relational algebra makes it natural to aggregate the many points into, in effect, an audible boxplot (min/max/quantiles), or an audible or tactile violin plot (density curve)
Ability to enrich the presentation on the user's volition: a sighted user may immediately, at one glance, perceive a bunch of points that surface some meaningful, higher level pattern in the data (eg. trend, seasonality, correlation). While creators, publishers should attempt to enrich visualizations with overlays such as trendlines for both sighted and unsighted users, it'd be empowering to give readers the means to do so themselves. A simple, predictable, tabular data structure with decades of experience in its analytics would let users feel enabled to experience data (my two cents)
Ability to correlate multiple renderings of the same data. Even in the simplest chart, there's usually a legend, which shares some dimensions (color, marker size, marker type) with the chart proper. Composite charts such as stacked time series charts or marginal scatterplots, almost by definition, also share dimensions and their value sets. The relational model immediately leads to established patterns, such as joining data via attributes, for correlating the multiple, shared occurrences of various dimensions, to let people with various perceptual abilities understand what's conveyed.
Predictable types for accessibility: the relational model tends to have primitive, directly conveyable data in its "cells". For example, numbers, booleans, date/time, strings, or enumerations (categories). So the aural and other rendering modalities have very few, concrete types to work with (a more generic, nebulous "data" concept doesn't have this property)
The relational model is a natural way to represent data, and visualizations / data experiences, whether visually or otherwise
It's neutral and permissive (supportive) with respect to the task at hand, which can be diverse and plural: seeing trends, seasonality/periodicity, continuity vs. gaps in the data, correlations, support for predictions, understanding the distribution or density of data, part-to-whole relationship etc. It'd be near impossible to standardize and directly support the task or intent, while relations and operators on them support diverse use cases in an open-ended way
All the benefits of sets incl. various options for orderedness (and orders feel important for the usually more serial ways of conveying data) including giving users the power to order things for their own use (the maker wouldn't "dictate" that a chart must be read across time; maybe a user wants to serialize it across values, highest to lowest)
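A hedged sketch of the first and last points in the list above: grouping rows along a dimension the user picks, then reducing each bin to the five numbers a boxplot shows, which a speech or sonification layer could read out instead of every raw value. The row layout, the function names and the quartile convention (median of each half, excluding the middle value when the count is odd) are my assumptions, not something specified in this thread.

```python
from collections import defaultdict
from statistics import median


def five_number_summary(values):
    """Reduce a non-empty list of numbers to min, quartiles and max."""
    ordered = sorted(values)
    n = len(ordered)
    # Halves on either side of the median; fall back to the whole list for n == 1.
    lower = ordered[: n // 2] or ordered
    upper = ordered[(n + 1) // 2:] or ordered
    return {
        "min": ordered[0],
        "q1": median(lower),
        "median": median(ordered),
        "q3": median(upper),
        "max": ordered[-1],
    }


def summarize_by(rows, key, measure):
    """Bin rows by the chosen dimension and summarize the measure in each bin."""
    bins = defaultdict(list)
    for row in rows:
        bins[row[key]].append(row[measure])
    return {k: five_number_summary(v) for k, v in sorted(bins.items())}


rows = [{"day": 1, "value": v} for v in (2, 4, 6, 8)]
rows += [{"day": 2, "value": v} for v in (1, 3, 5)]

summary = summarize_by(rows, key="day", measure="value")
print(summary[1])

# The reader, not the maker, chooses the serialization: highest value first.
highest_first = sorted(rows, key=lambda r: r["value"], reverse=True)
print(highest_first[0])
```

Swapping `key` or the sort attribute is all it takes to traverse the same data along a different dimension.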

tl;dr: the concept of relations helps us model the domain via established, well-defined concepts which, where they're available, cover the use cases well, lead to accessibility powers and don't impose undue burden.

brennanyoung commented 1 year ago

@monfera I strongly approve of using existing terminology, and getting away from sensory bias where possible.

Did you imagine that 'relation' might be a role? I can't help recalling Gregory Bateson's observation "A role is a half-assed relationship" :D

Closest role today is perhaps "group", but it's heavily tied to DOM structure. The challenge will be how to express a relationship between parts that can be independent of DOM order. Any ideas on this? Could something like aria-flowto (with a stronger spec) be viable?

Or maybe some clever use of aria-owns?

<g role="relation" aria-owns="datapoint14 datapoint22 datapoint3"></g>

monfera commented 1 year ago

@brennanyoung Nice one with the "relationship" observation 😄 I'm out of my depth here, esp. on how to transcend the document tree, which is, in the case of dataviz, often either arbitrary or can encompass one specific main traversal structure

brennanyoung commented 1 year ago

Use of aria-details to denote an arbitrary sequence of datapoints:

<g role="graphics-object" aria-details="datapoint14 datapoint22 datapoint3"><text>upward trend</text></g>

brennanyoung commented 1 year ago

Just stumbled on this ageing wiki page. Many good ideas here. Whatever happened to them? https://www.w3.org/wiki/SVG_Accessibility/ARIA_roles_for_charts

brennanyoung commented 1 year ago

Regarding WCAG SC 1.4.1 Use of Color, I propose a formal mechanism whereby elements/datapoints in a collection (e.g. in a list, table or data view) can be classified using human readable text strings, a bit like enumerated types, perhaps similar to the class attribute, but with the intention that those text strings will be exposed to ATs.

The key point here (distinct from, say, aria-details) is that the same name/token may be used by more than one element in the set.

jnurthen commented 11 months ago

Not sure what progress we can make but this seems potentially worthy for discussion at TPAC

spectranaut commented 9 months ago

Discussed at TPAC this week: https://www.w3.org/2023/09/11-aria-minutes#t04

Next steps: form a smaller group, invite Highcharts developers, @cookiecrook already in discussion with them.

mpaiva commented 1 month ago

@jonathanzong - this would be a good discussion for you to follow.

frankelavsky commented 4 weeks ago

TLDR: 2 potential problems that I see are 1) HTML is fundamentally hierarchical, which limits what it can do and 2) not all meaningful relationships in an interface are known by the author, beforehand.

I've been following this thread for years now (spoken to many of the folks here too) and just wanted to say that in my latest project I have tried to work towards a toolkit that can build navigational and interactive structures for highly complex data relationships. The project is Data Navigator.

To me, the foundational substrate we are missing (which is discussed here) is programmatically determinable relationships between parts. I use javascript + a graph data structure to represent example models for this, but it would be nice to have some standardization towards non-JS type approaches (like really good ARIA, for example). Also, this can get a bit tricky...

One of the problems we (the proverbial "we") keep running into is that document structure (content) is often coupled to navigation. In a way, that makes sense. You'd want to navigate content according to the structure declared for that content. Most of the time that fits and works. But in data visualization, data structures, and spatial interfaces, we often have representational structures that don't correspond to the document or technical tools we use. HTML is capable of list and hierarchy structures, which can be (ideally) programmatically determined by the way the document is laid out. But when we make something with HTML, SVG, or canvas, the visual structure often does not actually correspond to the underlying technical structure used to create those visuals.
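For illustration only, here is one way the decoupling could look: a small directed graph whose labelled edges say where a navigation command leads, independent of the order of the underlying elements. This is a sketch under my own assumptions, not Data Navigator's actual API; the class, the command names and the ids are all hypothetical.

```python
class NavigationGraph:
    """Nodes are element ids; labelled, directed edges say where a command leads."""

    def __init__(self):
        self._edges = {}  # (source id, command) -> target id

    def connect(self, source, command, target):
        """Declare that issuing command while on source moves to target."""
        self._edges[(source, command)] = target

    def move(self, current, command):
        """Follow the edge for command, staying put when none exists."""
        return self._edges.get((current, command), current)


# Suppose document order is a, b, c, but the chart relates them differently.
document_order = ["a", "b", "c"]

nav = NavigationGraph()
nav.connect("a", "next-in-time", "c")
nav.connect("c", "next-in-time", "b")
nav.connect("a", "same-category", "b")

print(nav.move("a", "next-in-time"))
print(nav.move("a", "same-category"))
```

Each node can carry as many kinds of outgoing edge as the representation needs, which is what a list or tree structure on its own cannot express.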

However also: authoring navigable HTML requires a priori knowledge of what relationships we want and know we will create in that content. But of course the visual arrangement of data encodings can accomplish the production of relationships that arise due to their rendering. These relationships are determined a posteriori; they are observed. A cluster in a scatterplot is an example of this: the cluster is often not known in the authoring/analysis process or even determinable from the rendering's programmatic representation (such as in SVG or pixels). Analytics tools and applications are famously hard to make accessible, even in terms of alt text, because of this a priori-a posteriori tension. Authors build these tools for a self-service style of interaction, where users can explore and discover using an assembly of parts or interface elements. Knowing what the user creates beforehand is sometimes nearly impossible.
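A toy example of a relationship that only exists a posteriori: the groups below are not authored anywhere, they are discovered from rendered positions. The technique is a simple distance-threshold (single-linkage style) grouping that I picked for brevity; the ids, coordinates and radius are invented, and a real tool would likely use something more robust.

```python
from math import dist


def find_clusters(points, radius):
    """Group point ids whose positions are chained together by gaps of at most radius.

    points maps an id to its rendered (x, y) position.
    """
    unvisited = set(points)
    clusters = []
    while unvisited:
        seed = min(unvisited)  # deterministic choice of starting point
        unvisited.remove(seed)
        cluster, frontier = [seed], [seed]
        while frontier:
            current = frontier.pop()
            near = [p for p in unvisited
                    if dist(points[current], points[p]) <= radius]
            for p in near:
                unvisited.remove(p)
            cluster.extend(near)
            frontier.extend(near)
        clusters.append(sorted(cluster))
    return clusters


# Positions as they ended up on screen; nothing here says "cluster".
rendered = {
    "p1": (0, 0), "p2": (1, 0), "p3": (1.5, 0.5),
    "p4": (10, 10), "p5": (10.5, 10),
}

print(find_clusters(rendered, radius=1.5))
```

The discovered groups could then be offered as extra navigation targets alongside whatever structure the author declared up front.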

So in some ways, I am not sure that having an ARIA approach that is still based on the assumption that all relationships are capable of being known a priori will be adequate moving forward. It might be valuable to have ARIA built for exploratory navigation: such that some relationships might be present as well.

And apologies for throwing a big wrench in this (potentially!). For what it's worth, I am not suggesting that what has already been said and discussed is wrong or anything but rather I just want to add to it.

mpaiva commented 5 days ago

I completely agree with the need for a collaborative effort across working groups to address the challenges in data representation.

I'd recommend starting by redefining the term “visualization”, which inherently carries a bias towards sighted users. To ensure equitable access, data should be “represented” in a way that is inclusive of all users.

I can't wait to start contributing to this and discussing ways of utilizing multiple modalities to represent data for all users, including those who are neurodivergent or have cognitive disabilities.

I remember @jnurthen wanting to add this as an agenda item for this year.

LJWatson commented 5 days ago

I'd recommend starting by redefining the term “visualization”, which inherently carries a bias towards sighted users. To ensure equitable access, data should be “represented” in a way that is inclusive of all users.

I understand what you're trying to say, but as the blind person that opened this issue, I respectfully disagree that "visualize" is inherently biased towards sighted people.

Most reputable dictionaries include definitions like "form a picture of someone or something in your mind" or "to see or form a mental image of".

In this case, the word "visualize" is being used in both ways: the literal visualization of data to aid sighted people in understanding data, and ways to offer people who cannot see a comparable way of visualizing the data.

mpaiva commented 5 days ago

Thank you for opening this thread and for your thoughtful response. I deeply appreciate your perspective and the clarity you’ve provided.

To further the discussion on inclusive data representation, consider these examples of multiple modalities:

Auditory Descriptions: Providing spoken descriptions of data for users who are blind or have low vision.
Tactile Graphics: Using raised-line drawings or 3D models to convey information to users who are deafblind.
Simplified Formats and Summaries: Offering clear, concise summaries and interactive elements to aid users with cognitive disabilities.

People learn in various ways, including visual, auditory, reading/writing, and kinesthetic styles.

May I suggest initiating a user research effort to help us understand the diverse needs of users and develop inclusive and accurate terminology? I am willing to contribute to this effort to ensure we create a comprehensive and effective approach.

Thank you again for your valuable input @LJWatson