w3c / aria-practices

WAI-ARIA Authoring Practices Guide (APG)
https://www.w3.org/wai/aria/apg/

Can you provide an example of a non-modal dialog, and clarify this paragraph? #2536

Open mbgower opened 1 year ago

mbgower commented 1 year ago

Our experience has been that most dialogs are modal, as in the example you provide in Dialog (Modal). However, the pattern contains the following passage:

Like non-modal dialogs, modal dialogs contain their tab sequence. That is, Tab and Shift + Tab do not move focus outside the dialog. However, unlike most non-modal dialogs, modal dialogs do not provide means for moving keyboard focus outside the dialog window without closing the dialog.

We do not understand how a user can move focus outside a dialog if they do not use Tab/Shift+Tab to do so, and our experience is that a non-modal component would not constrain the tab sequence -- in fact the ability to Tab out of a component seems to be the defining behaviour for whether or not it is modal.

Can you please provide an example of a non-modal dialog, and explain what is intended by this passage?

JAWS-test commented 1 year ago

I suspect this comes from the desktop applications. There, non-modal dialogs get focus with F6. In web applications this also exists, but rarely.

See also: https://gist.github.com/mcking65/11882ebbe2889964c62ab5a16ab528c3

mbgower commented 1 year ago

@JAWS-test is there a web example, constructed or in the wild, you can point to? Even on a desktop app, the use of F6 to skip between interaction points is not constrained to dialogs. It can apply to things that are not otherwise constrained within their own tab rings. But let's keep this focused on web.

So it sounds like when this says "contain" it means "constrain", as in the APG is saying that a non-modal dialog constrains the tab ring to the dialog. That's interesting, and leads to a lot more questions.

@mcking65 hoping you can provide some clarity on this.

JAWS-test commented 1 year ago

@mbgower In Microsoft Word (web version) there are 4 application areas (title bar, menu, text input area, footer) that are not reached with TAB. The four areas are not modal dialogs. They are reached with CTRL+F6. Each area has its own TAB loop.
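
A minimal sketch of the two-level navigation described above, assuming a purely hypothetical model in which a page is a list of regions and each region is a list of focusable item ids. The command names, `FocusPosition` and `moveFocus` are illustrative only; they are not taken from Word or from any specification, and mapping the commands to actual keys (TAB, CTRL+F6) is left to the application.

```typescript
// Hypothetical model: a page is a list of regions, each a list of focusable ids.
interface FocusPosition {
  region: number; // index of the region that currently holds focus
  item: number;   // index of the focused item inside that region
}

type NavCommand = "next-item" | "prev-item" | "next-region" | "prev-region";

// Wrap an index into the range [0, size).
function wrapIndex(index: number, size: number): number {
  return ((index % size) + size) % size;
}

// Item commands loop inside the current region (its own "TAB loop").
// Region commands jump to the first item of the neighbouring region.
// Assumes every region has at least one focusable item.
function moveFocus(
  regions: string[][],
  pos: FocusPosition,
  command: NavCommand
): FocusPosition {
  const current = regions[pos.region];
  if (current === undefined || current.length === 0) {
    return pos; // nothing to move to
  }
  if (command === "next-item") {
    return { region: pos.region, item: wrapIndex(pos.item + 1, current.length) };
  }
  if (command === "prev-item") {
    return { region: pos.region, item: wrapIndex(pos.item - 1, current.length) };
  }
  const step = command === "next-region" ? 1 : -1;
  return { region: wrapIndex(pos.region + step, regions.length), item: 0 };
}
```

In this model the item commands can never leave a region, which is the "contained tab sequence" reading of the APG passage; only the region commands cross a boundary.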

mbgower commented 1 year ago

The four areas are not modal dialogs

But neither are they non-modal dialogs, right?

Setting aside what such an interaction is, let's go back to a thing I was trying to understand.

If the position is that a dialog (whether modal or non-modal) constrains the tab order to itself, can you please explain what the cookie-setting regions that appear at the bottom of many homepages are called? The user can tab into and out of the area, so I'm assuming that by this definition they are not dialogs. What are they?

JAWS-test commented 1 year ago

Whether the page areas are non-modal dialogs can be debated. But what I think are non-modal dialogs in any case: Page areas in Word that are displayed during operation to specify the properties of an object (e.g. alternative text of a graphic). These page areas also have their own TAB loop and escape is only possible with F6 (desktop) or CTRL+F6 (web).

JAWS-test commented 1 year ago

can you please explain what the cookie-setting regions that appear at the bottom of many homepages are called?

It depends on how they are implemented:

mbgower commented 1 year ago

non-modal dialog, when navigating between page and cookie message with F6

That seems like an extremely constrained definition of what a non-modal dialog would be. Microsoft doesn't even document that in its keyboard material, only referring to Ctrl+F6 to move between the ribbon and the main document. And it seems operating system specific; it doesn't work on my Mac, for example.

It seems like a real problem reserving "non-modal dialog" to describe something that is very rare on the web, when there is such a commonplace pattern for content that is presented in a dialog-like way, that frequently takes focus yet doesn't constrain tab order.

There are plenty of examples of what I'm talking about in Office365. For instance, a 'dialog' offering a series of tips appears and gets focus on first opening the application. These have every appearance of a dialog/alert/notification. The example below shows a classic x close icon, some information, "2 of 5" and then Back and Next buttons. Until this week, I would have called those non-modal dialogs.

[image]

These are not constraining focus. I can tab into and out of them. They are very prominent in the tab order and obscure material behind them. At some point I interacted in some way with one (I've forgotten how now), and the identical-looking interaction was moved to a floating area in the lower right corner of the screen, and put in the tab order at the end of the entire DOM structure. Another unfortunate but typical implementation.

Maybe this is just identifying a hole in ARIA, but this kind of interaction is commonplace, and can have a lot of challenges to it, none of which is helped much by classifying it as a section, article or aside. The "dialog" role is VERY descriptive of many aspects of the interaction. I just had a look at the dialog guidance, as well as that for aria-modal, and I don't actually see anything in the spec that should prevent me from using a dialog role with aria-modal set to false for these interactions. Am I misreading the spec?

Authors SHOULD ensure that all dialogs (both modal and non-modal) have at least one focusable descendant element. Authors SHOULD focus an element in the modal dialog when it is displayed


Whether the page areas are non-modal dialogs can be debated.

Let's hear the opposing debate! I really want to understand what qualities of the ribbon or menu would justify giving it a role of dialog.

JAWS-test commented 1 year ago

My suggestion would be to simply remove the subsentence regarding non-modal dialogs. The pattern is about modal dialogs and for these it is true in any case that the TAB navigation is restricted to the dialog. For non-modal dialogs I think both are possible: restricted or unrestricted TAB navigation. As soon as there is a pattern for non-modal dialogs, everything else can be explained there.

JAWS-test commented 1 year ago

These are not constraining focus. I can tab into and out of them. They are very prominent in the tab order and obscure material behind them.

Just because Microsoft does it that way doesn't mean it's correct. I think, for example, that a non-modal dialog

Otherwise I have the problem that I can focus obscured elements with TAB, which is not useful.

mcking65 commented 1 year ago

@mbgower, @JAWS-test,

At TPAC, we gained consensus for my proposal to clarify this further in specs. We also discussed tab rings, and support for F6 on Windows and ctrl-F6 on Mac.

mbgower commented 1 year ago

Thanks, @mcking65. Ensuring the author gets those F6 keys will be a bit of a challenge (as opposed to the browser or OS taking them), but otherwise, this all looks good. It might help to have a summary comment added to your proposal, for those who don't want to read through it all to get the gist :) As with my comments on visual affordances at TPAC and elsewhere, recommending some differentiators to help everyone understand the nature of a window's boundary will help keep the experience consistent programmatically and visually. In the interim, I would like to advocate what @JAWS-test suggested as a near-term 'solution': remove the non-modal text from the Modal dialog pattern.

mcking65 commented 1 year ago

@mbgower wrote:

Thanks, @mcking65. Ensuring the author gets those F6 keys will be a bit of a challenge (as opposed to the browser or OS taking them), but otherwise, this all looks good.

We agreed that browsers should support the dialog element for non-modal dialogs with F6. That doesn't mean you couldn't make a non-modal without the dialog element, but you might want to do so only in special cases where the app provides other affordances for moving in/out of the dialog.
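
For reference, a minimal sketch of how the HTML dialog element already separates the two cases: `show()` opens it non-modally and `showModal()` opens it modally. `DialogLike` and `openDialogAs` are hypothetical names; the interface only mirrors those two `HTMLDialogElement` methods so that the sketch runs without a browser. It does not implement the F6 behaviour discussed here.

```typescript
// Structural stand-in for the two HTMLDialogElement methods used below.
interface DialogLike {
  show(): void;      // opens the dialog non-modally; the page stays interactive
  showModal(): void; // opens the dialog modally; the rest of the document is inert
}

type Modality = "modal" | "non-modal";

// Open a dialog with the requested modality.
function openDialogAs(dialog: DialogLike, modality: Modality): void {
  if (modality === "modal") {
    dialog.showModal();
  } else {
    dialog.show();
  }
}
```

In a browser, an `HTMLDialogElement` obtained from the page could be passed as the `dialog` argument, since it provides both methods.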

@mbgower wrote:

In the interim, I would like to advocate what @JAWS-test suggested as a near-term 'solution': remove the non-modal text from the Modal dialog pattern.

What problem would removing the referenced non-modal text solve? Removing the referenced text could be a source of additional problems. Because we haven't yet added a non-modal pattern and example, that text at least provides minimal guidance about non-modals. This is especially important now that we are finally inching toward standardization of non-modals on the web.

mbgower commented 1 year ago

@mcking65

What problem would removing the referenced non-modal text solve?

I reference that in the PR, but here it is again, in @JAWS-test's words:

My suggestion would be to simply remove the subsentence regarding non-modal dialogs. The pattern is about modal dialogs and for these it is true in any case that the TAB navigation is restricted to the dialog. For non-modal dialogs I think both are possible: restricted or unrestricted TAB navigation. As soon as there is a pattern for non-modal dialogs, everything else can be explained there.

As per the conversation in the issue, it is unclear exactly what the existing text means. It is unhelpful. Once some non-modal guidance exists, the modal dialog pattern can cross reference it and users will have enough context to understand non-modal dialogs and the intended implementation, etc.

Removing the referenced text could be a source of additional problems.

In what way? This is a document on the modal dialog pattern. The change in text does not remove any information on how to construct a modal dialog, and in fact slightly clarifies expected behaviour.

mcking65 commented 1 year ago

I don't yet have a clear understanding of the goal of the changes. We need to be aligned on the problem before settling on a solution.

The main branch is frozen until we merge #2417, which I expect to happen between December 1 and 10. Once that is merged, the aria-practices.html file will no longer exist. If we were to make changes to the modal dialog pattern, they would be made in /content/patterns/dialog-modal/dialog-modal-pattern.html.

@mbgower wrote:

@mcking65

What problem would removing the referenced non-modal text solve?

I reference that in the PR, but here it is again, in @JAWS-test's words:

My suggestion would be to simply remove the subsentence regarding non-modal dialogs. The pattern is about modal dialogs and for these it is true in any case that the TAB navigation is restricted to the dialog. For non-modal dialogs I think both are possible: restricted or unrestricted TAB navigation. As soon as there is a pattern for non-modal dialogs, everything else can be explained there.

A dialog, whether modal or non-modal, needs to contain its own tab ring. Otherwise, it is just another section of the page. The dialog role is a subclass of window.

As per the conversation in the issue, it is unclear exactly what the existing text means.

Please specify which text is ambiguous.

mbgower commented 1 year ago

A dialog, whether modal or non-modal, needs to contain its own tab ring. Otherwise, it is just another section of the page. The dialog role is a subclass of window.

First, my (poor) understanding is that from the point of view of the web, a dialog IS to some degree just another section of the page, and there is a lot of wizardry going on in the background to keep the rest of the page out of reach. The wording in aria-modal seems to support this.

When a modal element is displayed, authors SHOULD mark all other contents as inert (such as "inert subtrees" in HTML) if the ability to do so exists in the host language.

In other words, it's not a window in the way a new browser window is.
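
A rough sketch of the aria-modal advice quoted above, assuming the host language exposes a boolean `inert` property (as `HTMLElement.inert` does in HTML). `InertCapable` and `setBackgroundInert` are hypothetical names, and treating the dialog as one of a flat list of top-level siblings is an assumption about page structure.

```typescript
// Anything with a boolean `inert` flag, for example an HTMLElement.
interface InertCapable {
  inert: boolean;
}

// Set or clear `inert` on every top-level element except the dialog itself.
function setBackgroundInert<T extends InertCapable>(
  topLevel: readonly T[],
  dialogRoot: T,
  inert: boolean
): void {
  for (const element of topLevel) {
    if (element !== dialogRoot) {
      element.inert = inert;
    }
  }
}
```

Calling it with `true` when the modal opens and `false` when it closes keeps the rest of the page out of reach for both keyboard and pointer, which is the "wizardry" referred to above rather than a separate window.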

Second, a number of folks I interact with (the majority?) have been using "non-modal" to describe things that look like dialogs but do not constrain the keyboard (or pointer) to the window. I need to emphasize that these things visually present as dialogs. One only understands whether it is 'modal' based on whether one is constrained in the 'window'. In almost every case, the test for whether it is modal is whether one can tab out of it. The F6, etc., navigation in the prior discussion is RARE -- to the point where I didn't know what the following passage meant about non-modals not constraining the keyboard BUT disallowing Tab navigation out of the dialog:

However, unlike most non-modal dialogs, modal dialogs do not provide means for moving keyboard focus outside the dialog window without closing the dialog.

Another important thing to point out is that for the pointer-biased designers out there, it's obvious when EVERYONE is constrained to a dialog. But there are a lot of interactions where clicking outside the 'dialog' can dismiss it, or even put focus elsewhere; but there may be no keyboard equivalent to that 'outside click'. So maybe what we need is a different pattern to describe such a component. They take focus, and then allow the keyboard (via TAB) to depart to content outside them (and with lightbox effects, this often visually looks like it's 'behind' them). This is a very common pattern, and it has obvious accessibility problems. It needs to be addressed. If they disappeared on loss of focus, a great deal of the accessibility problem would cease to exist. If you don't want to call this a non-modal dialog, we need something to call it, and a pattern to describe how to handle it.

There is also a pattern where something that visually presents like a separate window doesn't take focus, but the user CAN tab to it (hopefully). Such components do not interrupt user activity, but are there to be entered into and discovered. This interaction is relatively common for non-disruptive notifications. I suspect that some term that uses "notification" might be an idea for a pattern covering both the one that takes focus and the one that does not. Obviously alert can be used in concert with the one that takes focus, but it doesn't resolve the 'departure' from the notification 'window'.

mbgower commented 1 year ago

@mcking65 I'm not sure if you need evidence of what I'm saying above, but I see examples of clicking outside the modal window to dismiss it all the time. Here's one from Bootstrap:

Click the button to launch the modal. Then click on the backdrop, close icon or close button to close the modal.

A search on "dismiss modal by clicking outside" returned 14 million hits.

The APG doesn't really talk about pointer interaction much. I assume your stance is that one should NOT be able to click outside a modal to dismiss it? That seems to set up an equivalent user experience. But especially if no lightbox effect is employed (shading inert content) some users are going to be very confused when they are trying to click elsewhere on the screen and nothing is happening. (I can provide a live example in box where I get caught on this frequently.) So this is yet another example where if APG wants to bring about consistent operation across input mechanisms, we're going to have to provide guidance not just on keyboard interaction, but on presentation and pointer interaction. Otherwise, the trend is clear. More of these kinds of overlays including 'modals' are going to be getting dismissed by clicking 'away'. There are unfortunately a lot of ways keyboard users may encounter an unequivalent, likely poorer experience.

mbgower commented 1 year ago

This description of modal versus non-modal on Nielsen Norman represents what I think is a midway point in a few trends:

A popup (also known as an overlay or popover) is a window or dialog that appears on top of the page content. A popup can be classified according to two dimensions:

  1. Whether the user can interact with the rest of the page:
    • Modal: the content on the page is disabled until the user explicitly interacts with the overlay.
    • Nonmodal: users can still interact with the background content (for example, by selecting links or tapping buttons) while the overlay remains visible.
  2. Whether the background is dimmed: If the background is dimmed, the popup is called a lightbox. There is no special name for the case when the background content is not visually dimmed.

Although in many cases lightboxes are modal, that is not always true.

So, the initially described modal experience seems to align. However, the non-modal description implies that a popover does not constrain interaction but does persist, even when exiting. It is at least conceptually possible to have this model align with current APG guidance but only by forcing keyboard users to discover and press F6 or an equivalent. I don't see how that is going to be propagated gracefully, and I anticipate some unintended consequences.

Notice the author is not just talking about 'dialogs' here, but "popups", which encompass a variety of onscreen notices. The ease with which the author encompasses the two concepts speaks to a challenge in this space. Another big challenge of course exists when you get back to my prior posting about dismissing modals by clicking outside. That interaction is extremely hard to align with that described for non-modals here by NNG, where the popover persists regardless of user interaction outside of it.

The author does talk about "small, nonmodal overlays to communicate about these elements." So maybe we just need to agree on what that means (behaviour, appearance), what role it should get, and if "nonmodal" means something different here than when specifically used for dialogs.

It may be easier to have a conversation about this. It's a big space. I returned to it today because I discovered a modal in a design system I'm testing that is dismissed by clicking outside, and at this point I honestly don't know whether I should tell them that is a failure. The keyboard is restricted properly.

JAWS-test commented 1 year ago

I'm testing that is dismissed by clicking outside, and at this point I honestly don't know whether I should tell them that is a failure. The keyboard is restricted properly.

Why would that be a failure? I can't find anything in WCAG that prohibits this behavior.

I think with modal dialogs both are possible (in terms of outside clicking), as can be seen on the page https://mdbootstrap.com/how-to/bootstrap/modal-close/. Today it opens with a Black Friday modal which can't be closed by clicking outside, but the actual example can be closed that way.

mbgower commented 1 year ago

@JAWS-test

Why would that be a failure?

Let me tackle this from a technical/philosophical perspective and then from the viewpoint of outcomes.

Technical

Here's key information from the APG:

Windows under a modal dialog are inert. That is, users cannot interact with content outside an active dialog window

If it's inert, then how is a user able to click on it and have a result? Nothing should happen.

I can't find anything in WCAG that prohibits this behavior.

I think that's mixing apples and oranges. The APG guidance is not normative, so anything it stipulates is tough to argue as a failure of WCAG. So if a modal doesn't trap the keyboard either, I'm not sure you could say that was a failure of WCAG either. But if I'm trying to have a design system have the best interaction, I'm going to have to point to the APG, and so I want that guidance to be consistent and stable.

Maybe saying "fail" was a poor use of language on my part. "Reject"/"flag" might be better. I have a strong enough relationship with this team that I can advocate for following APG guidance to get to a more delightful/consistent experience, without worrying only about pure WCAG failures.

I assume we may be able to start digging into the aria-in-html material, but that's beside the point of this issue (which is opened against the apg). I just want the guidance to be as robust as possible.

Outcomes

And that leads me to the other part of the discussion. If I allow the pointer to trigger outside the modal, I am creating a conflict between pointer interaction (click wherever the hell you want) and a keyboard interaction (you cannot do anything until you interact with this modal). Once we allow this unequal experience at such a fundamental interaction level, what hope do we have of equivalent interaction with the more nuanced considerations around interaction for non-modal dialog or a whole bunch of non-dialog overlay interactions hinted at in this issue?

JAWS-test commented 1 year ago

So if a modal doesn't trap the keyboard either, I'm not sure you could say that was a failure of WCAG either.

If the focus is not limited to the modal dialog, you would even violate 2 WCAG SC:

Apart from that - I think - the APG rules should be based on the default behavior: in HTML there is the dialog element and inert is also defined in the HTML specification

And that leads me to the other part of the discussion. If I allow the pointer to trigger outside the modal, I am creating a conflict between pointer interaction (click wherever the hell you want) and a keyboard interaction (you cannot do anything until you interact with this modal).

I think when click outside the modal dialog closes the dialog, the mouse and keyboard users have the same experience:

I think further discussion can be had at https://open-ui.org/components/dialog.research

mbgower commented 1 year ago

If the focus is not limited to the modal dialog, you would even violate 2 WCAG SC

Neither 2.4.3 nor 2.4.7 received consensus in AGWG discussion as grounds for failing such interactions. For the former, there is no requirement that only interactive elements get focus. For the latter, even if something obscures focus, there is still a 'mode of operation' that allows the user to bring it into focus. That is why Focus Not Obscured is part of WCAG 2.2. That will certainly give an ability to fail a lot of non-modal interactions that persist after losing focus.


In answer to your prior question, 'why should clicking outside to dismiss be a failure?', HTML states that as part of 'inert'

Hit-testing must act as if the 'pointer-events' CSS property were set to 'none'.

Does that not rule out the ability to 'click outside' to dismiss?


keyboard users also don't have to click the close button, but can press the ESC key

ESC may be the only way of dismissing something by keyboard (and that's fine). There are close functions where the 'X' is representative of close (and can be clicked by a mouse user) but the X doesn't need to take keyboard focus. It's not a failure because the user can issue Esc to close (or depending on context Delete, to remove).

mbgower commented 1 year ago

Also in answer to your prior question, 'why should clicking outside to dismiss be a failure?', modal dialog says a defining quality of a modal is that:

Application code prevents all users from interacting in any way with content outside of it.

Again, I would interpret that to mean that clicking outside the dialog should do nothing, not that it would dismiss the dialog.

JAWS-test commented 1 year ago

Application code prevents all users from interacting in any way with content outside of it.

This is also how the dialog element of HTML works when it is shown as a modal dialog.

However, I doubt whether closing the dialog by clicking on an area outside is considered a violation, because the elements outside cannot actually be operated, but only the dialog itself

mbgower commented 1 year ago

So clicking in an area outside the dialog and triggering a result is not "interacting in any way"?

JAWS-test commented 1 year ago

Yes, because you are not "interacting ... with the content outside of it"

Rather, the interaction takes place with the layer that the pop up places over the rest of the page. The layer can be recognized by the fact that it greys out the rest of the page. In fact, often a div element is used for this purpose, which logically belongs to the pop up and not to the page. The div is removed as soon as the pop up is closed.
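
A minimal sketch of that reading, assuming the pop-up owns a separate backdrop layer: the click is acted on only when its target is the backdrop itself, never an element of the page. `OverlayParts` and `classifyOverlayClick` are hypothetical names, and whether dismissing on a backdrop click is appropriate for a modal is exactly what is being debated here.

```typescript
// The two layers that belong to the pop-up in this hypothetical model.
interface OverlayParts {
  backdrop: object; // the greyed-out layer placed over the rest of the page
  panel: object;    // the visible dialog box
}

type ClickOutcome = "dismiss" | "ignore";

// Only a click whose target is the backdrop layer dismisses the pop-up.
// Clicks on the panel, or on anything else, are left alone.
function classifyOverlayClick(target: object | null, overlay: OverlayParts): ClickOutcome {
  return target === overlay.backdrop ? "dismiss" : "ignore";
}
```

In a browser, `target` would correspond to the `target` of the click event. Because the backdrop is part of the pop-up, the handler never operates page content.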

mbgower commented 1 year ago

Here are a few problems this poses as a user interaction:

  1. There is no requirement for the lightbox effect you mention
  2. Even with the lightbox effect, a user can mistakenly dismiss an important dialog (say they are in the process of clicking somewhere else when the lightbox is triggered and their click mistakenly clears the dialog)
  3. Most crucially, it leads to a blurring of interaction equivalency between pointer and keyboard users.

Where modals require direct interaction by the mouse user (and not with the inert background) the result is it is much easier to discover modals that have failed to implement the inert behaviour properly -- a simple test click outside confirms it is inert. This significantly reduces the bigger problem with poor implementation, where the keyboard is not properly constrained and can tab out of the dialog and into the background (and often can't realize it because the background is obscured by the lightbox effect). Where clicking outside dismisses, the pointer user never encounters this poor experience, because the moment they interact with anything outside the box, it is dismissed. The equivalent for a keyboard user would be that the moment their focus leaves the notification (I won't call it a modal in such a situation), it disappears.
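
The keyboard equivalent described above could be sketched as follows. `ContainerLike` and `shouldDismissOnFocusOut` are hypothetical names. In a browser, `nextFocused` would correspond to the `relatedTarget` of a `focusout` event. Keeping the notification open when nothing receives focus is an assumption of this sketch, not something the APG prescribes.

```typescript
// Minimal stand-in for an element that can tell whether it contains another.
interface ContainerLike {
  contains(other: object | null): boolean;
}

// Decide whether the notification should disappear when focus moves.
function shouldDismissOnFocusOut(container: ContainerLike, nextFocused: object | null): boolean {
  if (nextFocused === null) {
    // Assumption: focus left the document (for example, a window switch); stay open.
    return false;
  }
  // Dismiss as soon as focus lands on something outside the notification.
  return !container.contains(nextFocused);
}
```

With this rule a keyboard user who tabs out of the notification gets the same result as a pointer user who clicks outside it.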

JAWS-test commented 1 year ago

I have lost the thread and do not even know what we are discussing here. What you are criticizing: Is it somewhere in ARIA APG?

I think there are two different forms of modal dialogues:

In the APG there is a modal dialog where click outside does not close the dialog (because it is a real modal dialog). There is also a second modal dialog where click outside closes the dialog and can also interact with the page in the background (is not inert). I think the example is wrongly classified - it should not be called a modal dialog (a datepicker is not a modal dialog - a listbox opened by a combobox or a menu opened by a menu button are also not modal dialogs).

mbgower commented 1 year ago

I have lost the thread and do not even know what we are discussing here.

Understandable, it's been a meandering conversation, albeit related in general to questions about the appearance, interaction and behaviour of modal dialogs -- and other interactions which are similar.

At heart, this issue is about the question: what constitutes a modal dialog, and what qualities in particular distinguish modal from other similar interactions, including a non-modal dialog? My feeling is that both keyboard and pointer interaction, as well as appearance and behaviour, need to be included in that question.

I could try to make a cleaner issue, but maybe this is simply too broad a topic for an issue.

a11ydoer commented 1 year ago

@mbgower would you like to create another issue, the one you called "a cleaner issue", for this topic? Otherwise, we can close the issue.

patrickhlauke commented 1 year ago

very loosely, x-ref https://github.com/w3c/aria-practices/issues/599

also, while we're here, it may also be worth mentioning that non-modal dialogs (assuming they can float over a page that is in fact still active/usable/operable) will likely cause grief under the new 2.4.12 Focus Not Obscured (Minimum) / 2.4.13 Focus Not Obscured (Enhanced) SCs