libsdl-org / SDL

Simple Directmedia Layer
https://libsdl.org
zlib License

Expose accessibility events/callbacks #9351

Open DataTriny opened 7 months ago

DataTriny commented 7 months ago

Context

Applications which want to expose an accessibility tree must be able to respond to some OS-specific events/callbacks:

This feature is becoming increasingly important now that it is easier for applications which draw their widgets themselves to provide an accessibility tree by using AccessKit.

Requirements

Possible solution

Below is an attempt at solving this issue on Windows. Other platforms could use a similar approach as well.

typedef struct {
    // Invoked when the window procedure receives a WM_GETOBJECT message.
    LRESULT (*handleWMGetObject)(WPARAM, LPARAM);
} SDL_WindowsAccessibilityProvider;

// Used to create a new provider for the window designated by HWND.
// Called when the window procedure receives a WM_CREATE message, before the window is shown.
// The resulting provider would be owned by the SDL window.
typedef SDL_WindowsAccessibilityProvider* SDL_WindowsAccessibilityProviderFactory(HWND);

// Must be called before SDL_CreateWindow
void SDL_RegisterWindowsAccessibilityProviderFactory(SDL_WindowsAccessibilityProviderFactory factory);

A new SDL_WINDOW_ACCESSIBLE flag could be passed to SDL_CreateWindow to control whether it should use the accessibility provider factory.
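To make the intended flow more concrete, here is a rough sketch of how an application might plug into the draft above. Only the provider struct, the factory typedef and the registration function's signature come from the proposal. Everything else is an assumption made for illustration: the Win32 typedefs are stand-ins so the sketch compiles without <windows.h>, the body of the registration function and the `MockWindow` / `mock_*` helpers only imitate what SDL might do internally, and the `app_*` names and the value 1234 are placeholders.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins for Win32 types so the sketch builds without <windows.h>. */
typedef void *HWND;
typedef uintptr_t WPARAM;
typedef intptr_t LPARAM;
typedef intptr_t LRESULT;
#define WM_GETOBJECT 0x003D

/* Proposed types, copied from the draft. */
typedef struct {
    LRESULT (*handleWMGetObject)(WPARAM, LPARAM);
} SDL_WindowsAccessibilityProvider;

typedef SDL_WindowsAccessibilityProvider *SDL_WindowsAccessibilityProviderFactory(HWND);

/* ---- Mock of the SDL side (hypothetical, not real SDL code) ---- */
static SDL_WindowsAccessibilityProviderFactory *registered_factory = NULL;

void SDL_RegisterWindowsAccessibilityProviderFactory(SDL_WindowsAccessibilityProviderFactory factory)
{
    registered_factory = factory;
}

typedef struct {
    HWND hwnd;
    SDL_WindowsAccessibilityProvider *provider; /* owned by the window */
} MockWindow;

/* Imitates what the window procedure would do on WM_CREATE. */
void mock_on_wm_create(MockWindow *w, HWND hwnd)
{
    w->hwnd = hwnd;
    w->provider = registered_factory ? registered_factory(hwnd) : NULL;
}

/* Imitates the window procedure forwarding WM_GETOBJECT to the provider. */
LRESULT mock_window_proc(MockWindow *w, unsigned msg, WPARAM wp, LPARAM lp)
{
    if (msg == WM_GETOBJECT && w->provider && w->provider->handleWMGetObject) {
        return w->provider->handleWMGetObject(wp, lp);
    }
    return 0; /* stand-in for the default message handling */
}

/* The window owns the provider, so it releases it on destruction. */
void mock_destroy_window(MockWindow *w)
{
    free(w->provider);
    w->provider = NULL;
}

/* ---- Application side (hypothetical) ---- */
LRESULT app_handle_get_object(WPARAM wparam, LPARAM lparam)
{
    (void)wparam;
    (void)lparam;
    /* A real handler would return whatever the platform accessibility API
       produces for the root element; 1234 is only a placeholder. */
    return (LRESULT)1234;
}

SDL_WindowsAccessibilityProvider *app_factory(HWND hwnd)
{
    SDL_WindowsAccessibilityProvider *p = malloc(sizeof *p);
    (void)hwnd; /* a real factory would tie its root element to this window */
    if (p) {
        p->handleWMGetObject = app_handle_get_object;
    }
    return p;
}
```

The application would call SDL_RegisterWindowsAccessibilityProviderFactory(app_factory) once before creating its window. From then on, the value returned by its handler would be what the window procedure hands back to Windows for WM_GETOBJECT.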

Alternatives

The registration process could be simplified if it were possible to start the event loop before opening a window. The first time the system requests accessibility, an event could be pushed to the queue. Upon receiving this event, the application could create its accessibility provider and give it to SDL.

The winit windowing library is considering this approach, but it would be a big change to the SDL API.

Prior art

glazier is a cross-platform windowing library which allows applications to deal with accessibility-related events.

slouken commented 7 months ago

You can get access to Windows messages as SDL is processing them by using SDL_SetWindowsMessageHook(). You can get the NSView from a window by getting the NSWindow using the SDL_PROP_WINDOW_COCOA_WINDOW_POINTER property and, depending on how you're rendering graphics, you can either grab the first child view, or look through the child views for one with the tag specified by the SDL_PROP_WINDOW_COCOA_METAL_VIEW_TAG_NUMBER property.

I think you have access to the things you need to do this, but it's not in an easily accessible form (hahah) so if you'd like to create and contribute a sample program showing how to use accessibility with SDL, that would be great!

DataTriny commented 7 months ago

Thank you @slouken for the quick response. I am aware that it is already possible; in fact, we already have a working example in the AccessKit repository.

However, this requires very unsafe hacks on our side, such as Win32 subclassing and dynamically modifying the NSView. Due to how the accessibility stack works on these platforms, we also have to create the window hidden before we can initialize accessibility in the app.

BTW SDL_WindowsMessageHook is not useful to us since, to my knowledge, we can only return a boolean from it. However, Windows expects us to return pointers to accessibility-providing elements.

Since accessibility is forced into the message loop or root window object on these platforms, I think it is SDL's responsibility to pass along these events/calls to the app. We don't have such requirements on Linux for instance.

slouken commented 7 months ago

Feel free to put together a proposal for SDL 3.0.

DataTriny commented 7 months ago

@slouken I didn't find any instance of this, but I'm not intimately familiar with the SDL API. Are there already event types which expect a response from the application? If we could expose accessibility events as regular SDL events that would probably be more elegant.

DataTriny commented 7 months ago

@slouken I have updated the issue description with a "draft" of what such an API could look like. I'd appreciate it if you could tell me whether it is going in the right direction and whether it is something you would consider having.

Many thanks!

slouken commented 7 months ago

I'm having trouble imagining how this would work in a cross-platform way. Maybe it would make more sense to take an existing accessibility toolkit and add the hooks it would need to work properly?

DataTriny commented 7 months ago

We would have to create specific accessibility providers for each platform, as there is zero commonality between the accessibility stacks of the different platforms.

Maybe it would make more sense to take an existing accessibility toolkit and add the hooks it would need to work properly?

As I wrote above, this is currently not possible (especially for the "properly" part) because the "hooks" sent by the OSes are received by SDL and currently ignored.

Having cross-platform accessibility in SDL would mean adding a ton of UI-specific code that just doesn't belong here. Just look at AccessKit to understand the amount of work needed for this. Note that tying SDL to AccessKit would make things much easier for SDL users, as they would only have to deal with AccessKit's cross-platform accessibility concepts. The low-level specifics I am proposing to expose here could be completely hidden.

Therefore I am just trying to find a minimal API that would satisfy the constraints of platforms where there is an accessibility stack, without adding too much into SDL. Since this whole thing would be a mess, it could be isolated into its own subsystem.

slouken commented 7 months ago

I think this would best be done as a PR so people can take a look at it and try some real-world use cases. I'm not opposed to it, I just don't have the real-world experience with these systems to be able to give you an educated response.

So I guess the answer is a tentative yes? :)

I'd also like to really understand what kind of accessibility this enables and how games could make use of it.

DataTriny commented 7 months ago

Let's take an example to illustrate what accessibility means here.

kivy is a cross-platform UI toolkit which relies on SDL to get its window, get a rendering context and interact with input/output devices. It uses OpenGL ES 2.0 to show its widgets to the user.

This project would like applications built with this tool to be usable by as many people as possible, which includes people with disabilities. Because these applications currently only draw things on the screen, they can only be used by people who are able to see, for instance.

They could choose to provide an alternative experience for users with different needs, by supporting other output methods such as speech synthesizers and Braille displays, and other inputs such as voice recognition and eye tracking. They would need to find a way to ask users for their preferences and, ideally, store this configuration in a central place. This configuration would have to be very rich. For a speech synthesizer alone, it would have to contain the language, the type of voice, the speech rate, the intonation, the volume and so on. All of this would of course only be valid in apps built with kivy; other toolkits would have to come up with their own solutions.

Acknowledging this problem, the major operating systems each came up with their own solution. While technically very different, they are all based on the same concept of giving apps a way to describe their content in a rich format. This format is a semantic representation of what is currently present in the application. It describes elements by attaching semantic roles to them. These elements can be further described with a name, a description, a value and many more properties and attributes that convey their current state. This is what we call the accessibility tree, as elements can be linked together using various relation types such as parent/child, describes/is described by...
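As a rough illustration of the concept, an accessibility tree could be modeled along these lines. This is a minimal sketch and not the data model of AccessKit or of any platform API: the `A11yNode` type, the role names and the helper functions are all made up for this example, and only the parent/child relation is represented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical semantic roles; real platforms define many more. */
typedef enum { ROLE_WINDOW, ROLE_BUTTON, ROLE_LABEL, ROLE_SLIDER } A11yRole;

#define A11Y_MAX_CHILDREN 8

/* One element of the tree: a role plus descriptive properties. */
typedef struct A11yNode {
    A11yRole role;
    const char *name;        /* short label, e.g. "Volume" */
    const char *description; /* longer help text, may be NULL */
    const char *value;       /* current value, may be NULL */
    struct A11yNode *parent;
    struct A11yNode *children[A11Y_MAX_CHILDREN];
    size_t child_count;
} A11yNode;

/* Links child under parent. Returns 1 on success, 0 if parent is full. */
int a11y_add_child(A11yNode *parent, A11yNode *child)
{
    if (parent->child_count >= A11Y_MAX_CHILDREN) {
        return 0;
    }
    parent->children[parent->child_count++] = child;
    child->parent = parent;
    return 1;
}

/* Counts the node and all of its descendants. */
size_t a11y_count(const A11yNode *node)
{
    size_t total = 1;
    for (size_t i = 0; i < node->child_count; i++) {
        total += a11y_count(node->children[i]);
    }
    return total;
}

/* Depth-first search for the first node with the given name, or NULL. */
const A11yNode *a11y_find_by_name(const A11yNode *node, const char *name)
{
    if (node->name && strcmp(node->name, name) == 0) {
        return node;
    }
    for (size_t i = 0; i < node->child_count; i++) {
        const A11yNode *hit = a11y_find_by_name(node->children[i], name);
        if (hit) {
            return hit;
        }
    }
    return NULL;
}
```

A menu screen would then be a window node with, say, a "Play" button and a "Volume" slider as children, each carrying its own name and value.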

When an application behaves correctly by exposing this accessibility tree to the rest of the system, the tree can be queried by assistive technologies: software that people with disabilities use to operate their computer in unconventional ways. These assistive technologies can make sense of the accessibility tree and present it in a way that is useful to the user, taking their choices and preferences into consideration. The burden of serving each type of disability is shifted from the applications to the assistive technologies. Because the accessibility tree usually offers a way to be manipulated by external entities, assistive technologies such as voice control can perform actions on behalf of the user. Accessibility trees are also very useful to automation tools.
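To sketch the two directions described here, reading the tree and acting on it, consider the following. It is a simplified model under my own assumptions, not how any real assistive technology or platform API works: `A11yElement`, `announce`, `perform_activate` and `toggle_mute` are hypothetical names, and roles are plain strings to keep the example short.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical element exposed by the application. */
typedef struct {
    const char *role;                  /* e.g. "check box" */
    const char *name;                  /* e.g. "Mute" */
    const char *value;                 /* current value, may be NULL */
    void (*activate)(void *app_state); /* action offered to outside tools, may be NULL */
    void *app_state;                   /* passed back to the action */
} A11yElement;

/* What a screen reader might do: turn semantic data into an announcement.
   Returns the snprintf result (length of the full announcement). */
int announce(const A11yElement *e, char *out, size_t out_size)
{
    if (e->value) {
        return snprintf(out, out_size, "%s, %s, %s", e->name, e->role, e->value);
    }
    return snprintf(out, out_size, "%s, %s", e->name, e->role);
}

/* What voice control might do: trigger the element's action for the user.
   Returns 1 if an action was performed, 0 if the element offers none. */
int perform_activate(const A11yElement *e)
{
    if (!e->activate) {
        return 0;
    }
    e->activate(e->app_state);
    return 1;
}

/* Application-side action: flips a mute flag owned by the app. */
void toggle_mute(void *app_state)
{
    int *muted = app_state;
    *muted = !*muted;
}
```

The application only describes the element and offers the action. How it is announced (speech, Braille, something else) and what triggers the action (a voice command, a switch device, an automation script) is left entirely to the assistive technology.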

To tie it back to SDL's primary audience of game developers: there is a whole category of games called audio games, which most of the time don't display anything on the screen and rely only on audio for the gameplay. Because the infrastructure is not there, they are all forced to interact directly with speech synthesizers to convey information, or to link to APIs offered by some assistive technologies. These APIs are very limited in their abilities, and apps must support every assistive technology they expect their user base to use. Needless to say, the experience is not good.

From SDL's point of view, the accessibility tree is just a weird input/output device as well as a sort of renderer.

I hope this clears things up and helps you understand why such a feature is needed. I will try to prepare a PR to hopefully move forward with this.

slouken commented 7 months ago

Thank you for the clarification. I am looking forward to your PR.