JamesBrill / react-speech-recognition

💬 Speech recognition for your React app
https://webspeechrecognition.com/
MIT License

Accessibility regarding react speech recognition for blind people #101

Closed crossLineApex closed 2 years ago

crossLineApex commented 3 years ago

I'm working on a train ticket booking website that will be user-friendly, with accessibility features, but my main target is making it highly accessible for blind people.

Is there a way for speech recognition to start as soon as the website loads and to work on each React component? To be precise, the site should be entirely voice controlled.

I'm planning to use a very simple UI so that screen readers and speech recognition work easily and precisely.

Using a button to start speech recognition would undermine the very level of accessibility I require.

Can you suggest something, or is there a way you might come up with something new in the future?

JamesBrill commented 3 years ago

Hi @crossLineApex this is an interesting use case and exactly the kind of thing I hoped this library would be used for.

A few comments:

- Unfortunately, most browsers don't give us the choice of starting the microphone without a user action (I think Chrome is currently an exception, but I wouldn't count on that remaining the case). I believe they throw an error if the web app attempts to take control of the microphone without the user doing anything. This is most likely to avoid privacy concerns - i.e. websites that listen to you without you giving permission.
- Actually, using a button would be completely normal for a screen reader user. I'm not sure they would feel any more comfortable with a website automatically listening to what they say without them giving permission. You could create something similar to the [skip to content](https://css-tricks.com/how-to-create-a-skip-to-content-link/) pattern, where the first element that a keyboard or screen reader user can focus is a button that enables speech recognition. You could put an [ARIA](https://developer.mozilla.org/en-US/docs/Web/Accessibility/ARIA) attribute on that button that reads something like "Click to enable voice control. Then say 'commands' to hear a list of voice commands."
- An entirely voice-controlled web app is actually a difficult problem to solve, so I'm really interested to see what you come up with. You could use the [SpeechSynthesis](https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesis) API to talk to the user, though I wonder if that would create too much noise for someone who is already listening to a screen reader. Perhaps well-written ARIA attributes are the way to go. You might also get some ideas from [this Chrome extension](https://www.handsfreeforweb.com/en/).
- Beware that it might actually be easier for a screen reader user to control the web app using their normal input methods than voice. Possible exceptions are complex queries like "I want a train from Doncaster to London on Sunday next week, any time after 3 o'clock", though at that point you probably need some NLP techniques to extract the required information without making mistakes that frustrate the user. One problem with voice input is that transcription mistakes are common, and the user will often need to provide corrections, which is especially difficult for someone limited to a screen reader. Think of those annoying robots you sometimes get when phoning a call centre - it can be a frustrating experience if it isn't designed well. So make sure the voice commands are simple and unambiguous, are easy to discover, and have a mechanism for correction. That mechanism might be as simple as asking the user to confirm commands before submitting them, or giving them opportunities to repeat themselves.

I would love to see the final results - perhaps they could help inform other web developers in designing voice-controlled web apps. It's a difficult problem and there isn't much research in this area yet, at least not for web.
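To make the "simple, unambiguous, discoverable commands" idea in this thread concrete, here is a minimal sketch of the command-list shape that react-speech-recognition's `useSpeechRecognition({ commands })` hook accepts, paired with a simplified stand-in matcher so it can run outside a browser. The phrases, page names, and the `matchCommand` helper are all illustrative assumptions, not part of the library; in a real app the library does the matching itself.

```javascript
// Commands in the shape react-speech-recognition expects: each entry
// pairs a phrase (with '*' capturing free text) with a callback.
// A real app would pass this array to useSpeechRecognition({ commands }).
const commands = [
  { command: 'go to *', callback: (page) => `navigate:${page}` },
  { command: 'read commands', callback: () => 'help' },
];

// Simplified, Node-runnable stand-in for the library's '*' splat matching:
// turns 'go to *' into /^go to (.+)$/i and fires the callback on a match.
function matchCommand(transcript, commandList) {
  for (const { command, callback } of commandList) {
    const pattern = new RegExp(
      '^' + command.split('*').map(escapeRegExp).join('(.+)') + '$',
      'i'
    );
    const match = transcript.trim().match(pattern);
    if (match) return callback(...match.slice(1));
  }
  return null; // no command recognised
}

// Escape regex metacharacters in the literal parts of a command phrase.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}
```

The real library's matching is richer (it supports options such as fuzzy matching on commands), so treat this matcher purely as an illustration of the data shape and the navigation-by-voice idea.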

crossLineApex commented 3 years ago

Hey James Brill,

Thank you so much for the insight - it couldn't have been better.

It's a hard thing, of course.

I have come up with a very simple UI for blind users as well as sighted users.

The first plan is to provide easy voice navigation on the website.

Next would be voice form input for destination and train status.

I want to do this to the best of my abilities. It may not be enough, but at least I would like to show other developers and communities that this is something to look at for tomorrow.

Thanks a lot again, I'll keep you posted.
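For the voice form input step, the confirm-before-submit mechanism suggested in this thread could be sketched as a tiny state machine, independent of any speech library. Everything here is an illustrative assumption: the field name `destination` and the prompt wording are invented, and in a real app the speech-recognition command callbacks would call `heard()` / `confirm()`, with the returned prompts read out via SpeechSynthesis or an ARIA live region.

```javascript
// Minimal confirm-before-submit flow for voice form input.
// States: 'idle' -> 'confirming' (awaiting yes/no) -> 'submitted'.
function createConfirmationFlow(onSubmit) {
  let state = 'idle';
  let pending = null;

  return {
    // Called when a value is transcribed, e.g. heard('destination', 'London').
    // Returns the prompt the app should read back to the user.
    heard(field, value) {
      pending = { field, value };
      state = 'confirming';
      return `You said ${value} for ${field}. Say yes to confirm or no to retry.`;
    },
    // Called when the user answers the confirmation prompt.
    confirm(answer) {
      if (state !== 'confirming') return 'Nothing to confirm.';
      if (/^yes$/i.test(answer.trim())) {
        state = 'submitted';
        onSubmit(pending);
        return `Confirmed ${pending.field}: ${pending.value}.`;
      }
      // Anything other than "yes" discards the value and lets the user retry,
      // which keeps transcription mistakes cheap to correct.
      state = 'idle';
      pending = null;
      return 'Okay, please repeat.';
    },
    getState: () => state,
  };
}
```

The design choice here follows the earlier advice: the user never submits a misheard value, and the correction path is just "say no and repeat", which stays simple for someone relying on a screen reader.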


JamesBrill commented 2 years ago

Closed due to inactivity.