Pain Point

One limitation of C_VoiceChat.SpeakText (https://warcraft.wiki.gg/wiki/API_C_VoiceChat.SpeakText) is that you can't change its behaviour: it simply queues all utterances.
This is fine for "static" outputs like chat, but it breaks down for "dynamic" outputs, such as a player rapidly scrolling through a menu with many items, where each menu item is read out to the player.
To implement accessible interfaces for the blind, such as the aforementioned "audio menu", we need SAPI to clear the queue before speaking. We also need SAPI to produce asynchronous output. There are many scenarios where queued output won't work. Think of a blind player entering a new zone and having the zone name read out while, at the same time, targeting a mob and having its name and health read out. With queued output only, the important and urgent mob name and health would be delayed until after the zone name, leading to an unplayable scenario. To make audio interfaces work for accessibility, asynchronous TTS output is a must.
Requested Change
Please add a sixth parameter to C_VoiceChat.SpeakText
C_VoiceChat.SpeakText(voiceID, text, destination, rate, volume, flags)
to pass flags to SpVoice.Speak.
To meet both requirements, we need to be able to provide flags for SpVoice.Speak.

SpVoice.Speak: https://learn.microsoft.com/en-us/previous-versions/windows/desktop/ms723609(v=vs.85)
Flags for the second parameter of SpVoice.Speak: https://learn.microsoft.com/en-us/previous-versions/windows/desktop/ms720892(v=vs.85)

The two important flags are:
- SVSFlagsAsync: the text is spoken asynchronously, so the call returns immediately.
- SVSFPurgeBeforeSpeak: all pending speech is purged before the new text is spoken.
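To make the request concrete, here is a minimal sketch of how an addon could use the proposed sixth parameter for an audio menu. This assumes the flag constants would also be exposed to Lua; the constant names and the SpeakMenuItem helper are hypothetical, only the numeric values match SAPI's SpeechVoiceSpeakFlags.

```lua
-- Hypothetical flag constants mirroring SAPI's SpeechVoiceSpeakFlags;
-- these do not exist in the WoW API today.
local SVSFlagsAsync        = 1  -- speak asynchronously, return immediately
local SVSFPurgeBeforeSpeak = 2  -- purge pending utterances before speaking

-- Read a menu item aloud, cutting off whatever is currently being spoken.
-- voiceID comes from C_VoiceChat.GetTtsVoices(); destination, rate and
-- volume are the same as for the existing SpeakText parameters.
local function SpeakMenuItem(voiceID, itemText)
    C_VoiceChat.SpeakText(
        voiceID,
        itemText,
        Enum.VoiceTtsDestination.LocalPlayback,
        0,    -- rate
        100,  -- volume
        bit.bor(SVSFlagsAsync, SVSFPurgeBeforeSpeak))  -- proposed 6th argument
end
```

With these flags, each call immediately replaces the previous utterance instead of queueing behind it, no matter how fast the player scrolls.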
Result

This gives developers the flexibility to implement really useful audio accessibility features for the blind in your game!
Additional comments
But what about C_VoiceChat.StopSpeakingText? Couldn't you use that? (https://warcraft.wiki.gg/wiki/API_C_VoiceChat.StopSpeakingText)
Unfortunately it's not suitable. StopSpeakingText has a noticeable delay that interferes with fast SpeakText calls.
Again, consider a scenario where a player uses the down arrow to scroll through a list of options, and each option is read out to the player. Blind players are incredibly used to working with audio outputs, and they will do this very, very, VERY fast. You wouldn't believe how fast. :)
If I call StopSpeakingText before calling SpeakText for the next output, this works as long as the player moves through the list slowly, say waiting 0.5 seconds before going to the next item. But once they get faster, the next SpeakText output starts to be skipped. Here is a short demo of the problem: https://youtu.be/z2WpJuD3iUs
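The fragile workaround described above looks roughly like this (a sketch; OnMenuItemSelected is a hypothetical addon handler):

```lua
-- Fragile workaround with the current API: stop, then speak.
-- When calls arrive quickly, the stop request can still be in flight
-- when SpeakText fires, and the new utterance gets swallowed too.
local function OnMenuItemSelected(voiceID, itemText)
    C_VoiceChat.StopSpeakingText()
    C_VoiceChat.SpeakText(
        voiceID,
        itemText,
        Enum.VoiceTtsDestination.LocalPlayback,
        0,    -- rate
        100)  -- volume
end
```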
This may be because StopSpeakingText isn't native and has to go all the way through the C side and the SpVoice API, causing a delay that makes the next SpeakText output get skipped. I don't know for sure. :/
So, what we need is a native way to clear the queue. And that native way is the SVSFPurgeBeforeSpeak flag.
Adding OS-specific arguments to WoW API functions is a terrible idea
I understand. :) But having a TTS with no asynchronous output and no reliable way to clear the queue is an even more terrible idea. :P Again, with all due respect, asynchronous output and reliably clearing the queue are not just nice to have; they are absolutely necessary for accessible interfaces.