WICG / speech-api

Web Speech API
https://wicg.github.io/speech-api/

getVoices() is supposed to be user agent dependent, but appears not to be. #113

Closed justwlocke closed 8 months ago

justwlocke commented 8 months ago

Pardon me if this is not the right place to bring this up, but I got here from this documentation, which itself was linked from this MDN page about the Web Speech API.

I've been trying to use the API to implement TTS on a site, and for the most part it is working fine. The issue I ran into is that different browsers and different platforms all supply different voices. Of course, this makes perfect sense: why would my iPhone grab Microsoft voices? But I still had a feeling that there must be some way to get those remote voices no matter where I was, and I found a part of the documentation that backed up that feeling: "It is user agent dependent which voices are available." Great! I know you can change or override user agents, so I tried to test it out on my laptop.

I am on a laptop running Windows 10, and the two browsers I am using are Edge and Firefox. By default, when I call getVoices() in Edge, I get the standard Windows OS voices as well as a large number of remote "Microsoft" voices to choose from. In contrast, Firefox only gets the OS voices. When I went into Firefox's settings and overrode the user agent with the exact same user agent string as my Edge browser, I expected to see the Microsoft voices appear as well. However, they did not appear. Here is the console in Edge: [image] And here is the console in Firefox (with the user agent copied from Edge and set in about:config): [image]
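To compare what each browser reports, a minimal sketch along these lines may help. It relies only on the `name`, `lang`, and `localService` attributes that the spec defines on `SpeechSynthesisVoice`; the `ListedVoice` interface and the `summarizeVoices` helper are hypothetical names introduced for illustration, not part of the API.

```typescript
// Hypothetical minimal shape; browser SpeechSynthesisVoice objects satisfy it.
interface ListedVoice {
  name: string;
  lang: string;
  localService: boolean;
}

// Hypothetical helper: split voices into those served locally by the
// browser/OS and those backed by a remote service.
function summarizeVoices(voices: ListedVoice[]): { local: string[]; remote: string[] } {
  const local: string[] = [];
  const remote: string[] = [];
  for (const voice of voices) {
    const label = `${voice.name} (${voice.lang})`;
    if (voice.localService) {
      local.push(label);
    } else {
      remote.push(label);
    }
  }
  return { local, remote };
}

// In a browser console:
//   console.log(summarizeVoices(speechSynthesis.getVoices()));
```

In some browsers `getVoices()` can return an empty list until the `voiceschanged` event has fired, so it may be necessary to run the comparison again after that event.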

If getVoices() is truly user agent dependent, then I would expect to see the same number of voices appear, even on different browsers. Is it an issue with the documentation? Is it an issue with the Web Speech API? Or am I just understanding this incorrectly? Any information would be helpful.

foolip commented 8 months ago

Hi @justwlocke!

This is the place where https://wicg.github.io/speech-api/ is maintained, and the folks working on Edge and Firefox most likely won't see your question here.

When the spec says "It is user agent dependent which voices are available", that means it depends on the browser as a whole, not on the User-Agent HTTP header. Changing Firefox to send the user agent string of Edge won't change which voices it has available.
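Since the set of voices depends on the browser, one way to cope is to treat a specific voice name as a preference and fall back to a language match. The following is only a sketch under that assumption: `SelectableVoice`, `firstMatch`, `primaryLang`, and `pickVoice` are hypothetical names, and only the `name`, `lang`, and `default` attributes come from `SpeechSynthesisVoice`.

```typescript
// Hypothetical minimal shape; browser SpeechSynthesisVoice objects satisfy it.
interface SelectableVoice {
  name: string;
  lang: string;
  default: boolean;
}

// Return the first item that passes the test, or undefined.
function firstMatch<T>(items: T[], test: (item: T) => boolean): T | undefined {
  for (const item of items) {
    if (test(item)) return item;
  }
  return undefined;
}

// "en-US" -> "en"; tolerates "_" as a separator as well.
function primaryLang(tag: string): string {
  return tag.toLowerCase().replace(/[-_].*$/, "");
}

// Try the preferred names in order, then any voice in the same language,
// then the browser's default voice, then whatever comes first.
function pickVoice<T extends SelectableVoice>(
  voices: T[],
  preferredNames: string[],
  lang: string
): T | undefined {
  for (const name of preferredNames) {
    const named = firstMatch(voices, (v) => v.name === name);
    if (named) return named;
  }
  const wanted = primaryLang(lang);
  return (
    firstMatch(voices, (v) => primaryLang(v.lang) === wanted) ||
    firstMatch(voices, (v) => v.default) ||
    voices[0]
  );
}
```

In a browser, the result could be assigned to the `voice` attribute of a `SpeechSynthesisUtterance` before calling `speechSynthesis.speak()`. An `undefined` result means the browser reported no voices at all.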

I'll close this issue.