Enteleform closed 11 months ago
Chrome sends a remote request for Google voices. Looks like Microsoft Edge is doing that, too. Notice the "Online" in the voice name.
I would suggest looking into the URL that is being requested; then you can make the request yourself. See https://github.com/guest271314/GoogleNetworkSpeechSynthesis.
Unless you are doing something like what is described here https://github.com/guest271314/SpeechSynthesisRecorder/issues/17 or here https://github.com/edisionnano/Screenshare-with-audio-on-Discord-with-Linux or here https://github.com/guest271314/captureSystemAudio#pulseaudio-module-remap-source, you are probably recording the microphone using SpeechSynthesisRecorder. See the pinned issues.
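For anyone who wants to check which voices are likely to trigger a remote request, here is a minimal sketch. It relies on the standard localService property of SpeechSynthesisVoice and, as a fallback, on the "Online" naming noted above. The function name partitionVoices is made up for illustration, and the name check is only a heuristic based on what was observed in this thread.

```javascript
// Split a voice list into voices synthesized locally and voices that
// appear to need a remote service.
// `localService` is a standard SpeechSynthesisVoice property; the /online/i
// name test is only a fallback heuristic, not documented browser behavior.
function partitionVoices(voices) {
  const local = [];
  const remote = [];
  for (const voice of voices) {
    const looksRemote =
      voice.localService === false || /online/i.test(voice.name);
    (looksRemote ? remote : local).push(voice);
  }
  return { local, remote };
}

// In a browser this would be called as:
//   const { local, remote } = partitionVoices(speechSynthesis.getVoices());
```

Note that getVoices() can return an empty list until the voiceschanged event has fired, so in a browser the call may need to wait for that event.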
Thanks for the info! For the project I was working on when I submitted the issue, I ended up using this:
https://github.com/Microsoft/cognitive-services-speech-sdk-js
This project collects data and sends it to Microsoft to help monitor our service performance and improve our products and services.
Doesn't sound appealing to me.
Are external requests being made for speech synthesis?
Yes. It requires an Azure account and API key to initiate requests via the JavaScript API.
It worked out pretty well for my use case. I ended up requiring precise pronunciation, which can be controlled via phoneme usage.
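To illustrate what controlling pronunciation via phonemes can look like, here is a rough sketch that builds an SSML string using the standard SSML phoneme element with an IPA transcription. The helper names (escapeXml, phoneme, buildSsml) and the voice name in the usage comment are assumptions for illustration, not taken from the project discussed here. The sketch only builds the string; it does not call the SDK.

```javascript
// Escape characters that are special in XML text and attribute values.
function escapeXml(text) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&apos;" };
  return text.replace(/[&<>"']/g, (ch) => map[ch]);
}

// Wrap a word in an SSML <phoneme> element with an IPA pronunciation.
function phoneme(text, ipa) {
  return `<phoneme alphabet="ipa" ph="${escapeXml(ipa)}">${escapeXml(text)}</phoneme>`;
}

// Build a complete SSML document. `body` is expected to already be
// valid SSML markup (for example, output of phoneme() mixed with
// escaped plain text), so it is inserted as-is.
function buildSsml(voiceName, body, lang = "en-US") {
  return (
    `<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="${escapeXml(lang)}">` +
    `<voice name="${escapeXml(voiceName)}">${body}</voice>` +
    `</speak>`
  );
}

// Usage (the voice name is an assumed example):
//   const ssml = buildSsml(
//     "en-US-JennyNeural",
//     "I say " + phoneme("tomato", "təˈmeɪtoʊ")
//   );
```

A string like this would then be handed to whatever SSML entry point the synthesis API offers, such as the SDK's speakSsmlAsync, if I recall the name correctly.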
I'm interested in local speech synthesis processing.
Have you asked Microsoft to release their speech synthesis engine to the public as FOSS?
This seems very unlikely since Azure is a significant source of revenue for Microsoft. They have a decent free tier though, so it works fine for personal projects.
Feel free to close this issue if you feel that it's out of scope for the project.
Since Chromium authors refuse to capture monitor devices with navigator.mediaDevices.getUserMedia(), we have to create an audio device that maps to the speakers or other device output and set that as an input device so we can capture the output with navigator.mediaDevices.getUserMedia().
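Once such a remapped device exists, selecting it from the page could look roughly like the sketch below. It uses the standard enumerateDevices() and getUserMedia() calls with a deviceId constraint. The function names and the "Virtual_Microphone" label in the usage comment are hypothetical; the actual label depends on how the device was created on your system.

```javascript
// Find an audio input device whose label contains the given text.
function findAudioInput(devices, labelFragment) {
  return devices.find(
    (d) => d.kind === "audioinput" && d.label.includes(labelFragment)
  );
}

// Request a stream from the remapped device.
// `mediaDevices` is passed in (navigator.mediaDevices in a browser) so the
// function can also be exercised with a stub object.
async function captureRemappedOutput(mediaDevices, labelFragment) {
  const devices = await mediaDevices.enumerateDevices();
  const device = findAudioInput(devices, labelFragment);
  if (!device) {
    throw new Error(`No audio input matching "${labelFragment}"`);
  }
  // `exact` makes the request fail instead of silently falling back
  // to the default microphone.
  return mediaDevices.getUserMedia({
    audio: { deviceId: { exact: device.deviceId } },
  });
}

// Usage in a browser (the label is a hypothetical example):
//   const stream = await captureRemappedOutput(
//     navigator.mediaDevices, "Virtual_Microphone");
```

Device labels are only populated after the page has been granted media permission, so a first getUserMedia({ audio: true }) call may be needed before the lookup by label can succeed.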
If you are requesting remote speech synthesis, you might as well bypass the middle-man and request the speech synthesis directly from the remote servers.
I would have archived this repository by now; however, it is possible to remap to a virtual device as detailed above.
I suspect Microsoft Edge is using an extension with a background HTML <audio> element to play the sounds, so your microphone is not catching the output.
I'll leave it up to you to close the issue.
When setting utteranceOptions.voice to a "Natural" voice, the resulting audio contains only silence. For example, these are the default voices that exist on an unconfigured installation of Microsoft Edge:
Microsoft Edge Voices
The first 3 voices record as expected, but none of the subsequent "Natural" voices are captured.
Is there an additional step that must be taken in order for these voices to be captured?