Open guest271314 opened 3 years ago
What would the use case be? I think having a screen reader is enough.
The use case is to use espeak-ng in the browser. Specifically, the capability to
- pipe espeak-ng --stdout to the browser and play back the audio in "real-time"
- get espeak-ng output as a MediaStreamTrack, for the ability to send the live stream via WebRTC

I have already written the code. I am just asking if there is interest in the ability to achieve the list items in the previous post in the browser.
@jaacoppi Use case references:
@guest271314, you may provide a pull request for review of changes to eSpeak NG. However, from the use cases you mentioned, I suspect this should be part of a different project that implements the Web Speech API as an adapter for eSpeak NG, not part of eSpeak NG itself. A possible starting point is reviewing, e.g., the eSpeak.js project. The rationale is that eSpeak NG already provides an API (which may be extended and/or improved, if necessary), but adding more APIs for different programming languages, frameworks and platforms shouldn't be part of the core project, to avoid feature creep and unneeded complexity.
I have been filing specification and implementation issues regarding speech synthesis for several years now, too many to list here. A brief summary is at https://github.com/guest271314/captureSystemAudio#references; most are closed, and none have been officially fixed.
The approach I employ in the PR that I will file within the next few days is to not change eSpeak NG at all. I use Native Messaging to start a local server, use fetch() to pass the espeak-ng command, with the --stdout option set, to PHP passthru(), parse the streamed WAV file, and write to the WritableStream side of a MediaStreamTrackGenerator, which provides a MediaStreamTrack representation of the live stream. I am currently making minor adjustments to the pattern described at https://github.com/guest271314/NativeTransferableStreams/blob/web_accessible_resources/Explainer.md. When HTTP/3 over WebTransport is implemented (https://github.com/aiortc/aioquic/issues/163) I will also include a version using that API, e.g., https://github.com/guest271314/webtransport/blob/main/webTransportEspeakNg.js.
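The "parse streamed WAV file" step could look roughly like the sketch below. It is not the code from the PR. It assumes the stream starts with a canonical 44-byte RIFF/WAVE header followed by 16-bit little-endian PCM, and the names parseWavHeader and createPcm16Decoder are hypothetical.

```javascript
// Sketch only: assumes a canonical 44-byte RIFF/WAVE PCM header.
// parseWavHeader and createPcm16Decoder are hypothetical helper names.
function parseWavHeader(bytes) {
  if (bytes.byteLength < 44) throw new RangeError("need at least 44 bytes");
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const tag = (o) =>
    String.fromCharCode(bytes[o], bytes[o + 1], bytes[o + 2], bytes[o + 3]);
  if (tag(0) !== "RIFF" || tag(8) !== "WAVE" ||
      tag(12) !== "fmt " || tag(36) !== "data") {
    throw new Error("not a canonical 44-byte WAV header");
  }
  return {
    format: view.getUint16(20, true), // 1 means integer PCM
    channels: view.getUint16(22, true),
    sampleRate: view.getUint32(24, true),
    bitsPerSample: view.getUint16(34, true),
    dataOffset: 44, // PCM samples start right after the header
  };
}

// Returns a function that converts streamed Uint8Array chunks of
// 16-bit little-endian PCM into Float32Array samples in [-1, 1).
// Network chunks can end in the middle of a sample, so a dangling
// byte is carried over to the next call.
function createPcm16Decoder() {
  let carry = null;
  return function decode(chunk) {
    let bytes = chunk;
    if (carry !== null) {
      bytes = new Uint8Array(chunk.length + 1);
      bytes[0] = carry;
      bytes.set(chunk, 1);
      carry = null;
    }
    if (bytes.length % 2 === 1) {
      carry = bytes[bytes.length - 1];
      bytes = bytes.subarray(0, bytes.length - 1);
    }
    const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    const out = new Float32Array(bytes.length / 2);
    for (let i = 0; i < out.length; i++) {
      out[i] = view.getInt16(i * 2, true) / 32768;
    }
    return out;
  };
}
```

In the browser, each decoded Float32Array could then be wrapped in a WebCodecs AudioData object and written to the writable side of the generator; that step is left out here because it only runs in a browser.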
Prior art https://github.com/guest271314/native-messaging-espeak-ng which downloads and builds eSpeak NG from this repository https://github.com/guest271314/native-messaging-espeak-ng/blob/ae6bbd087733d805e6baba4a35cfb695f03042f3/host/install_host.sh#L31. Chrome Apps are now deprecated.
There may be dozens of projects that use eSpeak or eSpeak NG. The projects you mentioned are probably a much better place to start implementing the functionality you need. I still doubt that this functionality is needed in the eSpeak NG project itself.
Again, the browser extension does not change eSpeak NG itself, the extension simply provides a means to use eSpeak NG in the browser - specifically a means to get the raw output as bytes, and as a MediaStreamTrack which can be used with WebRTC. The Emscripten port in this library does not parse SSML https://github.com/espeak-ng/espeak-ng/issues/736, and still uses the deprecated script processor https://github.com/espeak-ng/espeak-ng/blob/master/emscripten/js/demo.js#L26. I will also create a version for Firefox to use AudioWorklet instead of MediaStreamTrackGenerator, which is currently only supported at Chromium/Chrome.
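For the AudioWorklet variant, one piece that is likely needed is a queue that turns variable-length decoded chunks into the fixed-size blocks the audio thread asks for. The sketch below is an illustration under assumptions, not the planned Firefox code: it handles mono Float32Array samples only, and SampleQueue is a hypothetical name.

```javascript
// Sketch only: SampleQueue is a hypothetical helper for mono audio.
// It buffers variable-length Float32Array chunks and hands them out
// in fixed-size blocks, padding with silence on underrun.
class SampleQueue {
  constructor() {
    this.chunks = []; // pending Float32Array chunks, oldest first
    this.offset = 0;  // read position inside the oldest chunk
  }

  push(samples) {
    if (samples.length > 0) this.chunks.push(samples);
  }

  // Fills `block` completely and returns how many queued samples
  // were copied; the remainder of the block is zeroed.
  pull(block) {
    let written = 0;
    while (written < block.length && this.chunks.length > 0) {
      const head = this.chunks[0];
      const n = Math.min(head.length - this.offset, block.length - written);
      block.set(head.subarray(this.offset, this.offset + n), written);
      written += n;
      this.offset += n;
      if (this.offset === head.length) {
        this.chunks.shift();
        this.offset = 0;
      }
    }
    block.fill(0, written);
    return written;
  }
}
```

Inside an AudioWorkletProcessor, process() could call pull() on the first channel of the first output, with chunks arriving through the processor's message port and being passed to push().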
I already implemented the functionality.
AFAIK no other projects implement the functionality described, perhaps save for meSpeak.js https://www.masswerk.at/mespeak/ which uses speak.js and does implement SSML parsing.
The eSpeak NG project itself is not only the *.c files that build the executables and libraries. It also contains data, scripts, tests, documentation and other files. If you add new functionality to the project, question is, is it needed and who will maintain it. But, anyway, you can create a pull request. Otherwise this conversation is too theoretical.
@valdisvi
If you add new functionality to the project, question is, is it needed
From my perspective, yes. See use cases at https://github.com/espeak-ng/espeak-ng/issues/972#issuecomment-877820223.
and who will maintain it.
I will.
I filed the PR for this.
Is there any interest in an extension which provides a means to run espeak-ng --stdout from and get raw output in the browser?