mediatechlab / tts-wrapper

TTS-Wrapper makes it easier to use text-to-speech APIs by providing a unified and easy-to-use interface.
MIT License

Add SAPI support #4

Closed mrx23dot closed 2 years ago

mrx23dot commented 2 years ago

To complete the picture, we could add SAPI support for Windows:

https://github.com/DeepHorizons/tts

gbottari commented 2 years ago

Hey @mrx23dot, thanks for the suggestion. I think that would be great. I'm working on a set of refactorings so we can add new engines like SAPI more easily.

mrx23dot commented 2 years ago

The project above works great (even with SSML). Also, with the IVONA voices, SAPI is very competitive, especially considering the rate limits of online services.

So a thin wrapper could work: pip install TTS-Wrapper[sapi], even if that just installs git+https://github.com/DeepHorizons/tts as a dependency.

(I also noticed that HEAD no longer works with Python 3.6; the YAML config makes the install fail.)

gbottari commented 2 years ago

Hey @mrx23dot, are you still interested in this integration with SAPI?

I've made some improvements recently and even added an offline engine for Linux called PicoTTS. I think it should be easier to integrate SAPI now.

mrx23dot commented 2 years ago

Sure, put it on a branch and I can test it on Windows, cheers.

gbottari commented 2 years ago

Hey @mrx23dot. I've created a basic client for SAPI on this branch. Could you test it on Windows?

Oh, and I found another lib called pyttsx3 that looks more interesting than DeepHorizons', as it supports SAPI, eSpeak, and macOS's NSSpeechSynthesizer.

mrx23dot commented 2 years ago

It doesn't support Python 3.8 :( (the last version that works on Win7). 3.7 would be a better minimum, since requirements.txt is consistent with that.

pip install "git+https://github.com/mediatechlab/tts-wrapper.git@feature/sapi"
Collecting git+https://github.com/mediatechlab/tts-wrapper.git@feature/sapi
  Cloning https://github.com/mediatechlab/tts-wrapper.git (to revision feature/sapi) to c:\users\mrx23dot\appdata\local\temp\pip-req-build-x63d5pcu
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
ERROR: Package 'tts-wrapper' requires a different Python: 3.8.8 not in '>=3.9,<4.0'

Not sure where this constraint comes from, because everything in requirements.txt installs fine manually, even pyttsx3==2.90.

from tts_wrapper import SAPITTS, SAPIClient
tts = SAPITTS(client=SAPIClient())
tts.synth('<speak>Hello, world!</speak>', 'hello.wav')

AttributeError: 'SAPITTS' object has no attribute 'synth'

tts.synth_to_bytes('<speak>Hello, world!</speak>', 'hello.wav')
tts_wrapper.exceptions.UnsupportedFileFormat: Format "hello.wav" is not supported by engine SAPITTS.

tts.synth_to_bytes('<speak>Hello, world!</speak>', 'hello')
tts_wrapper.exceptions.UnsupportedFileFormat: Format "hello" is not supported by engine SAPITTS.

tts.synth_to_bytes('<speak>Hello, world!</speak>', 'wav')
b''

From the README example:
tts.synth_to_bytes('<speak>Hello, world!</speak>', 'hello.wav', format='wav')
TypeError: synth_to_bytes() got multiple values for argument 'format'

Looks like the branch breaks the lib's documented API, and when I force it with synth_to_bytes it returns empty output. At least there is no exception.
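
For anyone reading along, the "multiple values for argument 'format'" error is plain Python behaviour rather than anything SAPI-specific. A minimal sketch, assuming the method takes (text, format) in that order, which is what the traceback suggests; the function below is a hypothetical stand-in, not the library's code:

```python
def synth_to_bytes(text, format="wav"):
    """Hypothetical stand-in with the (text, format) order the traceback implies."""
    return (text, format)


try:
    # 'hello.wav' is bound positionally to `format`, then format='wav'
    # tries to bind the same parameter again by keyword.
    synth_to_bytes("<speak>Hello, world!</speak>", "hello.wav", format="wav")
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

So the README call shape only works if the second positional parameter is a filename, which would explain why the older example no longer matches the branch.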

This works great in isolation:

import pyttsx3
engine = pyttsx3.init()  # init('sapi5') generates the same file, but leaving the driver unspecified is safer
engine.save_to_file('Hello World', 'test2.wav')
engine.runAndWait()

gbottari commented 2 years ago

Hey @mrx23dot, thanks for the help!

After some tweaks, I think I got it working now. It seems the problem was how Python's NamedTemporaryFile works on Windows: while the file is still open, it can't be opened a second time by name.
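
A minimal sketch of one common way around that limitation, not necessarily what the branch does: create the temp file with delete=False, close it before the engine writes to the path, then read it back and remove it manually. The render_to_bytes helper and its write_audio callback are hypothetical names, standing in for any engine that can only save to a filename (as pyttsx3's save_to_file does):

```python
import os
import tempfile


def render_to_bytes(write_audio):
    """Call write_audio(path) on a temporary path and return the bytes written.

    write_audio is a hypothetical callback for an engine that only saves to a
    filename.
    """
    # delete=False keeps the file on disk after close(), so the path can be
    # reopened by another writer, which Windows forbids while the handle is open.
    tmp = tempfile.NamedTemporaryFile(suffix=".wav", delete=False)
    tmp.close()
    try:
        write_audio(tmp.name)
        with open(tmp.name, "rb") as f:
            return f.read()
    finally:
        os.remove(tmp.name)  # manual cleanup, since delete=False
```

The trade-off is that cleanup becomes the caller's job, hence the try/finally.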

Some of the problems you faced (like AttributeError: 'SAPITTS' object has no attribute 'synth') were due to recent API changes. Now, TTS objects have synth_to_bytes() and synth_to_file() while the clients only have synth().
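
Roughly, the split looks like the sketch below. This is only an illustration of the layering: EchoClient and SketchTTS are made-up names, and the real signatures in tts_wrapper may differ (for example, extra format arguments):

```python
class EchoClient:
    """Hypothetical client: synth() is the only method it exposes."""

    def synth(self, text: str) -> bytes:
        # Stand-in for real audio: just encode the text.
        return text.encode("utf-8")


class SketchTTS:
    """Hypothetical TTS object that wraps a client and adds the two helpers."""

    def __init__(self, client):
        self._client = client

    def synth_to_bytes(self, text: str) -> bytes:
        # Delegate the actual synthesis to the client.
        return self._client.synth(text)

    def synth_to_file(self, text: str, filename: str) -> None:
        # Build on synth_to_bytes so both helpers share one code path.
        with open(filename, "wb") as f:
            f.write(self.synth_to_bytes(text))
```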

Could you do a clean install and see if the following code works for you?

from tts_wrapper import SAPITTS, SAPIClient
tts = SAPITTS(client=SAPIClient())
tts.synth_to_file('Hello, world!', 'hello.wav')

mrx23dot commented 2 years ago

This one works great; it can be merged into master. Thanks! Just a tip to make OS voices suck less: try the IVONA voices, they are just as good as the cloud AI ones.

gbottari commented 2 years ago

That's great, @mrx23dot! Thanks a lot!

I'll check IVONA later. Thanks for the tip!