rhasspy / larynx

End to end text to speech system using gruut and onnx
MIT License

Fix issues to enable Windows support #62

Open rotemdan opened 1 year ago

rotemdan commented 1 year ago

Hi, I tried installing and running the package on Windows and encountered several issues. Fortunately, none of them were truly serious and I managed to get it (almost) fully working (both CLI and server) with some patches:

  1. phonemes2ids 1.0.0 fails to install on Windows. Upgrading the dependency to 1.2.0 fixed it.
  2. In server.py, _LOOP.add_signal_handler(signal.SIGTERM, _signal_handler) is not supported on Windows and raises NotImplementedError. Catching and ignoring this error works around the issue. I verified that CTRL+C still stops the server, although the shutdown may not be as clean as with a handler.
  3. In utils.py, with tempfile.NamedTemporaryFile(mode="wb+" ... opens a temporary file in write mode. Later, shutil.unpack_archive tries to reopen the file while it is still open for writing. This works on Linux but not on Windows (unless the second opener passes a special low-level flag that allows this kind of sharing). I worked around the issue by closing the file before unpacking and deleting it afterwards in a finally clause.
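
For reference, workarounds 2 and 3 could look roughly like the sketch below. This is not the actual patch: the helper names add_sigterm_handler and unpack_archive_bytes are hypothetical, and the ".tar.gz" default suffix is an assumption about the archive format of the downloaded voices.

```python
import os
import shutil
import signal
import tempfile


def add_sigterm_handler(loop, handler):
    """Try to register a SIGTERM handler; return False if the loop cannot.

    Hypothetical helper. On Windows, asyncio event loops raise
    NotImplementedError from add_signal_handler, so the error is ignored.
    """
    try:
        loop.add_signal_handler(signal.SIGTERM, handler)
        return True
    except NotImplementedError:
        # No handler installed; CTRL+C still raises KeyboardInterrupt.
        return False


def unpack_archive_bytes(data, extract_dir, suffix=".tar.gz"):
    """Write archive bytes to a temp file, close it, unpack it, delete it.

    Hypothetical helper. delete=False keeps the file on disk after it is
    closed, so shutil.unpack_archive can reopen it on Windows.
    """
    temp_file = tempfile.NamedTemporaryFile(mode="wb+", suffix=suffix, delete=False)
    try:
        with temp_file:
            temp_file.write(data)
        # The file is closed at this point, so a second open succeeds.
        shutil.unpack_archive(temp_file.name, extract_dir)
    finally:
        # Manual cleanup replaces the automatic delete-on-close behavior.
        os.remove(temp_file.name)
```

The suffix matters because shutil.unpack_archive picks the archive format from the file extension when no format is passed explicitly.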

Other Windows issues:

  1. In the CLI, audio doesn't play because Windows has no default play command. The CLI falls back to saving a WAV file, but when the text contains multiple sentences, the file contains only the last one.
  2. Possibly Windows-specific error on the server: after a voice is downloaded, the web interface requests null and gets a 404 error. I'm not sure why this happens, but it doesn't seem to affect anything (edit: it only happens when using the web interface, not when requesting from the REST API, so I'm not sure it is platform-specific).
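
For the WAV fallback in item 1, one possible fix is to write the audio of every sentence into the same file instead of overwriting it per sentence. The sketch below is only an illustration and does not reflect how the CLI is actually structured: write_sentences_to_wav is a hypothetical helper, and it assumes each sentence is available as raw PCM bytes (16-bit mono by default) with a known sample rate.

```python
import wave


def write_sentences_to_wav(path, sentence_audio, sample_rate, sample_width=2, channels=1):
    """Write all chunks of raw PCM bytes, in order, to a single WAV file.

    Hypothetical helper. sentence_audio is any iterable of bytes objects,
    one per synthesized sentence.
    """
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        for pcm_bytes in sentence_audio:
            # Frames are appended; the header is finalized when the file closes.
            wav_file.writeframes(pcm_bytes)
```

Opening the file once and appending frames avoids the overwrite that would leave only the last sentence in the output.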

I'm really happy to be able to run this directly on Windows. Having to use WSL or docker isn't very convenient, and makes it difficult/impossible to import it as a library.

Please let me know if there's anything I've missed.