matatonic / openedai-speech

An OpenAI API compatible text to speech server using Coqui AI's xtts_v2 and/or piper tts as the backend.
GNU Affero General Public License v3.0

Install fails on Windows // Deepspeed fails to install, No module named 'TTS' , piper-phonemize~=1.0.0 not available #24

Open ryanull24 opened 4 months ago

ryanull24 commented 4 months ago

When trying to run speech.py I am getting this error:

speech.py", line 333, in <module>
    from TTS.tts.configs.xtts_config import XttsConfig
ModuleNotFoundError: No module named 'TTS'

I am on Windows 11 with python 3.11.9

I really don't have a lot of experience with Python and running programs, but what I did was:

  1. git clone the repo
  2. create a virtual environment, .venv
  3. activate said virtual environment: .venv\Scripts\Activate
  4. pip install -r requirements.txt
  5. run speech.py - getting the TTS module error
  6. go back to the virtual environment and install TTS
  7. get the above error

I also get an error when installing deepspeed with pip install -r requirements.txt:

Collecting deepspeed (from -r requirements.txt (line 6))
  Using cached deepspeed-0.14.4.tar.gz (1.3 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [23 lines of output]
      [WARNING] Unable to import torch, pre-compiling ops will be disabled. Please visit https://pytorch.org/ to see how to properly install torch on your system.
       [WARNING]  unable to import torch, please install it if you want to pre-compile any deepspeed ops.
      DS_BUILD_OPS=1
      Traceback (most recent call last):
        File "F:\Project\LLM\openedai-speech\.venv\Lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 353, in <module>
          main()
        File "F:\Project\LLM\openedai-speech\.venv\Lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "F:\Project\LLM\openedai-speech\.venv\Lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
                 ^^^^^^^^^^^^^^^^^^^^^
        File "C:\Users\USER\AppData\Local\Temp\pip-build-env-3vt8r0ws\overlay\Lib\site-packages\setuptools\build_meta.py", line 327, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=[])
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "C:\Users\USER\AppData\Local\Temp\pip-build-env-3vt8r0ws\overlay\Lib\site-packages\setuptools\build_meta.py", line 297, in _get_build_requires
          self.run_setup()
        File "C:\Users\USER\AppData\Local\Temp\pip-build-env-3vt8r0ws\overlay\Lib\site-packages\setuptools\build_meta.py", line 497, in run_setup
          super().run_setup(setup_script=setup_script)
        File "C:\Users\USER\AppData\Local\Temp\pip-build-env-3vt8r0ws\overlay\Lib\site-packages\setuptools\build_meta.py", line 313, in run_setup
          exec(code, locals())
        File "<string>", line 149, in <module>
      AssertionError: Unable to pre-compile ops without torch installed. Please install torch before attempting to pre-compile ops.
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

I made sure torch was installed under the same virtual environment.

matatonic commented 4 months ago

deepspeed is probably being removed from default installations in the next release; you can comment it out of requirements.txt and reinstall.

thanks for the report!

Update: pip uninstall deepspeed
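
Commenting a package out of requirements.txt can be done by hand in any editor. For anyone who prefers to script it, here is a minimal sketch. The helper name comment_out_requirement is made up for illustration, and it assumes a simple requirements file with one requirement per line.

```python
import re


def comment_out_requirement(requirements_text, package):
    """Return requirements_text with every line that requires `package` commented out."""
    # Match the bare package name, optionally followed by a version specifier,
    # extras, an environment marker, a URL, or a trailing comment.
    pattern = re.compile(
        rf"^\s*{re.escape(package)}\s*(?:[=<>!~\[;@ ].*)?$", re.IGNORECASE
    )
    lines = []
    for line in requirements_text.splitlines():
        lines.append("# " + line if pattern.match(line) else line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    with open("requirements.txt", encoding="utf-8") as f:
        original = f.read()
    with open("requirements.txt", "w", encoding="utf-8") as f:
        f.write(comment_out_requirement(original, "deepspeed"))
```

After that, rerun pip install -r requirements.txt inside the activated virtual environment.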

matatonic commented 4 months ago

if you need deepspeed you also need to install the cuda-toolkit for your os, which is perhaps more complex than installing the rest of the software...

ryanull24 commented 4 months ago

I do not really need deepspeed. I commented it out, and I am still getting the same "No module named 'TTS'" error when trying to run speech.py.

Something else to note: on Windows, piper-tts cannot be installed because of piper-phonemize:

ERROR: Cannot install -r requirements.txt (line 4) because these package versions have conflicting dependencies.

The conflict is caused by:
    piper-tts 1.2.0 depends on piper-phonemize~=1.1.0
    piper-tts 1.1.0 depends on piper-phonemize~=1.0.0

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip to attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts

But looking at their GitHub, this is an issue with piper-phonemize on Windows.

matatonic commented 4 months ago

Is this install in WSL2? I'm not much help with Windows, sorry, but if you can use them, the Docker images should work for you with Docker Desktop + WSL2.

ryanull24 commented 4 months ago

No, I'm trying to use it standalone on Windows, with no Docker or anything like that, for 2 reasons:

  1. I am too dumb to understand Docker properly
  2. WSL2 would keep Vmmem.exe alive, using up memory and ending up crashing software like Adobe Lightroom. I don't know if this was fixed in the last year or so since I last used Docker.

I'll see if I can find a solution. It might be a me issue.

ryanull24 commented 4 months ago

I managed to get it to start (thanks, ChatGPT) by adding:

try:
    from TTS.tts.configs.xtts_config import XttsConfig
    from TTS.tts.models.xtts import Xtts
    from TTS.utils.manage import ModelManager
except ModuleNotFoundError as e:
    print(f"ModuleNotFoundError: {e}")

Now it starts without any arguments, but after connecting to Open WebUI and trying TTS, I get some more errors:

Traceback (most recent call last):
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\uvicorn\protocols\http\httptools_impl.py", line 435, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\uvicorn\middleware\proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\fastapi\applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\middleware\errors.py", line 186, in __call__
    raise exc
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\middleware\errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\middleware\base.py", line 189, in __call__
    with collapse_excgroups():
  File "C:\Program Files\Python311\Lib\contextlib.py", line 158, in __exit__
    self.gen.throw(typ, value, traceback)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\_utils.py", line 93, in collapse_excgroups
    raise exc
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\middleware\base.py", line 191, in __call__
    response = await self.dispatch_func(request, call_next)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\Project\LLM\openedai-speech\openedai.py", line 126, in log_requests
    response = await call_next(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\middleware\base.py", line 165, in call_next
    raise app_exc
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\middleware\base.py", line 151, in coro
    await self.app(scope, receive_or_disconnect, send_no_error)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\middleware\cors.py", line 85, in __call__
    await self.app(scope, receive, send)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\middleware\exceptions.py", line 65, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\routing.py", line 756, in __call__
    await self.middleware_stack(scope, receive, send)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\routing.py", line 776, in app
    await route.handle(scope, receive, send)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\routing.py", line 297, in handle
    await self.app(scope, receive, send)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\routing.py", line 77, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\routing.py", line 72, in app
    response = await func(request)
               ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\fastapi\routing.py", line 278, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\fastapi\routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\Project\LLM\openedai-speech\speech.py", line 225, in generate_speech
    tts_proc = subprocess.Popen(tts_args, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\subprocess.py", line 1026, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Program Files\Python311\Lib\subprocess.py", line 1538, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [WinError 2] The system cannot find the file specified

Now, trying to run speech.py with arguments, for example --xtts_device cpu --preload xtts, I'm getting:

Traceback (most recent call last):
  File "F:\Project\LLM\openedai-speech\speech.py", line 339, in <module>
    xtts = xtts_wrapper(args.preload, device=args.xtts_device, unload_timer=args.unload_timer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\Project\LLM\openedai-speech\speech.py", line 61, in __init__
    model_path = ModelManager().download_model(model_name)[0]
                 ^^^^^^^^^^^^
NameError: name 'ModelManager' is not defined

At this point, I'm pretty sure I'm doing something wrong.
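
The tracebacks above mix two locations: the deepspeed build ran from F:\Project\LLM\openedai-speech\.venv, while the uvicorn frames come from C:\Users\USER\AppData\Roaming\Python\Python311\site-packages. That suggests, though it does not prove, that speech.py was started with an interpreter other than the one in .venv, which would explain why TTS is not found. Note also that catching ModuleNotFoundError only hides the failed import, so names such as ModelManager stay undefined. A minimal diagnostic sketch (the function name describe_environment is made up for illustration) reports which interpreter is running and whether the modules can be found:

```python
import importlib.util
import sys


def describe_environment(module_names=("TTS", "torch")):
    """Report the running interpreter and whether each top-level module is importable."""
    return {
        # Full path of the python executable that is running this script.
        "executable": sys.executable,
        # Inside a venv, sys.prefix points at the venv and differs from sys.base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        # find_spec returns None for a top-level module that cannot be located.
        "modules": {
            name: importlib.util.find_spec(name) is not None for name in module_names
        },
    }


if __name__ == "__main__":
    report = describe_environment()
    print("interpreter:", report["executable"])
    print("virtual environment active:", report["in_virtualenv"])
    for name, found in report["modules"].items():
        print(f"{name}: {'found' if found else 'NOT found'}")
```

If this is run the same way speech.py is run and the interpreter path is not inside .venv, activating the environment first, or calling .venv\Scripts\python.exe directly, is the thing to try before editing the imports.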

matatonic commented 4 months ago

it may need to be fixed to run piper.exe instead of piper, but honestly I think you're the first to try directly in windows.

matatonic commented 4 months ago

Try making sure you're using python 3.11, piper-phonemize does not work with python 3.12 yet.
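
The two constraints mentioned so far (piper-phonemize not working with Python 3.12, and its packages being a problem on Windows) can be checked before running pip. This is only a rough sketch: the function name and messages are invented for illustration, and the version cutoff is taken solely from the comment above.

```python
import sys


def preflight_warnings(version_info=sys.version_info, platform=sys.platform):
    """Return a list of warnings about installing piper-tts from pip in this environment."""
    warnings = []
    # Per the thread, piper-phonemize did not work with Python 3.12 at the time.
    if tuple(version_info[:2]) >= (3, 12):
        warnings.append("Python 3.12 or newer: piper-phonemize may not install; try 3.11.")
    # sys.platform is "win32" on Windows, including 64-bit builds.
    if platform.startswith("win"):
        warnings.append("Windows: pip may fail to resolve piper-phonemize.")
    return warnings


if __name__ == "__main__":
    for message in preflight_warnings():
        print("warning:", message)
```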

matatonic commented 2 months ago

Any update here? I'll close the issue if you don't have any more to add, I still highly recommend you try the docker setup.

luobendewugong commented 2 months ago

> it may need to be fixed to run piper.exe instead of piper, but honestly I think you're the first to try directly in windows.

Hello, there is strong demand for installing this directly on Windows. Since you are very familiar with the overall architecture, could you share some ideas on how to install it directly on Windows?

Thank you very much!

piovis2023 commented 5 days ago

@ryanull24 did you manage to get it working on Windows without Docker?

Thanks

ryanull24 commented 5 days ago

> @ryanull24 did you manage to get it working on windows without docker?
>
> Thanks

I gave up. Look at my comment above. I had it start apparently, but it would throw errors in Open WebUI so I gave up on it. I am by no means familiar with coding, and I did not know where to start looking into things.

piovis2023 commented 5 days ago

Thanks @ryanull24. Yes, I saw your comment. I get how you feel about not knowing where to start. Hopefully the awesome devs here (@matatonic, I'm looking at you mate :)) have made some progress. I too don't want to use Docker on Windows 11, but for different reasons: I want to streamline my tech stack.

matatonic commented 5 days ago

Well, I haven't tried this, but you may be able to install a piper.exe binary directly without using pip. I'm not sure this will work either, but it might get farther.

piovis2023 commented 5 days ago

@matatonic thanks for the quick reply. Really appreciate it. Where can I get my hands on a piper.exe file? Do you have one around? I'd be happy to report back with feedback.

matatonic commented 5 days ago

@piovis2023 try this one: https://github.com/rhasspy/piper/releases/download/2023.11.14-2/piper_windows_amd64.zip from: https://github.com/rhasspy/piper/releases

piovis2023 commented 4 days ago

Hi @matatonic, thanks again for the file. Unfortunately, it didn't work. I tried the piper, piper-phonemize, and piper-phonemize-crossover repos to go back to the roots. Aside from trying to compile the whole thing from scratch, there are absolutely no solutions for Windows 11.

It sucks that this amazing project doesn't work on Windows 11 :(

seancheung commented 14 hours ago

Sadly, this is because piper-phonemize does not support Windows. But you can use the latest binary release from piper, which does support Windows:

  1. Download the release file and extract it to a local folder
  2. Comment out piper-tts in requirements.txt
  3. Install deps as usual
  4. There are syntax errors in startup.bat. Replace it with the following. (There are also errors in download_voices_tts-1.bat and download_voices_tts-1-hd.bat, so I just removed the calls to them. The env file reading syntax is also wrong. I guess these were translated from a bash script by GPT or something.)

startup.bat

@echo off

@REM set /p < speech.env
set TTS_HOME=voices
set HF_HOME=voices

@REM call download_voices_tts-1.bat
@REM call download_voices_tts-1-hd.bat %PRELOAD_MODEL%

if defined PRELOAD_MODEL (
    set "preload=--preload"
)
python speech.py %preload% %PRELOAD_MODEL% %EXTRA_ARGS%
  5. Update speech.py

speech.py

# line 226, remove the first arg which is "piper"
tts_args = ["--model", str(piper_model), "--data-dir", "voices", "--download-dir", "voices", "--output-raw"]
# line 232, add executable parameter
tts_proc = subprocess.Popen(tts_args, stdin=subprocess.PIPE, stdout=subprocess.PIPE, executable="absolute path to piper.exe")

Now it should work when you run startup.bat.
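
One caveat on the Popen change, per the Python subprocess documentation: when executable is given, the args list is still passed to the program unchanged, so its first element becomes the program's argv[0]. A more conventional form keeps the program path as the first element of the list and drops executable. The sketch below assumes that approach. PIPER_EXE is a hypothetical environment variable, build_piper_command is an illustrative helper that is not part of speech.py, and the flags are copied from the snippet above.

```python
import os
import shutil


def build_piper_command(piper_model, piper_exe=None):
    """Build the argument list for piper with the executable path as the first element."""
    # Resolution order: explicit argument, PIPER_EXE (hypothetical), then PATH.
    # On Windows, shutil.which("piper") also matches piper.exe via PATHEXT.
    exe = piper_exe or os.environ.get("PIPER_EXE") or shutil.which("piper")
    if exe is None:
        raise FileNotFoundError(
            "piper executable not found; set PIPER_EXE or add its folder to PATH"
        )
    return [
        exe,
        "--model", str(piper_model),
        "--data-dir", "voices",
        "--download-dir", "voices",
        "--output-raw",
    ]


# Intended use inside speech.py (not executed here):
#   tts_proc = subprocess.Popen(build_piper_command(piper_model),
#                               stdin=subprocess.PIPE, stdout=subprocess.PIPE)
```

Raising a clear FileNotFoundError up front is a design choice to replace the bare "[WinError 2] The system cannot find the file specified" seen earlier with a message that says what to fix.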

piovis2023 commented 9 hours ago

You are a legend, @seancheung! I'll let you know how it goes after I've given it a go!