[Bug] RuntimeError: Numpy is not available

Vuizur commented 2 years ago

Describe the bug

On Windows, running a model can fail with RuntimeError: Numpy is not available.

To Reproduce

I installed the project from git using poetry add git+https://github.com/coqui-ai/TTS (not from PyPI, because the latest version there is still buggy on Windows due to the pyworld dependency), and then executed poetry shell to get the command-line program. Then I ran tts-server --model_name tts_models/es/mai/tacotron2-DDC.

It downloaded the model and I entered two sentences that I wanted to convert to audio, but it failed with an internal server error:

tts-server --model_name tts_models/es/mai/tacotron2-DDC                                   
 > tts_models/es/mai/tacotron2-DDC is already downloaded.
 > vocoder_models/universal/libri-tts/fullband-melgan is already downloaded.
 > Using model: Tacotron2
D:\Programs\tts-test\.venv\lib\site-packages\torchaudio\compliance\kaldi.py:22: UserWarning: Failed to initialize NumPy: module compiled against API version 0x10 but this version of numpy is 0xf (Triggered internally at  ..\torch\csrc\utils\tensor_numpy.cpp:68.)    
  EPSILON = torch.tensor(torch.finfo(torch.float).eps)
 > Setting up Audio Processor...
 | > sample_rate:16000
 | > resample:False
 | > num_mels:80
 | > log_func:np.log10
 | > min_level_db:-100
 | > frame_shift_ms:None
 | > frame_length_ms:None
 | > ref_level_db:20
 | > fft_size:1024
 | > power:1.5
 | > preemphasis:0.0
 | > griffin_lim_iters:60
 | > signal_norm:True
 | > symmetric_norm:True
 | > mel_fmin:50.0
 | > mel_fmax:7600.0
 | > pitch_fmin:0.0
 | > pitch_fmax:640.0
 | > spec_gain:1.0
 | > stft_pad_mode:reflect
 | > max_norm:4.0
 | > clip_norm:True
 | > do_trim_silence:True
 | > trim_db:60
 | > do_sound_norm:False
 | > do_amp_to_db_linear:True
 | > do_amp_to_db_mel:True
 | > do_rms_norm:False
 | > db_level:None
 | > stats_path:C:\Users\hanne\AppData\Local\tts\tts_models--es--mai--tacotron2-DDC\scale_stats.npy
 | > base:10
 | > hop_length:256
 | > win_length:1024
 > Model's reduction rate `r` is set to: 1
 > Vocoder Model: fullband_melgan
 > Setting up Audio Processor...
 | > sample_rate:24000
 | > resample:False
 | > num_mels:80
 | > log_func:np.log10
 | > min_level_db:-100
 | > frame_shift_ms:None
 | > frame_length_ms:None
 | > ref_level_db:0
 | > fft_size:1024
 | > power:1.5
 | > preemphasis:0.0
 | > griffin_lim_iters:60
 | > signal_norm:True
 | > symmetric_norm:True
 | > mel_fmin:50.0
 | > mel_fmax:7600.0
 | > pitch_fmin:0.0
 | > pitch_fmax:640.0
 | > spec_gain:1.0
 | > stft_pad_mode:reflect
 | > max_norm:4.0
 | > clip_norm:True
 | > do_trim_silence:True
 | > trim_db:60
 | > do_sound_norm:False
 | > do_amp_to_db_linear:True
 | > do_amp_to_db_mel:True
 | > do_rms_norm:False
 | > db_level:None
 | > stats_path:C:\Users\hanne\AppData\Local\tts\vocoder_models--universal--libri-tts--fullband-melgan\scale_stats.npy
 | > base:10
 | > hop_length:256
 | > win_length:1024
 > Generator Model: fullband_melgan_generator
 > Discriminator Model: melgan_multiscale_discriminator
 * Serving Flask app 'TTS.server.server'
 * Debug mode: off
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on all addresses (::)
 * Running on http://[::1]:5002
 * Running on http://[2003:ec:ef25:f803:4466:2679:9d2e:10d3]:5002
Press CTRL+C to quit
2003:ec:ef25:f803:4466:2679:9d2e:10d3 - - [12/Oct/2022 16:05:25] "GET / HTTP/1.1" 200 -
2003:ec:ef25:f803:4466:2679:9d2e:10d3 - - [12/Oct/2022 16:05:25] "GET /static/coqui-log-green-TTS.png HTTP/1.1" 200 -
 > Model input: Hola mundo.
 > Speaker Idx:
 > Text splitted to sentences.
['Hola mundo.']
[2022-10-12 16:05:46,798] ERROR in app: Exception on /api/tts [GET]
Traceback (most recent call last):
  File "D:\Programs\tts-test\.venv\lib\site-packages\flask\app.py", line 2525, in wsgi_app
    response = self.full_dispatch_request()
  File "D:\Programs\tts-test\.venv\lib\site-packages\flask\app.py", line 1822, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "D:\Programs\tts-test\.venv\lib\site-packages\flask\app.py", line 1820, in full_dispatch_request
    rv = self.dispatch_request()
  File "D:\Programs\tts-test\.venv\lib\site-packages\flask\app.py", line 1796, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "D:\Programs\tts-test\.venv\lib\site-packages\TTS\server\server.py", line 184, in tts
    wavs = synthesizer.tts(text, speaker_name=speaker_idx, style_wav=style_wav)
  File "D:\Programs\tts-test\.venv\lib\site-packages\TTS\utils\synthesizer.py", line 270, in tts
    outputs = synthesis(
  File "D:\Programs\tts-test\.venv\lib\site-packages\TTS\tts\utils\synthesis.py", line 217, in synthesis
    model_outputs = model_outputs[0].data.cpu().numpy()
RuntimeError: Numpy is not available
2003:ec:ef25:f803:4466:2679:9d2e:10d3 - - [12/Oct/2022 16:05:46] "GET /api/tts?text=Hola%20mundo.&speaker_id=&style_wav= HTTP/1.1" 500 -

(This is the log of the second run, so it says that the model is already downloaded, but in the first run it properly fetched the 600 MB or so.)
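
Note that the log contains a UserWarning ("Failed to initialize NumPy") well before the traceback. When PyTorch cannot initialize its NumPy bridge at import time, later calls to .numpy() raise "Numpy is not available", so that warning is most likely the actual cause. A minimal sketch for pulling the two API versions out of such a warning is shown here; parse_api_mismatch is a hypothetical helper written for illustration, not part of TTS, PyTorch, or NumPy.

```python
import re

# Matches the C-API mismatch text as it appears in the log above.
_MISMATCH = re.compile(
    r"compiled against API version (0x[0-9a-fA-F]+) "
    r"but this version of numpy is (0x[0-9a-fA-F]+)"
)


def parse_api_mismatch(message):
    """Return (compiled_against, installed) as ints, or None if absent.

    Hypothetical helper for illustration only.
    """
    match = _MISMATCH.search(message)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


# Warning text copied from the log in this issue (shortened).
warning = (
    "UserWarning: Failed to initialize NumPy: module compiled against "
    "API version 0x10 but this version of numpy is 0xf"
)

compiled, installed = parse_api_mismatch(warning)
if compiled > installed:
    # The extension was built against a newer NumPy C API than the one installed.
    print(
        f"Extension expects C-API {compiled:#x}, "
        f"installed NumPy provides {installed:#x}."
    )
```

A compiled-against value that is higher than the installed one means some binary extension in the environment was built against a newer NumPy than the one the resolver installed.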

Expected behavior

No response

Logs

No response

Environment

{
    "CUDA": {
        "GPU": [],
        "available": false,
        "version": null
    },
    "Packages": {
        "PyTorch_debug": false,
        "PyTorch_version": "1.12.1+cpu",
        "TTS": "0.8.0",
        "numpy": "1.22.4"
    },
    "System": {
        "OS": "Windows",
        "architecture": [
            "64bit",
            "WindowsPE"
        ],
        "processor": "Intel64 Family 6 Model 142 Stepping 11, GenuineIntel",
        "python": "3.10.2",
        "version": "10.0.19043"
    }
}
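
For anyone comparing their setup against the report above, the following sketch collects the same version information using only the standard library. The package list is an assumption about what is relevant here; installed_versions is a hypothetical helper, not a TTS utility.

```python
import sys
from importlib import metadata


def installed_versions(packages=("numpy", "torch", "torchaudio", "TTS")):
    """Map each distribution name to its installed version, or None if missing.

    Hypothetical helper; the default package list is an assumption.
    """
    report = {"python": ".".join(str(part) for part in sys.version_info[:3])}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

Running this inside the poetry shell shows which NumPy version the dependency resolver actually picked for the active interpreter.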

Additional context

No response

thorstenMueller commented 2 years ago

I struggled with the pyworld dependency on Windows too, but I found a solution and show how to get it up and running here: https://youtu.be/zRaDe08cUIk?t=743

Maybe this helps you.

Manamama commented 2 years ago

My wild-guess tip - try to run: pip2 install numpy (sic) and see what happens.

disseminate commented 2 years ago

I had this issue. Downgrading from python 3.10 to 3.9 fixed it.

Vuizur commented 2 years ago

Thank you all for your help. The solution by @disseminate worked flawlessly for me. I had also suspected the pyworld library, so thanks for the video! I think the maintainers removed that dependency in the newest version (at least for synthesis?), so in the end I didn't install it.

tedmx commented 1 year ago

Thanks @disseminate. I'm on Windows 10 and was on Python 3.10; I can confirm the issue is solved simply by downgrading to Python 3.9.

I used the 64-bit installer from this page: https://www.python.org/downloads/release/python-3911/
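
To check whether a given interpreter is affected, or to confirm that switching Python versions helped, the failing call from the traceback (a tensor's .numpy() conversion) can be reproduced in isolation without starting tts-server. This is only a sketch; numpy_bridge_ok is a hypothetical helper name.

```python
def numpy_bridge_ok():
    """Check whether PyTorch can convert a tensor to a NumPy array.

    Returns True if the conversion works, False if it raises RuntimeError
    (as in this issue), and None if torch is not installed at all.
    Hypothetical helper for illustration only.
    """
    try:
        import torch
    except ImportError:
        return None
    try:
        # Same kind of call that fails in TTS/tts/utils/synthesis.py above.
        torch.zeros(1).numpy()
    except RuntimeError:
        return False
    return True


status = numpy_bridge_ok()
print({True: "OK", False: "broken (Numpy is not available)", None: "torch not installed"}[status])
```

If this prints "broken" in the same environment that runs tts-server, the server will fail on the first synthesis request in the same way.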

robindegen commented 1 year ago

Is there a better solution than downgrading python?

Anonym0us33 commented 1 year ago

Anyone?

robindegen commented 1 year ago

Nope. I never found a fix other than downgrading Python (which isn't a solution), and yet the issue is considered closed. Normally I'd say it's an open-source project and that's fine, but this is a product they ask money for, so hey...
