coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
http://coqui.ai
Mozilla Public License 2.0

Arm64 Debian 11 Python 3.9: Cannot run TTS because of error 'RuntimeError: Numpy is not available' #2351

Closed fquirin closed 1 year ago

fquirin commented 1 year ago

Describe the bug

I've installed the 'tts' package on an Arm64 Debian 11 machine with Python 3.9 in a fresh virtual environment.

Running the test command: tts --text "Hello this is a test." --model_name "tts_models/en/ljspeech/glow-tts" --out_path ./test.wav will download the model and then crash with the following error: RuntimeError: Numpy is not available.

To Reproduce

  1. Get an Arm64 Debian 11 machine with Python 3.9 etc.
  2. Create a fresh virtual environment: python3 -m venv venv && source venv/bin/activate
  3. Upgrade pip: pip3 install --upgrade pip
  4. Install tts: pip3 install tts
  5. Run the command: tts --text "Hello this is a test." --model_name "tts_models/en/ljspeech/glow-tts" --out_path ./test.wav
  6. See error

Expected behavior

test.wav is generated successfully.

Logs

~/coqui-tts $ tts --text "Hello this is a test." --model_name "tts_models/en/ljspeech/glow-tts" --out_path ./test.wav
 > tts_models/en/ljspeech/glow-tts is already downloaded.
 > vocoder_models/en/ljspeech/multiband-melgan is already downloaded.
 > Using model: glow_tts
/home/pi/coqui-tts/venv/lib/python3.9/site-packages/torchaudio/compliance/kaldi.py:22: UserWarning: Failed to initialize NumPy: module compiled against API version 0xf but this version of numpy is 0xe (Triggered internally at /root/pytorch/torch/csrc/utils/tensor_numpy.cpp:77.)
  EPSILON = torch.tensor(torch.finfo(torch.float).eps)
 > Setting up Audio Processor...
 | > sample_rate:22050
 | > resample:False
 | > num_mels:80
 | > log_func:np.log10
 | > min_level_db:-100
 | > frame_shift_ms:None
 | > frame_length_ms:None
 | > ref_level_db:0
 | > fft_size:1024
 | > power:1.1
 | > preemphasis:0.0
 | > griffin_lim_iters:60
 | > signal_norm:False
 | > symmetric_norm:True
 | > mel_fmin:50.0
 | > mel_fmax:7600.0
 | > pitch_fmin:1.0
 | > pitch_fmax:640.0
 | > spec_gain:1.0
 | > stft_pad_mode:reflect
 | > max_norm:1.0
 | > clip_norm:True
 | > do_trim_silence:True
 | > trim_db:60
 | > do_sound_norm:False
 | > do_amp_to_db_linear:True
 | > do_amp_to_db_mel:True
 | > do_rms_norm:False
 | > db_level:None
 | > stats_path:None
 | > base:10
 | > hop_length:256
 | > win_length:1024
 > Vocoder Model: multiband_melgan
 > Setting up Audio Processor...
 | > sample_rate:22050
 | > resample:False
 | > num_mels:80
 | > log_func:np.log10
 | > min_level_db:-100
 | > frame_shift_ms:None
 | > frame_length_ms:None
 | > ref_level_db:0
 | > fft_size:1024
 | > power:1.5
 | > preemphasis:0.0
 | > griffin_lim_iters:60
 | > signal_norm:True
 | > symmetric_norm:True
 | > mel_fmin:50.0
 | > mel_fmax:7600.0
 | > pitch_fmin:0.0
 | > pitch_fmax:640.0
 | > spec_gain:1.0
 | > stft_pad_mode:reflect
 | > max_norm:4.0
 | > clip_norm:True
 | > do_trim_silence:True
 | > trim_db:60
 | > do_sound_norm:False
 | > do_amp_to_db_linear:True
 | > do_amp_to_db_mel:True
 | > do_rms_norm:False
 | > db_level:None
 | > stats_path:/home/pi/.local/share/tts/vocoder_models--en--ljspeech--multiband-melgan/scale_stats.npy
 | > base:10
 | > hop_length:256
 | > win_length:1024
 > Generator Model: multiband_melgan_generator
Traceback (most recent call last):
  File "/home/pi/coqui-tts/venv/bin/tts", line 8, in <module>
    sys.exit(main())
  File "/home/pi/coqui-tts/venv/lib/python3.9/site-packages/TTS/bin/synthesize.py", line 316, in main
    synthesizer = Synthesizer(
  File "/home/pi/coqui-tts/venv/lib/python3.9/site-packages/TTS/utils/synthesizer.py", line 78, in __init__
    self._load_vocoder(vocoder_checkpoint, vocoder_config, use_cuda)
  File "/home/pi/coqui-tts/venv/lib/python3.9/site-packages/TTS/utils/synthesizer.py", line 147, in _load_vocoder
    self.vocoder_model = setup_vocoder_model(self.vocoder_config)
  File "/home/pi/coqui-tts/venv/lib/python3.9/site-packages/TTS/vocoder/models/__init__.py", line 31, in setup_model
    return MyModel.init_from_config(config)
  File "/home/pi/coqui-tts/venv/lib/python3.9/site-packages/TTS/vocoder/models/gan.py", line 374, in init_from_config
    return GAN(config, ap=ap)
  File "/home/pi/coqui-tts/venv/lib/python3.9/site-packages/TTS/vocoder/models/gan.py", line 41, in __init__
    self.model_g = setup_generator(config)
  File "/home/pi/coqui-tts/venv/lib/python3.9/site-packages/TTS/vocoder/models/__init__.py", line 55, in setup_generator
    model = MyModel(
  File "/home/pi/coqui-tts/venv/lib/python3.9/site-packages/TTS/vocoder/models/multiband_melgan_generator.py", line 27, in __init__
    self.pqmf_layer = PQMF(N=4, taps=62, cutoff=0.15, beta=9.0)
  File "/home/pi/coqui-tts/venv/lib/python3.9/site-packages/TTS/vocoder/layers/pqmf.py", line 30, in __init__
    H = torch.from_numpy(H[:, None, :]).float()
RuntimeError: Numpy is not available
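
The traceback ends in `torch.from_numpy`, so the failure can probably be reproduced without downloading or loading any TTS model. A minimal sketch of such a check follows; the helper name `check_numpy_bridge` is made up for illustration, and it relies only on `torch.from_numpy` and the `__version__` attributes of both packages:

```python
import importlib


def check_numpy_bridge():
    """Report whether the installed torch can consume arrays from the installed NumPy."""
    report = {"numpy": None, "torch": None, "bridge_ok": None, "error": None}

    try:
        np = importlib.import_module("numpy")
        report["numpy"] = np.__version__
    except Exception as exc:  # missing package or broken binary
        report["error"] = f"numpy import failed: {exc}"
        return report

    try:
        torch = importlib.import_module("torch")
        report["torch"] = torch.__version__
    except Exception as exc:  # missing package or broken binary
        report["error"] = f"torch import failed: {exc}"
        return report

    try:
        # Same kind of call as the failing line in TTS/vocoder/layers/pqmf.py.
        torch.from_numpy(np.zeros((2, 1, 3))).float()
        report["bridge_ok"] = True
    except RuntimeError as exc:
        report["bridge_ok"] = False
        report["error"] = str(exc)

    return report


if __name__ == "__main__":
    print(check_numpy_bridge())
```

If `bridge_ok` comes back `False` with the same `Numpy is not available` message, the problem lies in the torch/numpy pairing rather than in TTS itself.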

Environment

```shell
{
    "CUDA": {
        "GPU": [],
        "available": false,
        "version": null
    },
    "Packages": {
        "PyTorch_debug": false,
        "PyTorch_version": "1.13.1",
        "TTS": "0.11.1",
        "numpy": "1.21.6"
    },
    "System": {
        "OS": "Linux",
        "architecture": [
            "64bit",
            "ELF"
        ],
        "processor": "",
        "python": "3.9.2",
        "version": "#1488 SMP PREEMPT Thu Nov 18 16:16:16 GMT 2021"
    }
}
```

Additional context

tocaRepo commented 1 year ago

I am having the same issue. I'm using Python 3.10.

"UserWarning: Failed to initialize NumPy: module compiled against API version 0x10 but this version of numpy is 0xf"

I tried numpy 1.22 and 1.21.6, but the issue is the same.
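
Both warnings in this thread have the same shape: the compiled module was built against a newer NumPy C API version than the one installed. The sketch below pulls the two hex numbers out of the warning text. `explain_api_mismatch` is a hypothetical helper, and the `API_TO_NUMPY` table is an assumption based on NumPy's C API version list, so it is worth verifying against the NumPy source before relying on it:

```python
import re

# Assumed mapping from C API version to the NumPy releases that ship it.
# Verify against NumPy's own version table before relying on it.
API_TO_NUMPY = {0xE: "1.20/1.21", 0xF: "1.22", 0x10: "1.23/1.24"}

PATTERN = re.compile(
    r"compiled against API version (0x[0-9a-fA-F]+) "
    r"but this version of numpy is (0x[0-9a-fA-F]+)"
)


def explain_api_mismatch(message):
    """Parse a 'Failed to initialize NumPy' warning; return None if it does not match."""
    match = PATTERN.search(message)
    if match is None:
        return None
    built, installed = (int(group, 16) for group in match.groups())
    return {
        "built_against": built,
        "installed": installed,
        # True when the installed NumPy exposes an older C API than the build expects.
        "numpy_too_old": installed < built,
        "built_against_numpy": API_TO_NUMPY.get(built, "unknown"),
        "installed_numpy": API_TO_NUMPY.get(installed, "unknown"),
    }


if __name__ == "__main__":
    warning = (
        "Failed to initialize NumPy: module compiled against API version 0x10 "
        "but this version of numpy is 0xf"
    )
    print(explain_api_mismatch(warning))
```

For the warning in the original report (0xf against 0xe) this flags the installed NumPy as too old, which fits the numpy 1.21.6 listed in that environment.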

fquirin commented 1 year ago

I've managed to build an aarch64 Python 3.10 Docker container with torch 1.13.1, numpy 1.22.4 and numba 0.55.2 that works. Here is the full list of packages, in case it helps:

Python 3.10.10
Debian 11 - 5.10.63-v8+
Aarch64

Package               Version     Editable project location
--------------------- ----------- -------------------------
anyascii              0.3.1
appdirs               1.4.4
audioread             3.0.0
Babel                 2.11.0
certifi               2022.12.7
cffi                  1.15.1
charset-normalizer    3.0.1
click                 8.1.3
contourpy             1.0.7
coqpit                0.0.17
cycler                0.11.0
Cython                0.29.28
dateparser            1.1.7
decorator             5.1.1
docopt                0.6.2
Flask                 2.2.3
fonttools             4.38.0
fsspec                2023.1.0
g2pkk                 0.1.2
gruut                 2.2.3
gruut-ipa             0.13.0
gruut-lang-de         2.0.0
gruut-lang-en         2.0.0
idna                  3.4
inflect               5.6.0
itsdangerous          2.1.2
jamo                  0.4.1
jieba                 0.42.1
Jinja2                3.1.2
joblib                1.2.0
jsonlines             1.2.0
kiwisolver            1.4.4
librosa               0.8.0
llvmlite              0.38.1
MarkupSafe            2.1.2
matplotlib            3.7.0
mecab-python3         1.0.5
networkx              2.8.8
nltk                  3.8.1
num2words             0.5.12
numba                 0.55.2
numpy                 1.22.4
packaging             23.0
pandas                1.5.3
Pillow                9.4.0
pip                   23.0
pooch                 1.6.0
protobuf              3.19.6
psutil                5.9.4
pycparser             2.21
pynndescent           0.5.8
pyparsing             3.0.9
pypinyin              0.48.0
pysbd                 0.3.4
PySoundFile           0.9.0.post1
python-crfsuite       0.9.9
python-dateutil       2.8.2
pytz                  2022.7.1
pytz-deprecation-shim 0.1.0.post0
PyYAML                6.0
regex                 2022.10.31
requests              2.28.2
resampy               0.4.2
scikit-learn          1.2.1
scipy                 1.10.0
setuptools            65.5.0
six                   1.16.0
soundfile             0.12.1
tensorboardX          2.6
threadpoolctl         3.1.0
torch                 1.13.1
torchaudio            0.13.1
tqdm                  4.64.1
trainer               0.0.20
TTS                   0.11.1      /home/share/tts/TTS
typing_extensions     4.5.0
tzdata                2022.7
tzlocal               4.2
umap-learn            0.5.1
unidic-lite           1.0.8
urllib3               1.26.14
Werkzeug              2.2.3
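
To compare another environment against the combination listed above, the installed versions can be read with `importlib.metadata`. This is only a sketch: `KNOWN_GOOD` copies four entries from the list (torch, numpy, numba, llvmlite), and treating those four as the relevant ones is an assumption, as is the helper name `compare_with_known_good`:

```python
from importlib import metadata

# Versions copied from the package list above; which packages matter is an assumption.
KNOWN_GOOD = {
    "torch": "1.13.1",
    "numpy": "1.22.4",
    "numba": "0.55.2",
    "llvmlite": "0.38.1",
}


def compare_with_known_good(known_good=KNOWN_GOOD):
    """Return (name, wanted, installed, matches) for each package in known_good."""
    rows = []
    for name, wanted in known_good.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        rows.append((name, wanted, installed, installed == wanted))
    return rows


if __name__ == "__main__":
    for name, wanted, installed, matches in compare_with_known_good():
        status = "ok" if matches else "differs"
        print(f"{name:10s} wanted={wanted:8s} installed={installed} {status}")
```

A mismatch here does not prove a version is broken; it only shows where an environment deviates from the one reported as working.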

fquirin commented 1 year ago

Just reproduced the same error on a brand new Armbian installation (Python 3.9.2, aarch64, kernel 5.10).

fquirin commented 1 year ago

I've built a Coqui-TTS Docker image using Python 3.10 that works on aarch64/arm64: https://hub.docker.com/r/sepia/coqui-tts . Still can't get any Python 3.9 version to run TTS.