bootphon / phonemizer

Simple text to phones converter for multiple languages
https://bootphon.github.io/phonemizer/
GNU General Public License v3.0

Segmentation fault when use EspeakBackend to phonemize text. #180

Open XqZeppelinhead0702 opened 1 week ago

XqZeppelinhead0702 commented 1 week ago

Describe the bug I encountered a segmentation fault when I tried to run the inference code in StyleTTS2. After stepping through the code with pdb, I found that the segmentation fault occurs when the code attempts to phonemize the text with EspeakBackend. I then tried the following test code and the same error happened again.

from phonemizer.backend import EspeakBackend
from nltk.tokenize import word_tokenize

a = "I'm a test file."
global_phonemizer = EspeakBackend(language='en-us', preserve_punctuation=True, with_stress=True)
text = a.strip()
ps = global_phonemizer.phonemize([text])  # the segmentation fault occurs on this line
ps = word_tokenize(ps[0])
ps = ''.join(ps)

print(f"ps: {ps}")

Phonemizer version My phonemizer version is 3.3.0.

System My OS is CentOS 7 and my Python version is 3.9.20. The other Python package versions are as follows:

Package                   Version
------------------------- ------------
accelerate                1.0.1
attrs                     24.2.0
audioread                 3.0.1
babel                     2.16.0
bibtexparser              2.0.0b7
certifi                   2024.8.30
cffi                      1.17.1
charset-normalizer        3.4.0
click                     8.1.7
clldutils                 3.22.2
colorama                  0.4.6
colorlog                  6.8.2
contourpy                 1.3.0
csvw                      3.3.1
cycler                    0.12.1
decorator                 5.1.1
dlinfo                    1.2.1
einops                    0.8.0
einops-exts               0.0.4
filelock                  3.16.1
fonttools                 4.54.1
fsspec                    2024.9.0
huggingface-hub           0.25.2
idna                      3.10
importlib_metadata        8.5.0
importlib_resources       6.4.5
isodate                   0.6.1
Jinja2                    3.1.4
joblib                    1.4.2
jsonschema                4.23.0
jsonschema-specifications 2024.10.1
kiwisolver                1.4.7
language-tags             1.2.0
lazy_loader               0.4
librosa                   0.10.2.post1
llvmlite                  0.43.0
lxml                      5.3.0
Markdown                  3.7
MarkupSafe                3.0.1
matplotlib                3.9.2
monotonic_align           1.2
mpmath                    1.3.0
msgpack                   1.1.0
munch                     4.0.0
networkx                  3.2.1
nltk                      3.9.1
numba                     0.60.0
numpy                     2.0.2
nvidia-cublas-cu12        12.1.3.1
nvidia-cuda-cupti-cu12    12.1.105
nvidia-cuda-nvrtc-cu12    12.1.105
nvidia-cuda-runtime-cu12  12.1.105
nvidia-cudnn-cu12         9.1.0.70
nvidia-cufft-cu12         11.0.2.54
nvidia-curand-cu12        10.3.2.106
nvidia-cusolver-cu12      11.4.5.107
nvidia-cusparse-cu12      12.1.0.106
nvidia-nccl-cu12          2.20.5
nvidia-nvjitlink-cu12     12.6.77
nvidia-nvtx-cu12          12.1.105
packaging                 24.1
phonemizer                3.0.1
pillow                    10.4.0
pip                       24.2
platformdirs              4.3.6
pooch                     1.8.2
psutil                    6.0.0
pycparser                 2.22
pydub                     0.25.1
pylatexenc                2.10
pyparsing                 3.2.0
python-dateutil           2.9.0.post0
PyYAML                    6.0.2
rdflib                    7.0.0
referencing               0.35.1
regex                     2024.9.11
requests                  2.32.3
rfc3986                   1.5.0
rpds-py                   0.20.0
safetensors               0.4.5
scikit-learn              1.5.2
scipy                     1.13.1
segments                  2.2.1
setuptools                75.1.0
six                       1.16.0
soundfile                 0.12.1
soxr                      0.5.0.post1
sympy                     1.13.3
tabulate                  0.9.0
threadpoolctl             3.5.0
tokenizers                0.20.1
torch                     2.4.1
torchaudio                2.4.1
tqdm                      4.66.5
transformers              4.45.2
triton                    3.0.0
typing                    3.7.4.3
typing_extensions         4.12.2
uritemplate               4.1.1
urllib3                   2.2.3
wheel                     0.44.0
zipp                      3.20.2

To reproduce See the example in the bug description above.

Expected behavior I expect EspeakBackend to phonemize the text normally so that the test code runs to completion.

Additional context None.

mmmaat commented 1 week ago

Hi, this is most probably a bug related to the way you installed espeak. I can run your code sample without trouble. You may want to see here: espeak-ng is available on yum, but CentOS 7 is terribly outdated and you may have a very old version of espeak. The current one is espeak-ng-1.51. To upgrade it you need to compile it from sources.

XqZeppelinhead0702 commented 1 week ago

@mmmaat Thank you for the explanation! I checked, and my espeak-ng version is indeed old: 1.47.11. However, I usually run my code on a Slurm cluster and I don't have root privileges to update espeak-ng. Is there another way to update espeak-ng or work around the bug?

mmmaat commented 1 week ago

You can compile the latest version from sources; there is no need to be root for that. Once compiled, provide the path to the compiled espeak-ng shared library (typically libespeak-ng.so) in the PHONEMIZER_ESPEAK_LIBRARY environment variable. For instance, add it to your ~/.bashrc file...
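
If editing ~/.bashrc is inconvenient (for example inside a Slurm job script), the same variable can be set from Python before the backend is created. This is only a minimal sketch: the install prefix ~/local/espeak-ng and the helper names configure_espeak_library and phonemize_sample are hypothetical, so adjust the path to wherever your own build placed the library.

```python
import os
from pathlib import Path

# Hypothetical prefix of a user-level espeak-ng build; change it to your own.
ESPEAK_PREFIX = Path.home() / "local" / "espeak-ng"


def configure_espeak_library(library_path):
    """Export PHONEMIZER_ESPEAK_LIBRARY for the current process."""
    path = Path(library_path).expanduser()
    os.environ["PHONEMIZER_ESPEAK_LIBRARY"] = str(path)
    return path


def phonemize_sample(text):
    """Phonemize one sentence with the settings used in the bug report."""
    # Imported here so that the environment variable is already set
    # by the time phonemizer looks for the espeak library.
    from phonemizer.backend import EspeakBackend

    backend = EspeakBackend(language="en-us", preserve_punctuation=True, with_stress=True)
    return backend.phonemize([text])


library = configure_espeak_library(ESPEAK_PREFIX / "lib" / "libespeak-ng.so")
print(f"PHONEMIZER_ESPEAK_LIBRARY={os.environ['PHONEMIZER_ESPEAK_LIBRARY']}")

if library.is_file():
    print(phonemize_sample("I'm a test file."))
else:
    print("Library not found at that path; adjust ESPEAK_PREFIX.")
```

Setting the variable inside the process only affects that Python run, which makes it easy to switch between the system espeak-ng and a freshly compiled one while debugging.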

You can also test espeak without phonemizer, for instance enter the command espeak-ng -v en-us --ipa -x "this is a test"
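
The same check can be scripted, which helps on a compute node where you want to know whether espeak-ng itself crashes. The sketch below is an illustration, not part of phonemizer: the function name espeak_smoke_test is hypothetical, and the -q flag (no audio output) is added on top of the command above because cluster nodes usually have no sound device.

```python
import shutil
import subprocess


def espeak_smoke_test(text="this is a test", binary="espeak-ng"):
    """Run espeak-ng directly and return (returncode, stdout), or None if not found."""
    exe = shutil.which(binary)
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "-q", "-v", "en-us", "--ipa", "-x", text],
        capture_output=True,
        text=True,
    )
    # On POSIX, a negative return code means the process was killed by a
    # signal; -11 corresponds to SIGSEGV, i.e. a segmentation fault.
    return result.returncode, result.stdout.strip()


outcome = espeak_smoke_test()
if outcome is None:
    print("espeak-ng was not found on PATH")
else:
    code, output = outcome
    print(f"return code: {code}")
    print(f"output: {output}")
```

If this already fails with a negative return code, the problem lies in the espeak-ng installation rather than in phonemizer.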