Closed: ctlaltdefeat closed this issue 3 years ago
Hi, you're right, calling espeak-ng as a subprocess is far from an ideal solution... but it is the easy one.
The better way would be to use the espeak shared library through a C/Python wrapper. This would allow the espeak-related code to be loaded only once, at import time in the phonemizer, instead of once per utterance as in the current implementation...
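To make the idea concrete, here is a minimal, hedged sketch of such a wrapper using ctypes over libespeak-ng.so. It assumes the standard espeak-ng C API (`espeak_Initialize`, `espeak_SetVoiceByName`, `espeak_TextToPhonemes`) and the documented `phonememode` bit layout; the constants and the library name lookup are assumptions, not the phonemizer's actual implementation:

```python
import ctypes
import ctypes.util

AUDIO_OUTPUT_SYNCHRONOUS = 2   # no audio playback needed, text processing only
espeakCHARS_UTF8 = 1
espeakPHONEMES_IPA = 0x02      # bit 1 set: IPA output instead of ascii mnemonics

def phoneme_mode(separator=None):
    """Build the phonememode argument: bits 8-23 hold an optional
    separator character inserted between phoneme names."""
    mode = espeakPHONEMES_IPA
    if separator:
        mode |= ord(separator) << 8
    return mode

def load_espeak(voice="en-us"):
    """Load libespeak-ng once (i.e. at import time) and return a
    phonemize(text) function that reuses the initialized library."""
    lib_path = ctypes.util.find_library("espeak-ng") or "libespeak-ng.so"
    lib = ctypes.cdll.LoadLibrary(lib_path)
    lib.espeak_TextToPhonemes.restype = ctypes.c_char_p
    lib.espeak_Initialize(AUDIO_OUTPUT_SYNCHRONOUS, 0, None, 0)
    lib.espeak_SetVoiceByName(voice.encode("utf-8"))

    def phonemize(text, separator=None):
        text_ptr = ctypes.c_char_p(text.encode("utf-8"))
        mode = phoneme_mode(separator)
        clauses = []
        # espeak_TextToPhonemes processes one clause per call and
        # advances the pointer; it becomes NULL once all text is consumed.
        while text_ptr.value:
            clause = lib.espeak_TextToPhonemes(
                ctypes.byref(text_ptr), espeakCHARS_UTF8, mode)
            if clause:
                clauses.append(clause.decode("utf-8"))
        return " ".join(clauses)

    return phonemize
```

The key point is that `espeak_Initialize` is paid once per process rather than once per utterance, which is exactly the overhead the subprocess approach cannot avoid.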
Just as an update: calling espeak-ng in parallel has not turned out to be a good solution for me. The latency of each call is rather volatile for reasons I do not understand (so the latency of the worst call is a lower bound on the total), and in addition multiprocessing introduces its own overhead. I'm now trying to work around this by creating persistent processes and communicating with them via stdin and stdout, but espeak doesn't seem to play well with stdin and stdout, so I'm not having much success.
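For reference, the persistent-process pattern being attempted looks roughly like the sketch below. It uses `cat` as a stand-in command that echoes each line back immediately; espeak-ng itself tends to buffer or consume stdin until EOF, which is presumably why the same pattern fails there:

```python
import subprocess

class PersistentWorker:
    """Keep one long-lived child process and exchange lines over its
    pipes, instead of paying startup cost on every utterance."""

    def __init__(self, cmd=("cat",)):  # "cat" is a stand-in, not espeak-ng
        self.proc = subprocess.Popen(
            cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
            text=True, bufsize=1)      # line-buffered writes on our side

    def request(self, line):
        self.proc.stdin.write(line + "\n")
        self.proc.stdin.flush()        # force the line out immediately
        return self.proc.stdout.readline().rstrip("\n")

    def close(self):
        self.proc.stdin.close()
        self.proc.wait()

worker = PersistentWorker()
print(worker.request("hello"))  # -> hello
worker.close()
```

This only works when the child writes a response per input line without waiting for EOF, which is the property espeak-ng apparently lacks.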
See https://github.com/rhasspy/espeak-phonemizer for a wrapper of libespeak-ng.so using ctypes.
I have tried to run the espeak-phonemizer above, but it seems I get an error:
Segmentation fault (core dumped)
I'm working on integrating ctypes for the espeak backend. Drastic speed improvements. Release in a few days...
This is now implemented in the master branch; feel free to try it and let me know if you have any remarks.
The method used to preserve punctuation for espeak-ng leads to runtime that scales linearly with the number of punctuation marks, because each punctuation split triggers another call to espeak-ng. Unfortunately, espeak-ng has rather significant per-call overhead, so this compounds and substantially affects downstream applications.
On my machine:
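The scaling behaviour described above can be sketched with a stub backend that merely counts invocations (`fake_espeak` and the splitting logic are illustrative, not phonemizer's actual code):

```python
import re

calls = []  # one entry per simulated espeak-ng invocation

def fake_espeak(chunk):
    """Stand-in for a real espeak-ng call; records each invocation."""
    calls.append(chunk)
    return chunk.lower()

def phonemize_preserving_punctuation(text, marks=".,;:!?"):
    """Mimic the punctuation-preserving strategy: split the text on
    punctuation and phonemize each piece with a separate backend call,
    so the number of calls grows linearly with the number of marks."""
    pattern = "([" + re.escape(marks) + "])"
    pieces = [p for p in re.split(pattern, text) if p.strip()]
    out = []
    for piece in pieces:
        if piece in marks:
            out.append(piece)          # punctuation passes through as-is
        else:
            out.append(fake_espeak(piece))
    return "".join(out)

phonemize_preserving_punctuation("Hello, world. Bye!")
len(calls)  # three text chunks -> three backend calls
```

With real espeak-ng, each of those per-chunk calls also pays process startup, which is where the overhead compounds.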
Given the difficulty of understanding the internals of espeak-ng, I suggest that an initial way to combat this is calling espeak-ng in parallel within `_phonemize_aux` and then merging the results (perhaps in a way that respects `njobs`).