bootphon / phonemizer

Simple text to phones converter for multiple languages
https://bootphon.github.io/phonemizer/
GNU General Public License v3.0

Maximum recursion depth exceeded with --preserve-punctuation and long documents #108

Closed by jncasey 2 years ago

jncasey commented 2 years ago

Describe the bug
If you try to phonemize a long document (say, a Project Gutenberg ebook) while preserving punctuation, phonemize fails with a fatal error: maximum recursion depth exceeded in comparison

Phonemizer version
phonemizer-3.0.1
available backends: espeak-1.48.3, segments-2.2.0
uninstalled backends: espeak-mbrola, festival

System
Both macOS 11.6 and Ubuntu 20.10

To reproduce
phonemize --preserve-punctuation pg67147.txt
where pg67147.txt is this ebook, or anything relatively long.

Expected behavior
This looks like the result of the Punctuation restore methods being recursive, combined with my (probably unreasonably long) input.
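To illustrate the suspected failure mode, here is a minimal sketch of a restore function that recurses once per punctuation mark. It is not phonemizer's actual code: restore_recursive and try_restore are hypothetical names, and the only assumption is that the real methods recurse roughly once per mark. With more marks than Python's recursion limit (1000 by default), the call raises RecursionError.

```python
import sys


def restore_recursive(chunks, marks, index=0):
    """Hypothetical recursive restore: re-attach one mark per call.

    `chunks` are the pieces of text between punctuation marks and
    `marks` are the marks themselves, in document order.
    """
    if index >= len(marks):
        # No marks left: append whatever text remains.
        return "".join(chunks[index:])
    # Consume one chunk and one mark, then recurse on the remainder.
    return chunks[index] + marks[index] + restore_recursive(chunks, marks, index + 1)


def try_restore(n_marks):
    """Return "ok" if restoring n_marks marks succeeds, else the error name."""
    chunks = ["word"] * n_marks
    marks = [", "] * n_marks
    try:
        restore_recursive(chunks, marks)
        return "ok"
    except RecursionError:
        return "RecursionError"


if __name__ == "__main__":
    print(try_restore(10))
    # A document with more marks than the interpreter's recursion limit.
    print(try_restore(sys.getrecursionlimit() + 100))
```

Raising the limit with sys.setrecursionlimit would only move the threshold, so a long enough document would still fail.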

Additional context
I'm happy to try refactoring the restore methods into a single iterative method. Just let me know if you'd like me to contribute.
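As a rough idea of what an iterative version could look like, the sketch below splits text on punctuation and re-attaches the marks in a single loop. It is only an illustration under stated assumptions, not the refactoring itself: split_punctuation, restore_iterative, and the PUNCT pattern (including its set of marks) are hypothetical and do not come from phonemizer.

```python
import re

# Assumed set of punctuation marks, with any trailing whitespace kept as
# part of the mark so the original spacing can be restored exactly.
PUNCT = re.compile(r"[;:,.!?]+\s*")


def split_punctuation(text):
    """Split text into (chunks, marks); len(chunks) == len(marks) + 1."""
    chunks, marks, pos = [], [], 0
    for match in PUNCT.finditer(text):
        chunks.append(text[pos:match.start()])
        marks.append(match.group())
        pos = match.end()
    chunks.append(text[pos:])  # text after the last mark (may be empty)
    return chunks, marks


def restore_iterative(chunks, marks):
    """Interleave chunks and marks with a loop instead of recursion."""
    out = []
    for i, chunk in enumerate(chunks):
        out.append(chunk)
        if i < len(marks):
            out.append(marks[i])
    return "".join(out)


if __name__ == "__main__":
    chunks, marks = split_punctuation("Hello, world! How are you?")
    # Stand-in for phonemization: transform each chunk, then restore marks.
    print(restore_iterative([c.upper() for c in chunks], marks))
```

Because the loop keeps no per-mark call frame, the number of marks is bounded only by memory, not by the interpreter's recursion limit.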

spolezhaev commented 2 years ago

Please open a PR. I have the same issue, and your refactoring seems to fix it.

jncasey commented 2 years ago

My previous PR for this project (#103, to add a simple new feature) is still pending, and I did my refactoring of the punctuation restore method on a branch off of that work. My git abilities are kind of rusty, so I'm worried that I'd mess something up if I tried to make a new PR independent from my other one. So hopefully the maintainers will review my first PR soon and then I can proceed with my fix for this.