bootphon / phonemizer

Simple text to phones converter for multiple languages
https://bootphon.github.io/phonemizer/
GNU General Public License v3.0
1.18k stars 165 forks source link

Joss review #92

Closed henrykironde closed 2 years ago

henrykironde commented 2 years ago

@mmmaat, I think it is a great tool and a well-packaged software.

These are some recommendations(Not required):

Qns What would you respond if asked about the performance of the tool? Are there some limitations other than the number of supported languages or some edge cases like encoding, conflict resolutions?

Requested changes:

ref: openjournals/joss-reviews#3958

mmmaat commented 2 years ago

Hi Henry,

Thank you very much for taking time to review phonemizer, for your comments and recommendations. Let me answer point by point. The code version I refeer to is on the joss branch.

Recommendations

About the performance of the tool?

In term of processing speed it depends on which backend is used. Segments is pure Python and reasonably fast. Espeak backend is very fast as it is binded directly to the C library using ctypes (before phonemizer-3.0 it was based on subprocess, calling the espeak binary for each sentence, it was about 30 to 40 times slower than now...). The espeak-mbrola backend is a bit slower because it needs to go through speech wave generation (which is thrown away behind the scene). Finally festival is very slow because it is based on subprocess.

About the phonemizer implementation by itself, performance was not an objective during development. I was focused on stability and usability. For instance the punctuation preservation algorithm could be optimized. But this is not a critical section of the pipeline.

Are there some limitations?

The main limitation I see it that phonemizer depends on the bakend quality. For instance, espeak (which is by far the most used backend) introduces some important drawbacks: it can switch from one language to another during the phonemization of a single sentence, or can merge two words together, or even drop away some words. At the phonemizer level there is nothing we can do except to detect those mistakes and warn the user. Actually most of the issues I get on github are actually issues with espeak, not phonemizer.

Requested changes

Best, Mathieu

henrykironde commented 2 years ago

Closing since all issues are covered. Thanks @mmmaat

henrykironde commented 2 years ago

@mmmaat, I have finished the review, all instructions work fine and easy to follow. All my test cases/scenarios worked fine as expected 👍🏿. One thing that you may want to consider is building the API docs for the whole source in the future but I understand the challenges.

mmmaat commented 2 years ago

Thanks! I know it would be a serious improvement to have a dedicated readthedocs page, it is planned as a middle-term objective.