bootphon / phonemizer

Simple text to phones converter for multiple languages
https://bootphon.github.io/phonemizer/
GNU General Public License v3.0

Where/How to get a full list of phone set used for each language? #131

Closed (XuesongYang closed 2 years ago)

XuesongYang commented 2 years ago

Thank you for the great tool. I am using it for G2P-like conversion, but I could not find a complete list of the phone set used for each language. I know that the backend engine, such as espeak, predefines such a phone set. I wonder if we could get it through phonemizer directly. Thanks.

hadware commented 2 years ago

As of now, this isn't possible. If you're using espeak as a phonemizer backend, you might want to look into their documentation.

@mmmaat do you concur?

mmmaat commented 2 years ago

Hi, indeed the explicit list of phonemes is not provided. For espeak, it seems to be stored in a language-specific file inside the phsource folder (for French, for example). It should not be too hard to write a parser for that file format to extract the list of phonemes.
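
A minimal sketch of such a parser, in Python. It assumes each phoneme definition in a phsource file opens with a line of the form `phoneme <name>` and closes with `endphoneme`, and that `//` starts a comment; check this against the actual files before relying on it. The function name `extract_phonemes` and the `SAMPLE` text are made up for illustration and are not part of phonemizer or espeak. The names found this way are likely espeak's own mnemonics rather than IPA symbols.

```python
import re

# Assumed format: a definition starts with "phoneme <name>".
# Requiring whitespace after "phoneme" skips "phonemetable" and "endphoneme".
PHONEME_DEF = re.compile(r"^\s*phoneme\s+(\S+)")


def extract_phonemes(text):
    """Return phoneme names in order of first appearance, without duplicates."""
    names = []
    for line in text.splitlines():
        # Drop a trailing "//" comment (assumed comment syntax).
        line = line.split("//", 1)[0]
        match = PHONEME_DEF.match(line)
        if match and match.group(1) not in names:
            names.append(match.group(1))
    return names


# Invented sample in the assumed format, NOT real espeak data.
SAMPLE = """
phonemetable demo base

phoneme a   // open vowel
  vowel starttype #a endtype #a
  length 180
endphoneme

phoneme r/
  liquid
endphoneme
"""

if __name__ == "__main__":
    print(extract_phonemes(SAMPLE))
```

To use it on a real file, read the file's contents into a string and pass it to `extract_phonemes`. A language file may inherit phonemes from a base table, so the result for a single file may not be the full inventory for that language.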