Closed tdeboissiere closed 3 years ago
Hi, not at all. This is a lossy operation, because homophones (different words with the same pronunciation) end up with the same phonemes...
Sorry, my statement was not clear.
The problem assumes that you have both the input text and the IPA transcription. In that case it should not be impacted by homophones.
For instance:
input text: sell cell
output IPA: sˈɛl sˈɛl
-> we know the first sˈɛl corresponds to `sell`, and the second to `cell`
Oh OK! So you need a function taking (text, phonemized_text, separator) as input and outputting a dict word: phonemized_word.
Well, in this case this is surely possible, but I never implemented it. I suspect some tricky corner cases, for instance with numbers, or if espeak "eats" a word or pronounces it in another language, etc.
I do not plan to implement it myself but if you do, you're welcome to do a PR or share it here ;)
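For anyone landing here, a minimal sketch of such a function in plain Python. This is not part of phonemizer: `align_words` and its parameters are hypothetical names, and it assumes the phonemized string contains exactly one chunk per whitespace-separated input word. It returns a list of pairs rather than a dict so that repeated words are kept, and it raises instead of guessing when the counts differ (e.g. if espeak merged or dropped a word).

```python
def align_words(text, phonemized_text, word_separator=" "):
    """Pair each input word with its phonemized form, by position.

    Hypothetical helper, not part of phonemizer. Assumes one phonemized
    chunk per whitespace-separated input word.
    """
    words = text.split()
    # Drop empty chunks left by leading, trailing, or repeated separators.
    phonemized_words = [p for p in phonemized_text.split(word_separator) if p]

    if len(words) != len(phonemized_words):
        # Happens when the backend merges or drops words, or expands
        # one token (such as a number) into several spoken words.
        raise ValueError(
            f"cannot align {len(words)} words with "
            f"{len(phonemized_words)} phonemized chunks"
        )

    # A list of pairs (not a dict) so repeated words are preserved in order.
    return list(zip(words, phonemized_words))


pairs = align_words("sell cell", "sˈɛl sˈɛl")
print(pairs)  # [('sell', 'sˈɛl'), ('cell', 'sˈɛl')]
```

The positional pairing only holds when the word counts match, so the corner cases mentioned above (numbers, merged or skipped words) surface as a `ValueError` and still need separate handling.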
Thanks, will let you know if I ever implement it!
@tdeboissiere were you able to implement this method? How did you deal with espeak merging short words like "for the" when phonemizing?
Hi,
Is there a straightforward way to map each phoneme back to the word it belongs to?
Example: