xinjli / transphone

phoneme tokenizer and grapheme-to-phoneme model for 8k languages
MIT License

Stress a vowel manually #12

Open NikitaKononov opened 1 year ago

NikitaKononov commented 1 year ago

Hello! Thanks for your exciting work! Is it possible to stress a vowel manually with your phonemizer? For example: alibab'a / alib'aba, з'амок / зам'ок. And does your phonemizer support stress at all?
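To make the notation in the examples concrete: the apostrophe is placed directly before the stressed vowel. Below is a minimal sketch of how such marked input could be split into the plain word and the position of the stressed vowel. The function name `split_stress` and the vowel set are assumptions made for illustration; nothing here is part of transphone.

```python
# Hypothetical helper, not part of transphone: parse "alib'aba"-style input.
# Assumption: the marker sits immediately before the stressed vowel.
VOWELS = set("aeiouyаеёиоуыэюя")  # small Latin + Russian vowel set, for illustration only


def split_stress(marked, marker="'"):
    """Return (plain_word, index_of_stressed_vowel), or (plain_word, None) if unmarked."""
    if marked.count(marker) > 1:
        raise ValueError("expected at most one stress marker")
    pos = marked.find(marker)
    plain = marked.replace(marker, "")
    if pos == -1:
        return plain, None
    # After the marker is removed, the stressed vowel sits at the marker's old index.
    if pos >= len(plain) or plain[pos].lower() not in VOWELS:
        raise ValueError("stress marker must directly precede a vowel")
    return plain, pos
```

With this sketch, `alibab'a` and `alib'aba` give the same plain word with different stress positions, which is exactly the distinction the question is about.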

xinjli commented 1 year ago

Thanks for your question!

I did not include stress symbols during training, so currently the model will not predict any stress. For a few languages, I think stress information is in the training set, but I removed it before training. Depending on the language, it might be possible to support stress in the future.

Can you tell me what your application for stress is, and which languages you want to apply it to?
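For readers wondering what removing stress before training might look like, here is a minimal sketch. It assumes the training data uses the standard IPA stress marks (primary `ˈ`, U+02C8, and secondary `ˌ`, U+02CC); the function name `strip_stress` is hypothetical and this is not the actual transphone preprocessing code.

```python
# Illustrative only: drop IPA stress marks from a phoneme sequence.
# Assumption: stress appears as U+02C8 (primary) or U+02CC (secondary),
# either attached to a phoneme or as a token of its own.
STRESS_MARKS = {"\u02c8", "\u02cc"}


def strip_stress(phonemes):
    """Return the phoneme list with stress marks removed."""
    cleaned = []
    for phoneme in phonemes:
        bare = "".join(ch for ch in phoneme if ch not in STRESS_MARKS)
        if bare:  # skip tokens that consisted only of a stress mark
            cleaned.append(bare)
    return cleaned
```

Supporting stress would then roughly mean skipping this kind of step for languages whose data carries the marks.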

NikitaKononov commented 1 year ago

Thanks for the quick response

> Can you tell me what your application for stress is, and which languages you want to apply it to?

If your solution would support manual stress setting, I would use it for the following tasks:

  1. Phonemizing the input texts of text-to-speech models
  2. Phonemizing text for training a phoneme-level BERT (also for text-to-speech tasks)

Languages: English and Slavic languages (Polish, Russian, etc.).

Manual stress correction is very important for correctly phonemizing proper names in English. For Slavic languages it is critical: the same sequence of characters can have different meanings depending on the stress. Unfortunately, older tools like espeak don't provide a way to control stress manually.

xinjli commented 1 year ago

For English, I think it is possible to train a model that supports stress, since stress is included in the CMU dictionary. I am not sure whether Slavic languages have such annotations; it would be difficult if there are no datasets containing stress. Do you know of any pronunciation datasets that include stress?
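As background on the CMU dictionary: it encodes stress as a digit appended to each vowel phone (0 for no stress, 1 for primary, 2 for secondary). The sketch below shows one way to read that digit from a dictionary line. The function name `parse_cmudict_line` is hypothetical, and the sketch ignores comment lines and alternate-pronunciation markers such as `WORD(1)`.

```python
# Illustrative parser for one CMUdict-style line such as "HELLO  HH AH0 L OW1".
# Assumption: fields are whitespace-separated and vowels end in a stress digit.
def parse_cmudict_line(line):
    """Return (word, [(phone, stress_or_None), ...])."""
    word, *phones = line.split()
    parsed = []
    for phone in phones:
        if phone[-1].isdigit():
            # Vowel: split the ARPAbet symbol from its stress digit.
            parsed.append((phone[:-1], int(phone[-1])))
        else:
            # Consonant: carries no stress annotation.
            parsed.append((phone, None))
    return word, parsed
```

A model that supports stress would need these digits mapped to some stress symbol in the phoneme inventory rather than discarded.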

NikitaKononov commented 1 year ago

> For English, I think it is possible to train a model that supports stress, since stress is included in the CMU dictionary. I am not sure whether Slavic languages have such annotations; it would be difficult if there are no datasets containing stress. Do you know of any pronunciation datasets that include stress?

I can try to find some. Can you please give an example excerpt of the dataset? Should it be in word / phoneme form? And should stress be marked with a character such as + or '?

xinjli commented 1 year ago

It should be something like the following format, with fields delimited by spaces:

word phoneme1 phoneme2 stress1 phoneme3

Currently, all non-IPA symbols are removed, so some code changes are needed to include your stress symbol. I think you can assign whatever character you think is appropriate.
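A minimal sketch of reading one line in the space-delimited format above while keeping the stress information instead of filtering it out. Everything here is an assumption for illustration: the function name `parse_entry`, the choice of `'` as the stress symbol, and the convention that a stress token applies to the phoneme that follows it. It is not transphone code.

```python
# Hypothetical reader for "word phoneme1 phoneme2 stress1 phoneme3" lines.
# Assumption: the stress symbol is its own token and marks the NEXT phoneme.
STRESS_SYMBOL = "'"  # any character not used as a phoneme would do


def parse_entry(line, stress_symbol=STRESS_SYMBOL):
    """Return (word, phonemes, stressed_indices) for one lexicon line."""
    word, *tokens = line.split()
    phonemes = []
    stressed = []  # indices into `phonemes` that carry stress
    pending = False
    for token in tokens:
        if token == stress_symbol:
            pending = True  # remember to mark the next phoneme
            continue
        if pending:
            stressed.append(len(phonemes))
            pending = False
        phonemes.append(token)
    return word, phonemes, stressed
```

For a toy entry such as `alibaba a l i b ' a b a` (letters standing in for phonemes), this yields the seven phonemes plus the index of the stressed one, so the stress symbol never has to pass an IPA-only filter.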