as-ideas / DeepPhonemizer

Grapheme to phoneme conversion with deep learning.
MIT License

Include stresses prediction #13

Open stasbel opened 2 years ago

stasbel commented 2 years ago

Hi @cschaefer26, cool lib!

I was just wondering: is there any particular reason you don't include stress prediction in the pipeline? Both "cmudict-ipa" and "wikipron" have stress labelling included. The phoneme tokenizers from the pretrained checkpoints lack the ' and , symbols (this was probably done to avoid collisions with punctuation, but that is pretty easy to work around).

cschaefer26 commented 2 years ago

Hi stasbel,

The stresses are intentionally excluded, as they are quite hard to predict and make the overall result worse (they are also commonly excluded from benchmarks in the literature). If you want to train a model with stresses, you can simply add them to the symbols and proceed with preprocessing / training. If I have time, I will try to train a model purely on stress prediction (phonemes in, phonemes + stress out), which I believe would make the overall performance quite good.
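For illustration, here is a minimal sketch of the two ideas above: extending a phoneme symbol list with stress marks, and building (input, target) pairs for a stress-only model. It is plain Python and does not call DeepPhonemizer itself; the function names are hypothetical, and the IPA marks ˈ and ˌ are an assumption (swap in whatever stress symbols your dataset actually uses).

```python
# Assumed stress symbols: IPA primary (ˈ) and secondary (ˌ) stress marks.
# Replace these with the markers used in your own dataset.
STRESS_MARKS = ("ˈ", "ˌ")


def add_stress_symbols(phoneme_symbols, stress_marks=STRESS_MARKS):
    """Return a copy of the symbol list with any missing stress marks appended."""
    extended = list(phoneme_symbols)
    for mark in stress_marks:
        if mark not in extended:
            extended.append(mark)
    return extended


def strip_stress(transcription, stress_marks=STRESS_MARKS):
    """Remove all stress marks from a phoneme transcription."""
    return "".join(ch for ch in transcription if ch not in stress_marks)


def make_stress_pair(transcription, stress_marks=STRESS_MARKS):
    """Build a training pair for a hypothetical stress-only model:
    unstressed phonemes as input, the original stressed phonemes as target."""
    return strip_stress(transcription, stress_marks), transcription
```

The extended list would then go wherever your training config defines its phoneme symbols, followed by a fresh preprocessing and training run.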

stasbel commented 2 years ago

This is very interesting, as stresses are very important for a number of tasks. Looking forward to hearing from you!

lorinczb commented 2 years ago

Hi @cschaefer26, I have added the numbers (that mark the stress) to the symbols as you suggested above, and changed the list of phones to include the phones with accents, but at prediction time I still get the unaccented phones. Sorry, I have not spent a lot of time looking into the model, but maybe you have a hint on why that might happen.

cschaefer26 commented 2 years ago

Hi, did you preprocess the data with the updated config and train a new model? You could check whether the processed data looks correct in datasets/combined_dataset.txt
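One rough way to do that check is to count how many lines of the processed file still contain a stress marker. The sketch below is plain Python with a hypothetical helper name; it makes no assumption about the column layout of the file and only looks for the marker characters anywhere in each line.

```python
def count_stressed_lines(lines, stress_marks=("ˈ", "ˌ")):
    """Count non-empty lines and how many of them contain a stress marker.

    Returns a (stressed, total) tuple. The default markers are an assumption;
    pass your own, e.g. ("0", "1", "2") for numeric stress labels.
    """
    stressed = 0
    total = 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        total += 1
        if any(mark in line for mark in stress_marks):
            stressed += 1
    return stressed, total


# Example usage (path taken from the comment above):
# with open("datasets/combined_dataset.txt", encoding="utf-8") as f:
#     stressed, total = count_stressed_lines(f)
#     print(f"{stressed} of {total} lines contain a stress marker")
```

If the count comes back as zero, the stress markers were most likely filtered out during preprocessing, and the model never saw them during training. Note that with numeric markers, digits elsewhere in a line would also be counted, so treat the result as a rough check.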