as-ideas / DeepPhonemizer

Grapheme to phoneme conversion with deep learning.
MIT License

Train sentences #16

Frank995 opened 2 years ago

Frank995 commented 2 years ago

Hi. I was able to train an Italian model almost perfectly, with the exception of a few words that are intrinsically ambiguous without context. Since your model is similar to the BERT transformer, what do you think would be the best solution to let the model learn words with context? Would passing whole sentences be enough, or should an MLM (masked language modeling) objective be implemented?

cschaefer26 commented 2 years ago

Hi, glad it's been working for you. If there is a lot of ambiguity, I would say it could work if you feed in whole sentences, albeit that is quite memory-hungry. You could also try the autoregressive model and feed each word plus its context as input and just that word's phonemes as target (make sure you use some word separator for the context).
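The word-plus-context idea above could be sketched as follows. This is a minimal, library-independent sketch: the separator symbol, the window size, and the Italian phoneme strings are illustrative assumptions, not part of DeepPhonemizer itself (the separator would need to be added to your config's symbol set).

```python
SEP = "|"  # hypothetical word-separator symbol; add it to your config's text symbols

def context_pairs(words, phonemes, window=1):
    """Yield (grapheme_input, phoneme_target) pairs where the input is the
    target word plus up to `window` neighbouring words on each side, joined
    by SEP, and the target is that word's phonemes alone."""
    pairs = []
    for i, (w, p) in enumerate(zip(words, phonemes)):
        left = words[max(0, i - window):i]
        right = words[i + 1:i + 1 + window]
        graphemes = SEP.join(left + [w] + right)
        pairs.append((graphemes, p))
    return pairs

# Example with an Italian sentence where 'pesca' (peach vs. fishing) is
# ambiguous without context; phoneme strings are illustrative only:
words = ["la", "pesca", "è", "matura"]
phons = ["la", "ˈpɛska", "ˈɛ", "maˈtuːra"]
for g, p in context_pairs(words, phons):
    print(g, "->", p)
```

The model still only has to emit the phonemes of one word per sample, so the target length stays short while the encoder sees enough context to disambiguate.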

Jueun0505 commented 1 year ago

Hi, Thanks a lot for your work!

Related to this issue, I would like some advice on training my model. Specifically, I want to see whether a transformer/autoregressive transformer model can learn liaison in French. To this end, I generated training data where each line lists one or two words as graphemes and their corresponding phonemic transcription as phonemes (e.g., line 1: Nous / nu; line 2: Nous étions / nuz etjɔ̃). Here you can see that the liaison /z/ on the word 'nous' surfaces only in a certain context, in this case before the /e/ of the following word.
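For concreteness, the kind of entries described above might look like this as training tuples. The (language, graphemes, phonemes) shape is assumed from DeepPhonemizer's documented preprocessing input; the third entry ('Nous parlons', with no liaison before a consonant) is an illustrative addition, so verify the format against your version of the library.

```python
# Hypothetical liaison training tuples; transcriptions are illustrative.
train_data = [
    ("fr", "Nous", "nu"),
    ("fr", "Nous étions", "nuz etjɔ̃"),   # liaison /z/ surfaces before a vowel
    ("fr", "Nous parlons", "nu paʁlɔ̃"),  # no liaison before a consonant
]
```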

I have trained both models, and for some reason they fail to transcribe liaison. E.g., 'je vous en prie.' is transcribed by the trained model as 'ʒə vu ɑ pʁi', whereas the ground truth is 'ʒə- vuz ɑ̃ pʁˈi', with the /z/ in it. The sentence is, in fact, taken from the training dataset, which means the model has already seen it.

I updated the text/phoneme symbols in the config file and decreased the batch size to 16. Other than that, everything remained the same. I also checked the number of liaison occurrences to address a potential imbalance between cases with and without liaison.
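One way to sanity-check that balance, sketched below under the assumption that liaison in this dataset always shows up as a word-final /z/ on the first word's transcription (which holds for 'nous'-style entries but would need adapting for /t/ or /n/ liaison):

```python
def liaison_ratio(pairs):
    """Fraction of multi-word (graphemes, phonemes) entries whose first
    word's phonemes end in the liaison consonant /z/ (a heuristic that
    only covers z-liaison)."""
    multi = [(g, p) for g, p in pairs if " " in g]
    with_liaison = [x for x in multi if x[1].split()[0].endswith("z")]
    return len(with_liaison) / len(multi) if multi else 0.0

# Illustrative entries only:
pairs = [
    ("Nous", "nu"),
    ("Nous étions", "nuz etjɔ̃"),
    ("Nous parlons", "nu paʁlɔ̃"),
]
print(liaison_ratio(pairs))  # 0.5
```

If this ratio is very small, the model may simply be learning to drop the liaison consonant because the no-liaison form dominates the data.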

Do you think I am missing something, such that the trained model fails to pick up the context needed to produce liaison at all?

Thanks in advance for your advice.