Closed frank269 closed 3 years ago
Hi @frank269 , I used Montreal Forced Aligner to create textgrid files. Visit https://montreal-forced-aligner.readthedocs.io/en/latest/ for more information.
Thanks for your response.
I used MFA 2.0 to align the text, but it threw an error. How can I generate a lexicon for Vietnamese? This is the error: montreal_forced_aligner.exceptions.PronunciationAcousticMismatchError: There were phones in the dictionary that do not have acoustic models: a, e, i, u, y, ê, ề
The pretrained acoustic model does not include these phones. In this case, you have to train your own acoustic model. See https://montreal-forced-aligner.readthedocs.io/en/latest/aligning.html#align-using-only-the-data-set for more information.
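To see exactly which phones the lexicon uses before deciding whether a pretrained model can cover them, a small script along these lines may help. It is only a sketch: it assumes the plain lexicon layout of one entry per line, a word followed by whitespace-separated phones, with no probability column. The function names and the `model_phones` argument are hypothetical; you would supply the model's phone set yourself.

```python
from pathlib import Path


def lexicon_phones(lexicon_path):
    """Collect every phone used in a lexicon file.

    Assumes each non-empty line is: WORD PHONE1 PHONE2 ...
    (no pronunciation-probability column).
    """
    phones = set()
    for line in Path(lexicon_path).read_text(encoding="utf-8").splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        phones.update(parts[1:])
    return phones


def missing_phones(lexicon_path, model_phones):
    """Return the lexicon phones absent from `model_phones`, sorted."""
    return sorted(lexicon_phones(lexicon_path) - set(model_phones))
```

For example, `missing_phones("MFA/lexicon.txt", model_phones)` would list the phones that have no counterpart in whatever phone set you pass in.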
I ran ./bin/mfa_train_and_align MFA/dataset MFA/lexicon.txt MFA/aligned to train and align on the first 6 files and the lexicon file from the InfoRe data. However, only the first file gets a correct TextGrid; the other files come out with wrong data. Where did I go wrong?
To train an MFA model, you need a lexicon file and a wav/text data directory.
The wav/text data directory includes all your audio clips and the transcript files. Each A.wav clip requires an A.txt transcript file in the same directory.
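A quick way to check that layout before training is to compare file stems. The following is a minimal sketch, assuming all clips and transcripts sit directly in one flat directory with lowercase .wav and .txt extensions; `find_unpaired` is a hypothetical helper name, not part of MFA.

```python
from pathlib import Path


def find_unpaired(data_dir):
    """Return (wavs_without_txt, txts_without_wav) as sorted lists of stems.

    Only looks at files directly inside `data_dir` (no subdirectories).
    """
    data_dir = Path(data_dir)
    wav_stems = {p.stem for p in data_dir.glob("*.wav")}
    txt_stems = {p.stem for p in data_dir.glob("*.txt")}
    return sorted(wav_stems - txt_stems), sorted(txt_stems - wav_stems)
```

Calling `find_unpaired("MFA/dataset")` should return two empty lists when every clip has a matching transcript.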
Yes, I used the first 6 files and the lexicon file from the InfoRe dataset you provided, and I manually created a transcript file for each of the 6 audio clips. But when I run the train command, the output for the first file is correct while the others are wrong. Here are the results for the first 6 files: 1.zip
It is possible that your dataset is too small, so MFA cannot learn a useful model from that little data.
This is a notebook that I used to align the InfoRe data with a slightly different phoneme set: https://colab.research.google.com/gist/NTT123/95b12ca42a4bdd1a856aba0fbb0f8936/infore-mfa-tutorial.ipynb
Oh, thank you so much!
I am creating TextGrid files for my dataset. Can you guide me on how to create them, or point me to some information? Thank you so much!