Syllabification Module Adjustment

ikhfa commented 7 years ago

I have a small problem with the syllabification algorithm. As a quick example, in my language (Indonesian), the word 'maaf' is pronounced 'm a0 a0 f' and should be syllabified as 'm+a0 | a0+f'. But the current syllabification module syllabifies it as 'm+a0+a0_f'. How can I adjust this? It results in 'Merge of alignment and contexts failed' during the alignment process, since the number of phones in the contexts and in the alignment do not match. Thanks.

ikhfa commented 7 years ago

If not, it would be nice if we could use some kind of lexicon for syllabification :)

ghost commented 7 years ago

Same problem here. I changed the audio database to my language, along with the transcriptions (labels). The result was an error in the alignment. How can I adapt the language model to another language, or at least the format? Thanks.

matthewaylett commented 7 years ago

In order for syllabification to work you have two options:

1. Use the lex pron attribute to force syllabification:

e.g. maaf

2. Add a lexicon and syllabification rules for the new language. This is really how you should do it.

All data for the English GA accent is stored in idlak-data/en/ga

en is the ISO standard code for English; ga is a non-standard two-letter code for the accent.

So for Indonesian it would be:

id/id - assuming just one accent :-)

You can copy all the English data there and start editing. lexicon-default.xml and sylmax-default.xml are the lexicon and the maximal onset rules (the latter let you specify valid syllable nuclei and onsets).
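A minimal sketch of that copy step, assuming the idlak-data/<language>/<accent> layout described above. The function name copy_accent_data and its parameters are made up for illustration and are not part of idlak.

```python
import shutil
from pathlib import Path


def copy_accent_data(data_root, src=("en", "ga"), dst=("id", "id")):
    """Seed a new language/accent directory from an existing one.

    Hypothetical helper: data_root is the path to idlak-data, src and dst
    are (language, accent) pairs. Returns the new directory.
    """
    root = Path(data_root)
    src_dir = root.joinpath(*src)
    dst_dir = root.joinpath(*dst)
    if not src_dir.is_dir():
        raise FileNotFoundError(f"no source data at {src_dir}")
    # copytree creates missing parent directories and raises
    # FileExistsError if dst_dir already exists, so files that were
    # already edited are never overwritten.
    shutil.copytree(src_dir, dst_dir)
    return dst_dir
```

Calling copy_accent_data("idlak-data") would give you idlak-data/id/id with copies of lexicon-default.xml and sylmax-default.xml, ready to be edited for Indonesian.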

Syllabification can be overridden in the lexicon in the same way as with the pron tag if max onset is not good for you.

Looking at it, there is a bug in max onset (it's not triggering a syllable boundary between repeating nuclei), which I will look at shortly.

v best

Matthew

matthewaylett commented 7 years ago

There was a bug that skipped a second nucleus; this is now fixed.

By editing sylmax-default.xml you can specify which phones are syllabic and list a set of valid nuclei, either with or without stress (for languages which have nuclei longer than one phone).

The algorithm finds valid nuclei, then runs back to find a valid onset, then regards what's left as a coda.
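The three steps above can be sketched roughly as follows. This is a toy illustration, not the idlak implementation: the nuclei and onsets here are plain Python sets chosen for the example rather than anything read from sylmax-default.xml, and only single-phone nuclei are handled.

```python
def syllabify(phones, nuclei, onsets):
    """Split a phone sequence into syllables with a simple max-onset rule.

    phones: sequence of phone names
    nuclei: set of phones that can act as a syllable nucleus (assumed)
    onsets: set of tuples of phones that form a valid onset (assumed)
    """
    phones = list(phones)
    nucleus_idx = [i for i, p in enumerate(phones) if p in nuclei]
    if not nucleus_idx:
        return [phones]  # nothing syllabic: keep everything as one unit

    starts = []
    prev_nucleus = -1
    for i in nucleus_idx:
        # Run back from the nucleus, extending the onset one phone at a
        # time for as long as the extended cluster is still a valid onset
        # and we do not cross the previous nucleus.
        start = i
        while start - 1 > prev_nucleus and tuple(phones[start - 1:i]) in onsets:
            start -= 1
        starts.append(start)
        prev_nucleus = i
    starts[0] = 0  # leading phones always join the first syllable

    # Whatever sits between a nucleus and the next onset is the coda.
    bounds = starts + [len(phones)]
    return [phones[a:b] for a, b in zip(bounds, bounds[1:])]


print(syllabify(["m", "a0", "a0", "f"], {"a0"}, {("m",), ("f",)}))
# [['m', 'a0'], ['a0', 'f']]
```

Because every nucleus starts its own syllable, two adjacent identical nuclei such as the a0 a0 in 'maaf' end up in separate syllables, which is the behaviour asked for at the top of this thread.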

bpotard commented 7 years ago

Please let me know if https://github.com/bpotard/idlak/commit/a3e85e210cbb38e683118064e838f6b9626e30a2 fixes the issue. Thanks!

ghost commented 7 years ago

This is what I get from the latest commit for cmu_slt_arctic. However, the synthesis process runs without problems.

WARNING (make-fullctx-ali-dnn:main():make-fullctx-ali-dnn.cc:244) Merge of alignment and contexts failed for key slt_arctic_a0086 mismatching number of phones contexts:28 alignment:29
WARNING (make-fullctx-ali-dnn:main():make-fullctx-ali-dnn.cc:244) Merge of alignment and contexts failed for key slt_arctic_a0438 mismatching number of phones contexts:33 alignment:34
WARNING (make-fullctx-ali-dnn:main():make-fullctx-ali-dnn.cc:244) Merge of alignment and contexts failed for key slt_arctic_a0439 mismatching number of phones contexts:31 alignment:32
WARNING (make-fullctx-ali-dnn:main():make-fullctx-ali-dnn.cc:244) Merge of alignment and contexts failed for key slt_arctic_b0244 mismatching number of phones contexts:31 alignment:32
WARNING (make-fullctx-ali-dnn:main():make-fullctx-ali-dnn.cc:244) Merge of alignment and contexts failed for key slt_arctic_b0351 mismatching number of phones contexts:27 alignment:28
WARNING (make-fullctx-ali-dnn:main():make-fullctx-ali-dnn.cc:244) Merge of alignment and contexts failed for key slt_arctic_b0391 mismatching number of phones contexts:34 alignment:35

bpotard commented 7 years ago

Hi,

I don't think the alignment merging issues in arctic are due to syllabification problems:

- They can be due to problems in the recordings that affect the alignment. For example, if you look at slt_arctic_a0086 you will see that the recording ends in the middle of a phone, so there is no final silence in the alignment to align the labels with.
- Another issue is the very basic number normalizer currently present in the idlak front-end, e.g. in slt_arctic_a0438, where "16" is transformed to "one six" by the front-end instead of "sixteen", and "1908" is expected to be "one nine oh eight".
- The main issue at the moment is in the script that regenerates the phone transcription from the alignment (idlak_make_lang.py --mode 1), which cannot deal correctly with repeated phones. This affects every single failing example above, so it would be a real problem for a language where repeated phones are common.

I'll try to make a fix for the idlak_make_lang.py --mode 1 tool so that it uses the state-level alignment, which lets us handle duplicated phones, and hopefully that will remove these warnings!
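To illustrate why repeated phones are a problem and why state-level information helps, here is a toy sketch. It is not the actual idlak_make_lang.py code: the (phone, state) frame format is invented for the example, and it assumes each phone is modelled left to right, so the state index only drops when a new phone begins.

```python
def collapse_by_phone(frames):
    """Naive collapse: a new phone starts only when the label changes."""
    phones = []
    prev = None
    for phone, _state in frames:
        if phone != prev:
            phones.append(phone)
        prev = phone
    return phones


def collapse_by_state(frames):
    """A new phone starts when the label changes or the state index drops."""
    phones = []
    prev_phone, prev_state = None, None
    for phone, state in frames:
        if phone != prev_phone or state < prev_state:
            phones.append(phone)
        prev_phone, prev_state = phone, state
    return phones


# Hypothetical frame-level alignment of 'maaf' (m a0 a0 f), 3 states per phone.
frames = [(p, s) for p in ["m", "a0", "a0", "f"] for s in range(3)]
print(collapse_by_phone(frames))  # ['m', 'a0', 'f']
print(collapse_by_state(frames))  # ['m', 'a0', 'a0', 'f']
```

The label-only version merges the two a0 phones into one, so the phone count no longer matches the contexts. The state-aware version keeps them apart, but under these assumptions it would still miss a repeated phone that never changes state.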

ghost commented 7 years ago

I modified lexicon-default.xml for my language (Indonesian) and made sure there are no duplicated phones. I converted the Indonesian phonemes to English phonemes, so all the Indonesian words are phonetized with English phones. But after running the recipe, I get this warning and the process exits immediately.

WARNING (paste-feats:AppendFeats():paste-feats.cc:45) Length mismatch 636 vs. 0 for utt id_mmht_a0002 exceeds tolerance 1
WARNING (paste-feats:AppendFeats():paste-feats.cc:45) Length mismatch 652 vs. 0 for utt id_mmht_a0012 exceeds tolerance 1
WARNING (paste-feats:AppendFeats():paste-feats.cc:45) Length mismatch 632 vs. 0 for utt id_mmht_a0022 exceeds tolerance 1
WARNING (paste-feats:AppendFeats():paste-feats.cc:45) Length mismatch 709 vs. 0 for utt id_mmht_a0032 exceeds tolerance 1
....

Did I miss a step?

Thank you,

ghost commented 7 years ago

I am sorry. It seems the features had not been created properly. My only problem was the feature extraction. After reading issue 8, the warnings disappeared.

bpotard commented 6 years ago

The issue with idlak_make_lang.py --mode 1 has now been fixed, so I'll close this for now.

bpotard / idlak

Syllabification Module Adjustment #11