Closed stefanocoretta closed 4 months ago
Responded to this mostly in the email, but for posterity, rules like:
rules:
- following_context: '$'
  preceding_context: ''
  replacement: ''
  segment: 'ə'
- following_context: '$'
  preceding_context: ''
  replacement: 'ɜ'
  segment: 'ə'
- following_context: ''
  preceding_context: ''
  replacement: ''
  segment: 'ə'
- following_context: ''
  preceding_context: ''
  replacement: 'ɜ'
  segment: 'ə'
should capture the variable schwa realization. For the homorganic nasals, I typically don't use rules if the nasal is always going to assimilate in place; instead I just put ŋ ɡ ɻ a in the dictionary rather than n ɡ ɻ a. If the realization is variable, then using a rule might be better, for example for assimilation across morpheme boundaries as in English "information" or "unformed"; see https://github.com/MontrealCorpusTools/mfa-models/blob/main/config/acoustic/rules/english_mfa.yaml#L311-L314.
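As a rough sketch of what an optional rule for the word-initial velar case could look like, reusing the same four fields as the schwa rules above. This is not copied from english_mfa.yaml: the '^' word-start anchor (by analogy with the '$' word-end anchor) and the single-phone following context are assumptions, so check them against the linked file before relying on it.

```yaml
rules:
# Hypothetical rule: word-initial n may surface as ŋ before ɡ.
# Assumes '^' anchors the start of the word, mirroring '$' for the end.
- following_context: 'ɡ'
  preceding_context: '^'
  replacement: 'ŋ'
  segment: 'n'
```

With a rule along these lines, the dictionary entry would stay as n ɡ ɻ a and the ŋ ɡ ɻ a variant would come from the rule, which only makes sense if the assimilation really is variable.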
For reference, phone groups for 3.0 trained models are here: https://github.com/MontrealCorpusTools/mfa-models/tree/main/config/acoustic/phone_groups along with their rules here: https://github.com/MontrealCorpusTools/mfa-models/tree/main/config/acoustic/rules, and they'll be updated as I go through languages and train updated 3.0 models.
Hi! Sorry if I am posting here, but I wasn't sure whom to write to.
I am planning to train an acoustic model for Albanian but I am not entirely sure how the phone groups and phonological rules work.
To give an example, in word-final position <ë> is variably produced as [ə, ɜ, ʌ] or zero. How would I go about that?
Also, NC clusters in word-initial position are phonetically homorganic. Does this mean the dictionary would have, for example, n ɡ ɻ a, but I would write a phonological rule to change n to ŋ before velars?