Open koenvervloesem opened 4 years ago
This is a good point. I need to be careful about naming the voice, etc. derived from these prompts too. But for locales, I usually only see nl_NL or nl_BE. Is there a third more generic one?
Well, the voices should definitely have nl_BE in their name if the speakers are Flemish, but I think the prompts should just use the generic locale nl, as they can be spoken both by Dutch and Flemish speakers.
However, thinking about these differences I just found some inconsistencies in the dictionary. What's the source of the nl.dict.gz file? Because this is Northern Dutch pronunciation:
politie p o ˈl i t s i
But this is Flemish pronunciation:
politiebediende p o ˈl i s i b ə ˌd i n d ə
Notice the difference: politie is pronounced p o ˈl i t s i in Northern Dutch and p o ˈl i s i in Flemish.
The pronunciations are coming from the Dutch Wiktionary. For "politie", the IPA is /poˈli(t)si/, which contains the optional (t). It looks like I need to make the parser generate both forms (with and without the t) when it encounters optional pieces.
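For illustration, here is a minimal sketch of how optional pieces could be expanded into every variant. It is not the actual parser code: the function name `expand_optional` is hypothetical, and it assumes optional pieces are marked with non-nested parentheses, as in the Wiktionary IPA above. It also works on the unspaced IPA string, so splitting into space-separated phonemes for the dictionary would be a separate step.

```python
import itertools
import re
from typing import List

# Matches one non-nested parenthesised piece, e.g. "(t)".
OPTIONAL = re.compile(r"\(([^()]*)\)")


def expand_optional(ipa: str) -> List[str]:
    """Return every variant of `ipa` with each optional piece kept or dropped."""
    # With one capture group, re.split alternates fixed text (even indices)
    # and the contents of the optional pieces (odd indices).
    parts = OPTIONAL.split(ipa)
    n_optional = len(parts) // 2

    variants: List[str] = []
    for keep in itertools.product((True, False), repeat=n_optional):
        variant = "".join(
            part
            for i, part in enumerate(parts)
            if i % 2 == 0 or keep[i // 2]
        )
        if variant not in variants:  # avoid duplicates, preserve order
            variants.append(variant)
    return variants


print(expand_optional("poˈli(t)si"))
```

For "politie" this prints `['poˈlitsi', 'poˈlisi']`, i.e. the Northern Dutch form followed by the Flemish one. A pronunciation with several optional pieces yields all combinations, so the dictionary would get one entry per variant.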
I made sure to weed out most sentences that were too northern Dutch (nl-nl) or too Flemish (nl-be).
However, the name nl-nl seems to be too specific. Shouldn't it be just nl if we want to use this for both language variants?