roedoejet / g2p

Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p!
https://g2p-studio.herokuapp.com

context_after="\s|$" does not always work correctly for end of word #134

Open joanise opened 2 years ago

joanise commented 2 years ago

In some mappings, we use context_after = \s|$ to do some processing on the end of a word.

Examples:

French:

Mi'kmaq:

Several other mappings use $ one way or another.

Not sure what the best solution is. \b is also not always right (e.g., it's incompatible with prevent_feeding), although it does fix French.
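To make the difference between the two contexts concrete, here is a minimal sketch using plain Python `re` rather than g2p itself. The rule (word-final "s" becomes "z"), the sample text, and the helper `apply_rule` are all hypothetical, and the lookahead is only an approximation of how a `context_after` value might be applied. The punctuation case shown is one way \s|$ can miss an end of word; the issue does not say this is the failure it has in mind.

```python
import re

# Hypothetical rule: rewrite a word-final "s" as "z".
# This is NOT taken from any g2p mapping; it only illustrates the two contexts.
END_BY_SPACE = re.compile(r"s(?=\s|$)")  # approximates context_after = \s|$
END_BY_BOUNDARY = re.compile(r"s\b")     # approximates context_after = \b


def apply_rule(pattern: re.Pattern, text: str) -> str:
    """Apply the hypothetical rule wherever the pattern's context matches."""
    return pattern.sub("z", text)


text = "chats, chats chats"

# \s|$ misses the first word because a comma, not whitespace, follows it.
print(apply_rule(END_BY_SPACE, text))     # chats, chatz chatz

# \b fires before the comma as well, since "s" to "," is a word boundary.
print(apply_rule(END_BY_BOUNDARY, text))  # chatz, chatz chatz
```

Note that \b is defined in terms of Python's notion of word characters, so it also fires before apostrophes and other non-word characters inside an orthographic word.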

joanise commented 1 year ago

@roedoejet not sure if #277 fixes this or not. I'm pretty sure it will make rule writing easier, but I expect there are still corner cases where "end of word" will remain difficult to define reliably.