reynoldsnlp / udar

UDAR Does Accented Russian: A finite-state morphological analyzer of Russian that handles stressed wordforms.
GNU General Public License v3.0

improve `guess_syllable()` #6

Open reynoldsnlp opened 5 years ago

reynoldsnlp commented 5 years ago

The intuition behind `guess_syllable()` is to place stress on the last syllable of the stem, but it fails for wordforms with multisyllabic endings.
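To make the failure concrete, here is a minimal sketch of that kind of heuristic. It is not the code in udar: `guess_stress_sketch` is a hypothetical name, and it assumes the stem boundary is approximated by treating a single word-final vowel as the ending. Under that assumption, any ending with two or more syllables pulls the guess off the stem.

```python
VOWELS = "аэоуыяеёюиАЭОУЫЯЕЁЮИ"
ACUTE = "\u0301"  # combining acute accent, placed after the stressed vowel


def guess_stress_sketch(word: str) -> str:
    """Hypothetical stand-in for a 'last syllable of the stem' heuristic.

    Assumption (not taken from udar): a word-final vowel is a one-syllable
    ending, so the stem's last syllable is the second-to-last vowel.
    Otherwise the ending is taken to be empty and the last vowel is stressed.
    """
    if "ё" in word or "Ё" in word:
        return word  # ё already marks the stressed syllable
    vowel_positions = [i for i, ch in enumerate(word) if ch in VOWELS]
    if len(vowel_positions) < 2:
        return word  # nothing to guess for monosyllables
    if word[-1] in VOWELS:
        target = vowel_positions[-2]  # skip the assumed one-vowel ending
    else:
        target = vowel_positions[-1]  # assume a zero ending
    return word[: target + 1] + ACUTE + word[target + 1 :]


print(guess_stress_sketch("книга"))    # stress lands on и, which is correct
print(guess_stress_sketch("книгами"))  # stress lands on а; the correct form is кни́гами
print(guess_stress_sketch("нового"))   # stress lands on the second о; the correct form is но́вого
```

The endings `-ами` and `-ого` each contain two vowels, so skipping one final vowel still leaves the guess inside the ending rather than on the stem.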

In udar, `guess_syllable()` is only ever used as a backoff. There are now several machine-learning approaches to stress placement, and it would be great to implement one of them in place of this crappy algorithm.