sinaahmadi / klpt

The Kurdish Language Processing Toolkit
https://sinaahmadi.github.io/klpt/

Tokenization of words ending with "iy" instead of "îy" #19

Open sinaahmadi opened 1 year ago

sinaahmadi commented 1 year ago

In Kurmanji, when a word ending in "î" is inflected with a form starting with "î", the resulting "îî" undergoes an alternation and is written "iy" rather than "îy". The tokenization module should handle this alternation so that the correct word forms are looked up in the dictionary.

dîplomasiyê / dîplomasîyê
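
A rough sketch of how the lookup could recover the dictionary form from either spelling. This is not KLPT's actual API: `candidate_stems`, `nfc`, and the `SUFFIXES` list are hypothetical names, and the suffix inventory is only an illustrative assumption. The idea is to strip a known suffix and, if what remains ends in "iy" or "îy", restore the final "î" of the stem.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalize to NFC so that "î" and "ê" are single code points."""
    return unicodedata.normalize("NFC", text)


# Hypothetical, incomplete suffix list used only for illustration.
SUFFIXES = tuple(nfc(s) for s in ("ê", "a", "ên", "an"))
# Both spellings of the stem-final vowel plus the glide "y".
GLIDE_ENDINGS = (nfc("iy"), nfc("îy"))


def candidate_stems(token: str, suffixes=SUFFIXES) -> list[str]:
    """Return possible dictionary stems for a token written with "iy" or "îy"."""
    token = nfc(token)
    stems = []
    for suffix in suffixes:
        # Skip suffixes that do not match or that would leave an empty base.
        if not token.endswith(suffix) or len(token) <= len(suffix):
            continue
        base = token[: -len(suffix)]
        if base.endswith(GLIDE_ENDINGS):
            # Drop the two-letter ending and restore the stem-final "î".
            stem = base[:-2] + nfc("î")
            if stem not in stems:
                stems.append(stem)
    return stems


if __name__ == "__main__":
    for form in ("dîplomasiyê", "dîplomasîyê"):
        print(form, candidate_stems(form))
```

With this approach both dîplomasiyê and dîplomasîyê would map to the same stem, dîplomasî, so a single dictionary entry covers the two spellings. A real implementation would take its suffixes from the toolkit's morphological rules rather than from a hard-coded list.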