BasilisAndr / chkchn

чк/чн is written without a soft sign

GNU General Public License v3.0

3 stars, 0 forks

ԓ/л #4

Closed. BasilisAndr closed this issue 7 years ago

BasilisAndr commented 7 years ago

1) The Uniparser dictionary doesn't have the letter ԓ at all.
2) The grammar and Michael's script say that the letters are fully interchangeable.
3) ckt.crp.txt has both letters in similar contexts: ӄолейӈын гантоԓен

@evoling Is this an inconsistency in the corpus, or are the letters not fully interchangeable? Is it OK to just stick to one variant? And if so, @ftyers, is it possible to map, say, ԓ:л in all contexts in analysis, but not map л:ԓ in generation?

ftyers commented 7 years ago

As far as I know, when dealing with Chukchi we should just change all л to ԓ in the corpus. I imagine any difference is an OCR error or a typo. We should be able to deal with both in analysis, but for the moment I would suggest just changing the corpus. Let's see what @evoling thinks before going ahead, though.

evoling commented 7 years ago

They are fully interchangeable: the symbol was invented recently as a wholesale replacement for "Russian" л. I agree with Fran; the correct thing to do is to change all л to ԓ.
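
For reference, the replacement agreed on above could be sketched roughly as follows. This is only a minimal illustration, not the script actually used in the repository: the name `normalize_el` is hypothetical, and it assumes the text is purely Chukchi, so that every л (U+043B) and Л (U+041B) should become ԓ (U+0513) and Ԓ (U+0512) respectively.

```python
# Hypothetical helper: replace the Russian-style el with the Chukchi el with hook.
# Assumption: the input is Chukchi text only, so every л/Л should be replaced.
EL_TO_HOOKED_EL = str.maketrans({
    "\u043b": "\u0513",  # л -> ԓ
    "\u041b": "\u0512",  # Л -> Ԓ
})


def normalize_el(text: str) -> str:
    """Return text with every л/Л replaced by ԓ/Ԓ; other characters are unchanged."""
    return text.translate(EL_TO_HOOKED_EL)


if __name__ == "__main__":
    # The two forms quoted from ckt.crp.txt earlier in this thread.
    print(normalize_el("ӄолейӈын гантоԓен"))
```

Running the same function over input before analysis would also let both spellings be accepted while only ԓ is ever produced. Whether that fits the Uniparser setup is not something this sketch tries to answer.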

BasilisAndr commented 7 years ago

Great, thanks!