[sil.xmf-latn.mingrelian]: a new lexical model of mingrelian

Meng-Heng commented 6 months ago

Please approve if everything looks good. Thank you!

keyman-server commented 6 months ago

This pull request is from an external repo and will not automatically be built. The build must still be passed before it can be merged. Ask one of the team members to make a manual build of this PR.

darcywong00 commented 5 months ago

This lexical model is for Latin (Latn) characters. You may need to consult with the community about the following non-Latin characters. I think they're Cyrillic (Cyrl) and Georgian (Geor) characters, and they should be removed from this wordlist.

Count	Unicode Value	Character
6	0x000430	а
1	0x000431	б
2	0x000432	в
1	0x000434	д
2	0x000435	е
1	0x000437	з
4	0x000438	и
1	0x000439	й
4	0x00043A	к
3	0x00043C	м
2	0x00043D	н
3	0x00043E	о
5	0x000440	р
1	0x000441	с
3	0x000442	т
2	0x000443	у
1	0x000444	ф
1	0x000447	ч
1	0x000448	ш
2	0x00044B	ы
169	0x0010F1	ჱ
1	0x0010F9	ჹ
10	0x0010FA	ჺ

DavidLRowe commented 5 months ago

@darcywong00 is correct:

0400-04FF = Cyrillic block
10A0-10FF = Georgian block

Entries with these characters should be corrected (or dropped from the .tsv file).
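
For anyone who wants to re-check the wordlist locally, here is a minimal sketch of how the block ranges above could be applied. It is not the script used to produce the table in this thread. It assumes the word is in the first tab-separated column of each line, and the function names `block_of`, `count_non_latin`, and `drop_non_latin` are made up for illustration.

```python
from collections import Counter

# Block ranges as quoted in this thread (inclusive code point bounds).
NON_LATIN_BLOCKS = {
    "Cyrillic": (0x0400, 0x04FF),
    "Georgian": (0x10A0, 0x10FF),
}


def block_of(ch):
    """Return the name of the flagged block containing ch, or None."""
    cp = ord(ch)
    for name, (low, high) in NON_LATIN_BLOCKS.items():
        if low <= cp <= high:
            return name
    return None


def first_column(line):
    """Assumption: the word is the first tab-separated field of the line."""
    return line.split("\t", 1)[0]


def count_non_latin(lines):
    """Count each flagged character that appears in the word column."""
    counts = Counter()
    for line in lines:
        counts.update(ch for ch in first_column(line) if block_of(ch))
    return counts


def drop_non_latin(lines):
    """Keep only the lines whose word contains no flagged character."""
    return [
        line
        for line in lines
        if not any(block_of(ch) for ch in first_column(line))
    ]
```

To run it against a real file, read the lines with `open(path, encoding="utf-8")` and pass them to either function. Dropping whole entries is only one of the two options mentioned above; entries that are simply mistyped may be better corrected by hand.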

Meng-Heng commented 4 months ago

I have removed the specified characters and parentheses in the wordlist. Thanks, @darcywong00 and @DavidLRowe!

keymanapp / lexical-models

[sil.xmf-latn.mingrelian]: a new lexical model of mingrelian #254