keymanapp / lexical-models

Lexical language models for predictive text
MIT License

bug(sil.km.cnd): is `សាន់វុ័ណ្ត` valid encoding? #288

Closed: mcdurdin closed this issue 2 weeks ago

mcdurdin commented 3 weeks ago

សាន់វុ័ណ្ត on line 11570 contains the sequence 179C 17BB 17D0, which doesn't look valid to me (it is also flagged by Khmer Normalizer).
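For anyone who wants to check this without the normalizer, here is a minimal Python sketch that dumps a word's code points and scans lines for the suspect pair 17BB 17D0. The helper names (`codepoints`, `find_suspect_lines`) are hypothetical, and the one-entry-per-line input is an assumption rather than a description of the actual wordlist file.

```python
# Hypothetical helpers for illustration; not part of the lexical-models repo.

# KHMER VOWEL SIGN U (U+17BB) directly followed by KHMER SIGN SAMYOK SANNYA (U+17D0)
SUSPECT = "\u17BB\u17D0"


def codepoints(text):
    """Return the code points of text as space-separated uppercase hex."""
    return " ".join(f"{ord(ch):04X}" for ch in text)


def find_suspect_lines(lines):
    """Return (line_number, line) pairs for lines containing the suspect pair."""
    return [(n, line) for n, line in enumerate(lines, start=1) if SUSPECT in line]


# The word from this issue, spelled out with escapes so the encoding is unambiguous.
word = "\u179F\u17B6\u1793\u17CB\u179C\u17BB\u17D0\u178E\u17D2\u178F"

print(codepoints(word))
print(find_suspect_lines(["\u1780\u17B6", word]))
```

The scan is a plain substring check, so it reports every line containing 17BB 17D0 regardless of the preceding consonant; whether each hit is really an error still needs a human decision.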

image

@MakaraSok can you confirm?

mcdurdin commented 2 weeks ago

@Nnyny can you resolve this? It should be សាន់វ៉័ណ្ត

Nnyny commented 2 weeks ago

I'll look into it, Marc.

MakaraSok commented 2 weeks ago

វ៉័ rendered correctly on Safari, macOS 18.1. Both orders rendered identically on this platform.

វុ័ វ ុ ័ វ៉័ វ ៉ ័

image

MakaraSok commented 2 weeks ago

@Nnyny Like ប៉័ង, this word should use the appropriate consonant shifter (៉, 17C9) rather than the below-base vowel (ុ, 17BB).
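A minimal sketch of that correction in Python, assuming only the exact sequence from this issue needs fixing (179C 17BB 17D0 becomes 179C 17C9 17D0). The function name `fix_vo_shifter` is hypothetical, and the sketch deliberately does not generalize to other consonants, since the appropriate shifter depends on the consonant.

```python
# Hypothetical one-off fix for illustration; covers only the sequence in this issue.

BAD = "\u179C\u17BB\u17D0"   # VO + VOWEL SIGN U + SAMYOK SANNYA (flagged encoding)
GOOD = "\u179C\u17C9\u17D0"  # VO + MUUSIKATOAN + SAMYOK SANNYA (expected encoding)


def fix_vo_shifter(text):
    """Replace the flagged VO sequence with the consonant-shifter form."""
    return text.replace(BAD, GOOD)


flagged = "\u179F\u17B6\u1793\u17CB\u179C\u17BB\u17D0\u178E\u17D2\u178F"
expected = "\u179F\u17B6\u1793\u17CB\u179C\u17C9\u17D0\u178E\u17D2\u178F"

print(fix_vo_shifter(flagged) == expected)
```

Text that already uses the shifter is left unchanged, so running the fix twice is harmless.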

mcdurdin commented 2 weeks ago

Leelawadee UI font: (1) សាន់វ៉័ណ្ត vs (2) សាន់វុ័ណ្ត

(1) has the correct encoding 179C 17C9 17D0, but renders wrongly.
(2) has the incorrect encoding 179C 17BB 17D0, but renders correctly.

Khmer UI, MoolBoran, DaunPenh all render as expected. Reference rendering:

image

Image of Leelawadee UI:

image

MS Feedback hub: https://aka.ms/AAtdpmo