Zawgyi output is strange for certain combinations

hanmyohtwe / waitzar

Automatically exported from code.google.com/p/waitzar

Zawgyi output is strange for certain combinations #160

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago

Consider the following:

"jujuF" with the Zawgyi keyboard.

The output is correct: "ၾကႀကၤ". But it displays as "ႀကႀကၤ". 
It seems that our converter sees the "-ၤ" before the consonant and tries to 
line it up with the first "ya" as well as the second. Very strange. 
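
To make the difference between the two strings easier to see, here is a small Python sketch that compares them code point by code point. The escape sequences are my reading of the two Zawgyi strings quoted above (an assumption, not taken from the converter), and `codepoints` and `first_difference` are hypothetical helper names.

```python
def codepoints(s):
    """Return the code points of s as 'U+XXXX' labels."""
    return [f"U+{ord(ch):04X}" for ch in s]


def first_difference(a, b):
    """Return (index, code point in a, code point in b) for the first mismatch."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i, f"U+{ord(x):04X}", f"U+{ord(y):04X}"
    return None


# Assumed Zawgyi code points for the two strings quoted in this report.
expected = "\u107E\u1000\u1080\u1000\u1064"
displayed = "\u1080\u1000\u1080\u1000\u1064"

print(codepoints(expected))
print(codepoints(displayed))
print(first_difference(expected, displayed))
```

If that reading is right, only the first "ya" differs: it is swapped for the same variant that the second "ya" uses under the "-ၤ".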

In the previous demo, a zero-width space (ZWS) and a hyphen "-" were placed 
before kinzi... so this might be our old Uni2Zg algorithm misbehaving. 

This won't delay the nightly, but it will delay the release.

Original issue reported on code.google.com by seth.h...@gmail.com on 15 Nov 2010 at 6:19

Blocking: #89

GoogleCodeExporter commented 8 years ago

Also for this:
ဝဍ္ဎန 
(unicode encoding, type "<!e")

Original comment by seth.h...@gmail.com on 16 Nov 2010 at 8:51

GoogleCodeExporter commented 8 years ago

First of all, we left some rules "for a future release". These include ~!:

U+100B 100B_stack --> 1097_zg
U+100D 100D_stack --> 106E_zg
U+100D U+1039 U+100E --> 106F_zg
1010_stack 103D --> 1010_103D_stack
U+1039 U+1010 103D --> 1010_103D_stack

Confirming each rule manually:
  ႗
  ၮ
  ၯ
   ႖
(the last 2 rules are essentially the same)
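
For anyone reading along, the rules above could be written as a simple lookup table. This is only a sketch in Python, not the actual WaitZar code: it assumes "X_stack" means U+1039 followed by X, it assumes the last two rules both map to U+1096 (inferred from the fourth glyph confirmed above), and `STACK_RULES` and `apply_stack_rules` are hypothetical names.

```python
# Hypothetical rule table: Unicode sequence -> single Zawgyi code point.
# "X_stack" is read here as U+1039 (virama) followed by X.
STACK_RULES = [
    ("\u100B\u1039\u100B", "\u1097"),  # U+100B + 100B_stack
    ("\u100D\u1039\u100D", "\u106E"),  # U+100D + 100D_stack
    ("\u100D\u1039\u100E", "\u106F"),  # U+100D U+1039 U+100E
    ("\u1039\u1010\u103D", "\u1096"),  # 1010_stack + 103D (target inferred)
]


def apply_stack_rules(text):
    """Replace each listed Unicode sequence with its Zawgyi code point."""
    # Longest sources first, so a longer rule is never pre-empted by a shorter one.
    for src, dst in sorted(STACK_RULES, key=lambda r: len(r[0]), reverse=True):
        text = text.replace(src, dst)
    return text


# The second example reported in this issue, in Unicode encoding.
print(apply_stack_rules("\u101D\u100D\u1039\u100E\u1014"))
```

The sketch only covers these stacking rules; every other code point passes through unchanged, which is not what a full Uni2Zg conversion would do.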

Done! Now on to the mundane bug fix....

Original comment by seth.h...@gmail.com on 23 Nov 2010 at 1:27

GoogleCodeExporter commented 8 years ago

Ah, found it.

Kinzi is being treated as part of BOTH syllables: the one before it and the one 
after it. Why does it straddle this boundary? Probably because it's the only 
letter in Unicode that comes before the consonant it attaches to. 

Time to fix.... whee!
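
A rough sketch of the idea, in Python rather than the project's own code: when segmenting Unicode text, let a kinzi sequence open a new syllable so that it belongs only to the consonant that follows it. Everything here is an assumption for illustration, including the `segment` name and the simplified consonant range; it is not the fix that was committed.

```python
KINZI = "\u1004\u103A\u1039"  # nga + asat + virama, stored before its consonant
ASAT = "\u103A"
VIRAMA = "\u1039"


def is_consonant(ch):
    # Simplified assumption: the basic Myanmar consonant range only.
    return "\u1000" <= ch <= "\u1021"


def segment(text):
    """Split Unicode Myanmar text into syllables; kinzi opens a syllable."""
    syllables, start, i, n = [], 0, 0, len(text)
    while i < n:
        if text.startswith(KINZI, i):
            # Close the previous syllable so kinzi is counted only once,
            # with the consonant that follows it.
            if i > start:
                syllables.append(text[start:i])
                start = i
            i += len(KINZI)
            if i < n:
                i += 1  # the base consonant stays in the kinzi syllable
            continue
        if is_consonant(text[i]) and i > start:
            nxt = text[i + 1] if i + 1 < n else ""
            # A consonant that is stacked, or killed by asat/virama,
            # stays with the current syllable.
            if text[i - 1] != VIRAMA and nxt not in (ASAT, VIRAMA):
                syllables.append(text[start:i])
                start = i
        i += 1
    if start < n:
        syllables.append(text[start:])
    return syllables


# Unicode form of the word from the original report (assumed encoding).
word = "\u1000\u103C" + KINZI + "\u1000\u103C"
print(segment(word))
```

With this split, the first syllable no longer contains kinzi, so a converter working syllable by syllable has no reason to pick the kinzi-compatible "ya" for it.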

Original comment by seth.h...@gmail.com on 23 Nov 2010 at 2:07

GoogleCodeExporter commented 8 years ago

Fixed. Segmentation is much better. 

We'll need extensive testing to make sure that words aren't broken in general 
though. Yet another reason to switch to Unicode internally....

Original comment by seth.h...@gmail.com on 23 Nov 2010 at 3:00

Changed state: Fixed