GoogleCodeExporter closed this issue 9 years ago
Hmm... it's a bit of a problem.
I could always use the Shan fix for this; it works just dandy for re-ordering.
First, though, I want to check if the Zawgyi keyboard has any built-in code for
fixing this.
Original comment by seth.h...@gmail.com
on 10 Nov 2010 at 10:52
Hmm... the problem runs much deeper than just the Rules file.
Trace:
User typed: u
$row2K[*] => $row2U[$1]
==>u1000
User typed: u1000j
< VK_KEY_J > => $ZWS + $yayit
==>u1000u200bu103c
So, U+1000 U+200B U+103C is correct. However, then the following happens:
1) Filter (U+200B is removed)
2) Convert to Zawgyi (U+103C is re-ordered, since it seems like valid Unicode).
3) Display, and confuse the user.
Although this should be fixed, outputting in Unicode makes things even worse:
1) Filter (U+200B is removed)
2) Output (U+103C is placed visually before U+1000, since it's valid Unicode).
I think the second problem can't be fixed (for now), due to the weird display
glitches that U+200B causes.
However, if we can fix the first problem, then people will expect weird output
(since they'll see it on-screen).
Is our converter stripping U+200B? That might be the problem.
Original comment by seth.h...@gmail.com
on 10 Nov 2010 at 11:38
Our converter isn't stripping U+200B.
At the line:
src = waitzar::renderAsZawgyi(src);
...then src goes from:
U+1000 U+200B U+1031
...to:
U+1031 U+1000 U+200B U+200B
The extra U+200B is not the problem; including it, the output should look like
this:
U+1000 U+200B(ORIG) U+1031 U+200B(EXTRA)
For some reason, though, U+1031 is being moved despite having U+200B before it.
Original comment by seth.h...@gmail.com
on 10 Nov 2010 at 11:56
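For comparing the before and after sequences above, a small dump helper is handy. This is only a sketch: `toCodePoints` is a hypothetical name, not a WaitZar function, and it assumes the text is available as a UTF-32 `std::u32string` (the real converter may use a different string type).

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical debugging helper (not part of WaitZar): formats each code
// point of a UTF-32 string as U+XXXX, separated by single spaces.
std::string toCodePoints(const std::u32string& text) {
    std::ostringstream out;
    out << std::uppercase << std::hex << std::setfill('0');
    bool first = true;
    for (char32_t c : text) {
        if (!first) out << ' ';
        first = false;
        // Cast so the stream prints a number rather than a character.
        out << "U+" << std::setw(4) << static_cast<unsigned long>(c);
    }
    return out.str();
}
```

Calling it on the string before and after the `renderAsZawgyi` line would make it easy to see where U+1031 moved.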
Conversion output from "ua", now logged!
Unicode: {\u1000\u200B\u1031}
norm: {\u1000\u200B\u1031\u0}
dash: {\u1000\u200B\uE000\u1031\u0}
stck: {\u1000\u200B\uE000\u1031\u0}
Begin Match
Rule{ORDER, at[\u1031], match[000000000000000000000000000000000000000111], replace[\u0]}
(3,0)
Rule{ORDER, at[\u1031], match[000000000000000000000000000000000000000111], replace[\u0]}
End Match
mtch: {\u1031\u1000\u200B\uE000}
subs: {\u1031\u1000\u200B\u0}
Begin Re-Ordering
End Re-Ordering
Zawgyi1: {\u1031\u1000\u200B}
To put it simply, those ORDER rules shouldn't be there. The second one isn't
applied, so that's fine. The first one puts \u1031 at position 0 (before
\u1000). That's the problem.
Why isn't U+200B causing a "sequence" break?
Original comment by seth.h...@gmail.com
on 10 Nov 2010 at 6:32
Ok, got it! Our "prevConsonant" variable tracks the last known consonant. It
assumes strings will contain only Myanmar text, and will start on a consonant.
We need a way to "reset" this behavior.
Original comment by seth.h...@gmail.com
on 10 Nov 2010 at 6:35
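The "prevConsonant" behavior can be reproduced in a toy model. The sketch below is not the WaitZar converter; `reorderNaive` and `isConsonant` are hypothetical names, and the consonant range U+1000..U+1021 is an assumption made for illustration. It only shows how tracking the last consonant, with nothing to reset it, moves U+1031 across a U+200B.

```cpp
#include <cstddef>
#include <string>

constexpr char32_t VOWEL_E = 0x1031;  // Myanmar vowel sign E

// Assumption for this sketch: consonants are U+1000..U+1021.
bool isConsonant(char32_t c) { return c >= 0x1000 && c <= 0x1021; }

// Simplified stand-in for the ORDER step: remembers where the last
// consonant was written and moves U+1031 in front of it.
std::u32string reorderNaive(const std::u32string& src) {
    std::u32string out;
    std::size_t prevConsonant = std::u32string::npos;  // index into `out`
    for (char32_t c : src) {
        if (c == VOWEL_E && prevConsonant != std::u32string::npos) {
            out.insert(prevConsonant, 1, c);       // place vowel before consonant
            prevConsonant = std::u32string::npos;  // consonant has been consumed
            continue;
        }
        // U+200B is neither a consonant nor the vowel, so it is copied
        // through without resetting prevConsonant.
        if (isConsonant(c)) prevConsonant = out.size();
        out.push_back(c);
    }
    return out;
}
```

With the input U+1000 U+200B U+1031, this model returns U+1031 U+1000 U+200B, the same order as the logged Zawgyi1 output.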
Fixed; just treat U+200B as a "consonant" character. This should be considered
a temporary fix... that entire converter is held together by thread.
Will release a nightly and get the original bug reporter to confirm this is
fixed before I close this bug.
Original comment by seth.h...@gmail.com
on 10 Nov 2010 at 7:27
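A rough illustration of what "treat U+200B as a consonant" means, again in a toy model rather than the actual converter code. `reorderWithBreak` and `isConsonantOrBreak` are hypothetical names, and the consonant range is an assumption. In this sketch the tracked position moves to the U+200B, so U+1031 can no longer jump in front of the real consonant; the exact output of the real converter may differ.

```cpp
#include <cstddef>
#include <string>

constexpr char32_t ZWS = 0x200B;      // zero-width space
constexpr char32_t VOWEL_E = 0x1031;  // Myanmar vowel sign E

// Assumption for this sketch: consonants are U+1000..U+1021, and U+200B
// is counted as one so that it becomes the new "last consonant".
bool isConsonantOrBreak(char32_t c) {
    return (c >= 0x1000 && c <= 0x1021) || c == ZWS;
}

std::u32string reorderWithBreak(const std::u32string& src) {
    std::u32string out;
    std::size_t prevConsonant = std::u32string::npos;  // index into `out`
    for (char32_t c : src) {
        if (c == VOWEL_E && prevConsonant != std::u32string::npos) {
            out.insert(prevConsonant, 1, c);       // place vowel before tracked char
            prevConsonant = std::u32string::npos;
            continue;
        }
        if (isConsonantOrBreak(c)) prevConsonant = out.size();
        out.push_back(c);
    }
    return out;
}
```

For U+1000 U+200B U+1031 this model yields U+1000 U+1031 U+200B: the vowel stays to the right of U+1000, while plain U+1000 U+1031 is still reordered to U+1031 U+1000.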
Update: a lot of this is fixed, but kinzi is not re-ordered properly. So, *F
yields ဂင်္, which puts kinzi after.
This is obviously a problem if you type *F* (ဂင်္ဂ) --you can see
that kinzi will stack after the second "ga".
Original comment by seth.h...@gmail.com
on 13 Nov 2010 at 7:35
Added a great deal of reordering code; most of these issues have been fixed.
Need to test exact Unicode normalization issues for other letters.
Once normalized Unicode works, we can consider a release.
Original comment by seth.h...@gmail.com
on 13 Nov 2010 at 9:08
Complex examples work.
I think we should have a "normalize" feature in KeyMagic. :P
Now, to make a nightly release...
Original comment by seth.h...@gmail.com
on 13 Nov 2010 at 12:39
From Lionslayer:
1) "m key" "B key" "N key" for ra-yit is not still working in Zg kb with all
encodings.
2) After space (for applying texts), we still have to hit another space to get
the space. If space can create an extra space from the start, it will save time
and our typing pattern.
The first one is a real issue (I'm not sure why 'B' and 'N' fail to reorder
ya-yit.)
'M' must be capital (and there's still a glitch).
For the second one, it's more of a usability thing, not a bug.
Original comment by seth.h...@gmail.com
on 16 Nov 2010 at 8:48
Fixed. We require $ZWS before any medial to handle reordering properly.
Spun off item 2 into its own bug.
Closing; I'll open new bugs for remaining Zawgyi errors as they pop up.
Original comment by seth.h...@gmail.com
on 16 Nov 2010 at 9:28
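The $ZWS requirement could be checked in a test along these lines. This is a sketch under assumptions: `medialsFollowZws` and `isMedial` are hypothetical names, the medials are taken to be U+103B..U+103E, and the rule is read literally (every medial must be immediately preceded by U+200B), which may be stricter than what the keyboard rules actually need.

```cpp
#include <cstddef>
#include <string>

// Assumption for this sketch: the medial signs are U+103B..U+103E.
bool isMedial(char32_t c) { return c >= 0x103B && c <= 0x103E; }

// Returns true when every medial in `text` is immediately preceded by
// U+200B (zero-width space); a medial at index 0 fails the check.
bool medialsFollowZws(const std::u32string& text) {
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (isMedial(text[i]) && (i == 0 || text[i - 1] != 0x200B)) {
            return false;
        }
    }
    return true;
}
```

For example, U+1000 U+200B U+103C passes, while U+1000 U+103C does not.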
Original issue reported on code.google.com by
seth.h...@gmail.com
on 28 Sep 2010 at 10:01