amosogra / hackerskeyboard

Automatically exported from code.google.com/p/hackerskeyboard

Thai language support (keymap/dict attached) #72

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Support for Thai language input would be really great!

I've attached a donottranslate-keymap.xml for a 5-row keyboard (Thai uses all 5 
rows on a standard keyboard; there are letters on the un-shifted number row 
keys too). A 5,000-word dictionary file is included too, although if I find time 
I'll try to get a better (30k+ word) one together.

However, Thai input works a bit differently from Roman-based languages, as it 
doesn't use spaces to separate words except in specific circumstances, so I'd 
like to propose an additional preferences setting for a behaviour change so 
that:

- if a dictionary suggestion is highlighted, pressing space will deselect it 
but not insert a space

- if no dictionary suggestion is highlighted, pressing space will insert a space
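
The two rules above can be sketched roughly as follows. This is only an illustration in Python, not code from the keyboard; `InputState` and `press_space` are hypothetical names, and the sketch models only the proposed mode, not the existing space-key behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InputState:
    text: str = ""                     # text entered so far
    highlighted: Optional[str] = None  # highlighted dictionary suggestion, if any


def press_space(state: InputState) -> InputState:
    """Apply the proposed space-key rule (hypothetical helper)."""
    if state.highlighted is not None:
        # A suggestion is highlighted: deselect it, do not insert a space.
        return InputState(text=state.text, highlighted=None)
    # No suggestion is highlighted: insert a space as usual.
    return InputState(text=state.text + " ", highlighted=None)
```

With this rule, the first press of space on a highlighted suggestion only clears the highlight, and a second press inserts the space.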

The keyboard would work without this behaviour change, but it would not be as 
useful to language learners like myself.

Hopefully that makes sense!

Thanks in advance.

Mark

Original issue reported on code.google.com by hmm...@gmail.com on 1 Sep 2011 at 10:37

Attachments:

GoogleCodeExporter commented 9 years ago
Sorry about not replying earlier - thanks for your contribution! I'll look into 
integrating it. I assume a 4-row layout wouldn't easily be possible? Are any 
characters sufficiently rare that it would work to put them in long-press 
alternates, or would that be too annoying to be practical?

I'm working on some changes that will make it easier to do language-specific 
behavior changes such as the space handling you mention.

Original comment by Klaus.We...@gmail.com on 25 Sep 2011 at 11:03

GoogleCodeExporter commented 9 years ago
Added in revision 375a31fa5d70ef61261ff2b2fdd5fe6cf4516e1d, small tweaks in 
revision f9c1e31b3bfddd7e635c6d28d4a9f0e9d2570f26 .

Try v1.21rc6 (or later) from 
http://code.google.com/p/hackerskeyboard/downloads/list

I haven't built a dictionary so far, but that can be added separately without 
needing code changes.

BTW, are the keys in the middle of the keymap dead keys that are intended to be 
combined with other glyphs, not used standalone? That functionality isn't fully 
working yet in the keyboard, see issue 39.

Original comment by Klaus.We...@gmail.com on 25 Sep 2011 at 11:26

GoogleCodeExporter commented 9 years ago
The layout is included in yesterday's Market v1.22 release. Please let me know 
if it's working for you.

Still open: dictionary, and possibly dead key support (if needed).

Original comment by Klaus.We...@gmail.com on 26 Sep 2011 at 9:39

GoogleCodeExporter commented 9 years ago
Hi,

Attached is an updated donottranslate-keymap.xml file which should fix a 
problem with the W key (the double-quote mark should have been escaped). 
After using the layout for a while, I've also changed some of the alt options, 
which should make it easier to use.

A per-language auto-capitalization option would be useful, as Thai doesn't have 
capital letters and auto-capitalization should be disabled for such keyboard 
layouts. It is still useful, of course, when switching back to English etc.
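
One way such an option could be decided is sketched below in Python. The function names are hypothetical and nothing here reflects how the keyboard actually stores its settings; the idea is simply that a layout whose characters have no upper/lower case distinction never needs auto-capitalization.

```python
def script_has_case(sample: str) -> bool:
    """True if any character in the sample changes under upper() or lower()."""
    return any(ch.upper() != ch or ch.lower() != ch for ch in sample)


def should_auto_capitalize(user_setting_enabled: bool, layout_sample: str) -> bool:
    """Auto-capitalize only if the user enabled it and the layout has cased letters."""
    return user_setting_enabled and script_has_case(layout_sample)
```

Here `layout_sample` is assumed to be a handful of letters taken from the active layout, so switching back to an English layout would re-enable the user's setting.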

A full dictionary file is in progress - I'll send it on shortly.

Also, for your Android Market description, it may help end users if you can 
include the words ภาษาไทย (Thai language), so that people who search for apps 
using Thai keywords can find it.

Thanks,
Mark

Original comment by hmm...@gmail.com on 12 Oct 2011 at 11:57

Attachments:

GoogleCodeExporter commented 9 years ago
In addition to the previous comment (updated keymap file) here's a 30,000+ word 
Thai dictionary (gzip'ed 195KB, uncompressed 1.1MB) based on the LEXiTRON "open 
source" Thai-English dictionary which is free for both commercial and 
non-commercial use. It does, however, require an acknowledgement in the 
application:

(c)Any product, created and redistributed by you, which is composed of any part 
from LEXiTRON must include the following acknowledgement:
"This product is created by the adaptation of LEXiTRON developed by NECTEC 
(http://www.nectec.or.th/)."

The full license text is included in the xml file.

Mark

Original comment by hmm...@gmail.com on 12 Oct 2011 at 2:14

Attachments:

GoogleCodeExporter commented 9 years ago
I've applied your keymap changes, see revision 280b4b0e8ba4, including getting 
rid of the unneeded LSGT key.

Please try version v1.28rc1 (or later) from 
http://code.google.com/p/hackerskeyboard/downloads/list and let me know if that 
works better for you. This also includes the modifier key fix from issue 114.

Original comment by Klaus.We...@gmail.com on 22 Nov 2011 at 10:37

GoogleCodeExporter commented 9 years ago
I've finally uploaded a dictionary package for Thai based on the XML file you 
sent:

http://hackerskeyboard.googlecode.com/files/dict-th-release.apk

Can you please try it and let me know if it works for you?

Thanks to Martin Ratajský for doing the conversion and for prodding me towards 
looking into the issue.

Original comment by Klaus.We...@gmail.com on 25 Jan 2012 at 8:07

GoogleCodeExporter commented 9 years ago
Hi, the dictionary works well and relevant suggestions are shown. However, when 
selecting a word, a space should not be inserted after it, as the Thai language 
only uses spaces to separate phrases & sentences, but not words. Many thanks.

Original comment by hmm...@gmail.com on 26 Jan 2012 at 12:41

GoogleCodeExporter commented 9 years ago
This may be hard to fix properly. While it would be fairly easy to stop adding 
the space, I think this will result in the prediction getting confused since it 
assumes that it can tell where words start based on separating space or 
punctuation.

For example, if you type two words in a row without a space in between, it will 
look in the dictionary for the entire sequence entered, and won't be able to 
find useful suggestions for the second word. Also, recorrection (moving the 
cursor onto a previously entered word) won't work. Basically, it'll try to 
treat whole phrases/sentences as single words in many cases and get confused.
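
A small Python sketch of the problem, under the assumption that the prediction finds the current word by splitting at whitespace or punctuation. The separator set, the helper `current_word`, and the two-word dictionary are stand-ins for illustration, not the keyboard's actual logic.

```python
import re

# Assumed separators: whitespace and basic punctuation.
SEPARATORS = re.compile(r"[\s.,;:!?]+")

# Tiny stand-in dictionary with two Thai words.
DICTIONARY = {"ภาษา", "ไทย"}


def current_word(text_before_cursor: str) -> str:
    """Return the text after the last separator, i.e. what would be looked up."""
    return SEPARATORS.split(text_before_cursor)[-1]


with_space = current_word("ภาษา ไทย")     # second word is isolated
without_space = current_word("ภาษาไทย")   # both words arrive as one lookup key

print(with_space in DICTIONARY)
print(without_space in DICTIONARY)
```

Without a separator, the lookup key is the whole run of characters, so neither word is found on its own.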

It appears that the rules for recognizing word boundaries are rather 
complicated - I found this page which lists some of them: 
http://www.thai-language.com/ref/breaking-words . While it appears to be an 
interesting computational linguistics problem, I'm not keen on implementing 
that.

If you think it would be useful despite the restrictions, I can try adding the 
special case for not adding a space.

Alternatively, I think it would potentially work better to use a zero-width 
space (U+200B) to separate words - this would not leave visible gaps, but would 
help the completion logic in finding words, and would also allow line breaks at 
these points. The downside is that some applications may not expect these 
additional separators in the text. Do you happen to know if something like this 
is in common use for Thai text?
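
As a rough Python sketch of the zero-width-space idea (all function names are hypothetical, and this is not the keyboard's code): each committed word is followed by U+200B, the word being typed is whatever follows the last U+200B, and the separators can be stripped again for applications that do not expect them.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE


def commit_word(text: str, word: str) -> str:
    """Append a completed word followed by an invisible separator."""
    return text + word + ZWSP


def current_word_zwsp(text_before_cursor: str) -> str:
    """Return the text after the last zero-width space."""
    return text_before_cursor.rsplit(ZWSP, 1)[-1]


def strip_separators(text: str) -> str:
    """Remove the invisible separators, e.g. before handing text to another app."""
    return text.replace(ZWSP, "")
```

Note that the separator has to be named explicitly here: Python's `str.split()` with no argument does not treat U+200B as whitespace.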

Original comment by Klaus.We...@gmail.com on 26 Jan 2012 at 8:56

GoogleCodeExporter commented 9 years ago
I've changed the code to not insert spaces after Thai words but haven't tested 
this. Can you try v1.30rc5 (or later) from 
http://code.google.com/p/hackerskeyboard/downloads/list ?

Do you think that's an improvement overall, or does it cause other issues?

(Revision cf6c14ac1a5f)

Original comment by Klaus.We...@gmail.com on 21 Feb 2012 at 8:57

GoogleCodeExporter commented 9 years ago
[bulk bug update]

The changes from the 1.30rcX prerelease series are included in version 1.31 as 
published on Android Market, and this bug should be fixed. If it's still not 
working for you, please reopen or file a new bug. Thanks to everyone who helped 
with finding bugs and testing!

Original comment by Klaus.We...@gmail.com on 24 Feb 2012 at 6:23

GoogleCodeExporter commented 9 years ago
Bulk update - changing "Fixed" to "Verified" for old bugs.

(Background: I'm changing the "Fixed" status to be considered open; the next 
steps in the lifecycle will be the closed states "FixInTest" and "Verified". 
This lets me mark issues as "Fixed" in commit messages without hiding them from 
the issue tracker.)

Original comment by Klaus.We...@gmail.com on 22 Jan 2013 at 7:33

GoogleCodeExporter commented 9 years ago
Bulk update - changing "Fixed" to "Verified" for old bugs.

Original comment by Klaus.We...@gmail.com on 22 Jan 2013 at 7:34