sbbic / khmer-dictionary-tools

Automatically exported from code.google.com/p/khmer-dictionary-tools

Help to improve the spellchecker #1

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Currently the most up-to-date Khmer spellchecker is the one from KhmerOS:
http://www.khmeros.info/drupal/?q=en/node/1941

However, in 2006 somebody also released another OpenOffice Khmer spellchecker:
http://www.sbbic.org/en/khmer-spelling-checker

Can we compare the two and produce a final one that combines the best features of both?
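
One possible way to start the comparison is to diff the two word lists. This is only a minimal Python sketch, assuming both spellcheckers ship a Hunspell-style UTF-8 .dic file (first line is an entry count, optional flags follow a "/"). The function names are hypothetical and not part of either project:

```python
from pathlib import Path


def parse_dic_text(text):
    """Return the set of words found in the text of a Hunspell-style .dic file."""
    lines = text.splitlines()
    # The first line of a .dic file is normally the entry count; skip it if numeric.
    if lines and lines[0].strip().isdigit():
        lines = lines[1:]
    words = set()
    for line in lines:
        entry = line.strip()
        if not entry:
            continue
        # Drop affix flags after "/" (e.g. "word/AB"); escaped slashes are not handled.
        words.add(entry.split("/", 1)[0])
    return words


def load_dic_words(path, encoding="utf-8"):
    """Read a .dic file from disk (the UTF-8 encoding is an assumption)."""
    return parse_dic_text(Path(path).read_text(encoding=encoding))


def compare_word_lists(a, b):
    """Split two word sets into words unique to each and words they share."""
    return {"only_in_a": a - b, "only_in_b": b - a, "common": a & b}
```

For example, `compare_word_lists(load_dic_words("khmeros.dic"), load_dic_words("sbbic.dic"))` (hypothetical file names) would return the words unique to each list, which could then be reviewed manually before merging.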

Original issue reported on code.google.com by capiscuas@gmail.com on 7 Jun 2008 at 6:20

GoogleCodeExporter commented 8 years ago
The KhmerOS spellchecker currently only contains a .bat installer for Windows.
It should also come as a Language Pack that can be installed using OpenOffice DicOOo.

For OOo 3.x we should package it as an OXT extension.
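
A rough sketch of what the OXT packaging could look like, in Python. An OXT file is a ZIP archive; the layout below (a META-INF/manifest.xml that registers a dictionaries.xcu, which in turn points at the Hunspell files) is how I remember existing dictionary extensions being structured, so it should be verified against the OpenOffice extension documentation. The file names km_KH.aff / km_KH.dic, the node name, the km-KH locale and the function name are assumptions:

```python
import io
import zipfile

# Registers dictionaries.xcu as configuration data (layout is an assumption).
MANIFEST_XML = """<?xml version="1.0" encoding="UTF-8"?>
<manifest:manifest xmlns:manifest="http://openoffice.org/2001/manifest">
  <manifest:file-entry manifest:media-type="application/vnd.sun.star.configuration-data"
                       manifest:full-path="dictionaries.xcu"/>
</manifest:manifest>
"""

# Tells the linguistic component where the spelling dictionary lives.
DICTIONARIES_XCU = """<?xml version="1.0" encoding="UTF-8"?>
<oor:component-data xmlns:oor="http://openoffice.org/2001/registry"
                    xmlns:xs="http://www.w3.org/2001/XMLSchema"
                    oor:name="Linguistic" oor:package="org.openoffice.Office">
  <node oor:name="ServiceManager">
    <node oor:name="Dictionaries">
      <node oor:name="HunSpellDic_km-KH" oor:op="fuse">
        <prop oor:name="Locations" oor:type="oor:string-list">
          <value>%origin%/km_KH.aff %origin%/km_KH.dic</value>
        </prop>
        <prop oor:name="Format" oor:type="xs:string">
          <value>DICT_SPELL</value>
        </prop>
        <prop oor:name="Locales" oor:type="oor:string-list">
          <value>km-KH</value>
        </prop>
      </node>
    </node>
  </node>
</oor:component-data>
"""


def build_oxt_bytes(aff_data, dic_data):
    """Return the bytes of a ZIP archive laid out like a dictionary OXT."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as oxt:
        oxt.writestr("META-INF/manifest.xml", MANIFEST_XML)
        oxt.writestr("dictionaries.xcu", DICTIONARIES_XCU)
        oxt.writestr("km_KH.aff", aff_data)
        oxt.writestr("km_KH.dic", dic_data)
    return buffer.getvalue()
```

Writing the returned bytes to a file ending in .oxt should give something the Extension Manager can try to load. A real package would probably also want a description.xml with a version and license.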

Original comment by capiscuas@gmail.com on 7 Jun 2008 at 6:30

GoogleCodeExporter commented 8 years ago
The spellchecker by Nathan Wells for SBBIC (the one inactive since 2006) contains a
total of 59602 words (from KhmerOS + www.pancambodia.org). This is its CHANGELOG:

Version History:

v.05 - August-31-2006 - 52550 words (all manually checked)
v.04 - April-10-2006 - 85291 words (imported by computer with little manual checking)
v.03 - March-5-2006 - 23396 words and fixed an encoding bug that caused some words to be spelled incorrectly
v.02 - March-3-2006 - 19375 words in a standard spelling dictionary
v.01 - March-2-2006 - 19375 words in 10 user defined spelling dictionaries

Author: Nathan Wells, sungkhum _AT_ gmail.com
Links: http://www.sbbic.org
Word list from http://sourceforge.net/projects/khspell/

Original comment by capiscuas@gmail.com on 7 Jun 2008 at 6:36

GoogleCodeExporter commented 8 years ago
There seems to be a third spellchecker, this time not Hunspell-based but built on a
Hidden Markov Model. Here is the PDF document:

http://downloads.sourceforge.net/khspell/khspell-thesis.pdf?modtime=1139219426&big_mirror=0

The spellchecker is developed in Python and can be found here (inactive since 2006):
http://sourceforge.net/project/showfiles.php?group_id=158692

Original comment by capiscuas@gmail.com on 7 Jun 2008 at 7:03