gamag / ka_GE.spell

ქართული ორთოგრაფიული ლექსიკონი - Georgian Spell Checking Dictionary
GNU General Public License v3.0

Contains:

Note: The word lists were created automatically by crawling the internet with various techniques, so many words may be missing or wrong, which leads to false positives and false negatives.

Dictionary installation

In Applications

System wide

Mac OS X (10.6 and later)

Linux

Copy dictionaries/ka_GE.dic and dictionaries/ka_GE.aff to /usr/share/hunspell/
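
The copy step above can be wrapped in a small helper. This is only a sketch: the function name install_ka_ge and its two optional arguments are made up for illustration, and it assumes you run it from the repository root.

```shell
# Sketch: copy the Georgian Hunspell dictionary into a Hunspell directory.
# install_ka_ge is a hypothetical helper name, not part of the repository.
install_ka_ge() {
    local src="${1:-dictionaries}"          # directory holding ka_GE.dic and ka_GE.aff
    local dest="${2:-/usr/share/hunspell}"  # target directory from the step above
    local f
    # Refuse to copy a half-complete dictionary: both files are required.
    for f in ka_GE.dic ka_GE.aff; do
        if [ ! -f "$src/$f" ]; then
            echo "missing $src/$f" >&2
            return 1
        fi
    done
    mkdir -p "$dest" && cp "$src/ka_GE.dic" "$src/ka_GE.aff" "$dest/"
}

# Example: install into a directory you own.
#   install_ka_ge dictionaries "$HOME/hunspell-dicts"
```

Writing to /usr/share/hunspell/ usually requires root, so a plain `sudo cp dictionaries/ka_GE.dic dictionaries/ka_GE.aff /usr/share/hunspell/` does the same job. Afterwards, `hunspell -D` should list ka_GE among the available dictionaries.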

Data sources

Word lists from the following people and sources are used to generate the dictionary:

Thanks a lot for your awesome work!

Update/build dictionary

You need a Bash-compatible shell, GNU tools, hunspell (and hunspell-tools on some systems) and a C++14-compatible compiler installed. xmunch (https://github.com/gamag/xmunch) is included as a submodule, so after cloning this repository, run git submodule update --init, then go to the xmunch subdirectory and run make.
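
Before building, it may help to check that the tools are on your PATH. The helper below is a sketch: missing_tools is a hypothetical name, and the command names in the example call are assumptions (your C++14 compiler may be g++, clang++ or something else).

```shell
# Sketch: report which of the given commands are not on PATH.
# missing_tools is a hypothetical helper name, not part of the repository.
missing_tools() {
    local tool
    local missing=""
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name.
        command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing" >&2
        return 1
    fi
}

# Example call with assumed command names:
#   missing_tools bash make git hunspell g++
```

The function returns a non-zero status when anything is missing, so it can gate the build, for example `missing_tools bash make git hunspell g++ && make all`.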

To build the dictionary, run make all

To build the packages for Firefox and OpenOffice, run make bundle afterwards.
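
Put together, the build steps might look like the sketch below. The names run and build_ka_ge and the DRY_RUN variable are made up for illustration; `make -C xmunch` is used here as a stand-in for changing into the xmunch subdirectory and running make.

```shell
# Sketch of the build sequence. Set DRY_RUN=1 to print the commands
# instead of executing them. run and build_ka_ge are hypothetical names.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_ka_ge() {
    # Run this from the root of a ka_GE.spell clone.
    run git submodule update --init &&  # fetch the xmunch submodule
    run make -C xmunch &&               # build xmunch in its subdirectory
    run make all &&                     # build the dictionary
    run make bundle                     # package for Firefox and OpenOffice
}

# Example:
#   DRY_RUN=1; build_ka_ge    # preview the commands
```

Chaining with && stops at the first failing step, so make bundle never runs against a dictionary that failed to build.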

Updating Bumbeishvili's word list

NOTE: The word list is included in words/, so you don't need these steps to work on the dictionary.

You need a running mysql server and git.

Clone this repository

Log into mysql and add a user and a database for the word list:

$ mysql -uroot -p
mysql> CREATE DATABASE geowords;
mysql> CREATE USER 'geowords'@'localhost' IDENTIFIED BY 'password';
mysql> GRANT ALL PRIVILEGES ON geowords.* TO 'geowords'@'localhost';
mysql> FLUSH PRIVILEGES;

Create a file called dbaccess in the ka_GE.spell root, containing:

DBNAME=geowords
DBUSER=geowords
DBPASS=password

Run make db.
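
The dbaccess step can also be scripted. The sketch below only writes the three settings shown above. The helper name write_dbaccess is hypothetical, and restricting the file to its owner is a suggestion, not something the build requires.

```shell
# Sketch: write the dbaccess file with the three settings from the steps above.
# write_dbaccess is a hypothetical helper name, not part of the repository.
write_dbaccess() {
    local dir="${1:-.}"          # ka_GE.spell root
    local pass="${2:-password}"  # placeholder; use the password you chose in MySQL
    (
        umask 077                # the file holds a password: owner-only access
        printf 'DBNAME=%s\nDBUSER=%s\nDBPASS=%s\n' geowords geowords "$pass" \
            > "$dir/dbaccess"
    )
}

# Example, from the repository root:
#   write_dbaccess . 'password' && make db
```

The subshell keeps the umask change local, so files created later in the same session keep their usual permissions.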

Remarks

The automatically created dictionary is not very accurate: some words may be wrong and many are missing. To improve it, words from the dictionary can be reviewed, and correct words added to the reviewed dictionary in their final, affix-compressed form. Wrong words can be added to the blacklist.
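
To illustrate the idea behind a blacklist, the sketch below removes blacklisted entries from a plain word list. It is not the repository's actual build logic: filter_blacklist and both file arguments are hypothetical, and both files are assumed to hold one word per line.

```shell
# Sketch: drop blacklisted words from a word list (one word per line).
# filter_blacklist is a hypothetical helper name, not part of the repository.
filter_blacklist() {
    local words="$1"
    local blacklist="$2"
    # -F: fixed strings, -x: match whole lines, -v: invert the match,
    # -f: read the patterns (blacklisted words) from a file.
    grep -vxFf "$blacklist" "$words" | LC_ALL=C sort -u
}

# Example with hypothetical file names:
#   filter_blacklist candidate-words.txt blacklist.txt > cleaned-words.txt
```

Whole-line matching means that blacklisting one misspelling does not remove longer words that merely contain it.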

Contributing

Any help is very welcome, especially reviewing the dictionary and improving the affix files.

TODO: translate README to Georgian.