Closed (luben closed this issue 12 years ago)
Thanks for reporting the issue! Yes, if you can provide the dictionaries (or a link), that would be great. I've found some Bulgarian dictionaries at http://lasr.cs.ucla.edu/geoff/ispell-dictionaries.html but it'd be nice to have the same version.
I think I've found the bug: one of the affix fields was not copied properly, so the behavior was quite random. I've tested it with the Bulgarian dictionaries from the ucla.edu site, and the shared dictionary now returns {книга}, just like the plain ispell dictionary. Can you check that it works for you?
It works now without problems. Thanks a lot!
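One way to run that check is to lexize the same words with both dictionaries and compare the results side by side. This is only a sketch: it assumes the dictionary names bulgarian_ispell and bulgarian_ispell_shared from the transcript in the original report, and the word list is illustrative. After installing a fixed build, it may also be worth calling shared_ispell_reset() first, as the report does, so that the dictionary is reloaded into shared memory.

```sql
-- Compare the plain and the shared dictionary on a few sample words.
-- Dictionary names are taken from the report's transcript; the words
-- are examples only and should be replaced with real test cases.
SELECT word,
       ts_lexize('bulgarian_ispell', word)        AS plain_ispell,
       ts_lexize('bulgarian_ispell_shared', word) AS shared_ispell,
       -- IS NOT DISTINCT FROM treats two NULL results as equal
       ts_lexize('bulgarian_ispell', word)
           IS NOT DISTINCT FROM
       ts_lexize('bulgarian_ispell_shared', word) AS same_result
FROM unnest(ARRAY['КНИГИ', 'книга', 'книги']) AS t(word);
```

Any row where same_result is false points to a word that the two templates handle differently.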
Great work!
I have tried to use it, but I have a problem with the Bulgarian dictionaries from hunspell: they work with the "ispell" template but not with "shared_ispell". Here is a transcript of a session that shows the problem:
psql91 (9.1.2)
Type "help" for help.

-- create normal ispell dict
db=> DROP TEXT SEARCH DICTIONARY IF EXISTS bulgarian_ispell;
DROP TEXT SEARCH DICTIONARY
Time: 1,540 ms
db=>
db=> CREATE TEXT SEARCH DICTIONARY bulgarian_ispell (
db(>     TEMPLATE = ispell,
db(>     DictFile = bg_bg,
db(>     AffFile = bg_bg,
db(>     StopWords= bulgarian
db(> );
CREATE TEXT SEARCH DICTIONARY
Time: 438,533 ms
-- shared ispell dictionary
db=> DROP TEXT SEARCH DICTIONARY IF EXISTS bulgarian_ispell_shared;
NOTICE: text search dictionary "bulgarian_ispell_shared" does not exist, skipping
DROP TEXT SEARCH DICTIONARY
Time: 1,577 ms
db=>
db=> CREATE TEXT SEARCH DICTIONARY bulgarian_ispell_shared (
db(>     TEMPLATE = shared_ispell,
db(>     DictFile = bg_bg,
db(>     AffFile = bg_bg,
db(>     StopWords= bulgarian
db(> );
CREATE TEXT SEARCH DICTIONARY
Time: 1,908 ms
db=> commit;
COMMIT
Time: 1,372 ms
db=> select shared_ispell_reset();
 shared_ispell_reset

(1 row)

Time: 124,997 ms
-- tests
db=> SELECT ts_lexize('bulgarian_ispell', 'КНИГИ');
 ts_lexize
 {книга}
(1 row)

Time: 511,633 ms
db=> SELECT ts_lexize('bulgarian_ispell_shared', 'КНИГИ');
 ts_lexize

(1 row)

Time: 457,093 ms
db=> select shared_ispell_mem_used();
 shared_ispell_mem_used
(1 row)
-- end
I have also tried Russian dictionaries and they work fine, so the Cyrillic alphabet is not to blame.
In postgresql.conf, I have these GUCs related to ispell:
shared_preload_libraries = 'shared_ispell'    # (change requires restart)
custom_variable_classes = 'shared_ispell'     # list of custom variable class names
shared_ispell.max_size = 209715200            # 200MB
How can I debug the problem? I could send you the dictionaries so you can try it yourself.
Thanks in advance,
luben