morfologik / morfologik-stemming

Tools for finite state automata construction and dictionary-based morphological dictionaries. Includes Polish stemming dictionary.
BSD 3-Clause "New" or "Revised" License

simple regexp for replacement-pairs #38

Open danielnaber opened 9 years ago

danielnaber commented 9 years ago

I noticed the Danish (and other) hunspell dictionaries have REP statements like these in their .aff file:

REP ^hen hen_ #henover -> hen over
REP ^påny$ på_ny

Morfologik doesn't seem to support ^ (match at the start of the word), $ (match at the end of the word) and _ (stands for a space in the replacement) in its replacement-pairs feature. It would be nice if these could be added so more hunspell dictionaries could be ported to Morfologik without loss of quality in suggestions.
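
To make the request concrete, here is a minimal sketch of how such REP pairs could be interpreted. It is not Morfologik's API or hunspell's actual implementation; the helper names `compile_rep` and `candidates` are hypothetical, and Python is used only for brevity. It assumes `^` and `$` are meaningful only as the first and last character of the pattern, and that `_` in the replacement stands for a space.

```python
import re

def compile_rep(pattern, replacement):
    """Turn one hunspell-style REP pair into (compiled regex, replacement text).

    Assumption: '^' is only special as the first character of the pattern
    and '$' only as the last; everything else is matched literally.
    """
    starts = pattern.startswith("^")
    ends = pattern.endswith("$")
    core = pattern
    if starts:
        core = core[1:]
    if ends:
        core = core[:-1]
    regex = ("^" if starts else "") + re.escape(core) + ("$" if ends else "")
    # Assumption: '_' in the replacement means a space.
    return re.compile(regex), replacement.replace("_", " ")

def candidates(word, rules):
    """Return one suggestion candidate per place where a rule matches."""
    out = []
    for regex, repl in rules:
        for m in regex.finditer(word):
            out.append(word[:m.start()] + repl + word[m.end():])
    return out

# The two Danish REP lines quoted above.
rules = [compile_rep("^hen", "hen_"), compile_rep("^påny$", "på_ny")]

print(candidates("henover", rules))    # ['hen over']
print(candidates("påny", rules))       # ['på ny']
print(candidates("sammenhen", rules))  # [] because '^hen' only applies at the word start
```

A real implementation would still have to check each candidate (or each space-separated part of it) against the dictionary before offering it as a suggestion.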

dweiss commented 4 years ago

Daniel, is this still an issue?

danielnaber commented 4 years ago

Probably, but I haven't checked recently whether it's really worth working on.

dweiss commented 4 years ago

Ok, let's leave it open then.

ghost commented 3 years ago

It really is an issue for Dutch, because Ij at the start of a word has to be replaced by IJ (in normal words).