jprante / elasticsearch-plugin-bundle

A bundle of useful Elasticsearch plugins
GNU Affero General Public License v3.0

How to use the decompounder? #3

Open cyclomarc opened 9 years ago

cyclomarc commented 9 years ago

I am looking for an example mapping file that shows how to use the decompounder. I am familiar with the ES dictionary_decompounder, but if I understand correctly, the plugin provides a decompounder that does not require a word list. My question is: what syntax should be used in the mapping file (filter, analyzer, tokenizer) so that the decompounder is applied during analysis?

Hope you can help,
Marc

jprante commented 9 years ago

I have added example configurations to the README:

https://github.com/jprante/elasticsearch-plugin-bundle/blob/master/README.md

It is not much, but it is at least a start.
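
For orientation, here is a minimal sketch of what such a configuration might look like, written as Python that builds the index settings and mapping as plain dictionaries and prints them as JSON. It assumes the plugin registers its token filter under the type name `decompound`; the names `decomp`, `text`, and both helper functions are hypothetical and chosen only for illustration. The README remains the authoritative source for the exact syntax and any further filter parameters.

```python
import json


def build_decompound_settings(filter_name="decomp", analyzer_name="decomp"):
    """Build index settings with a custom analyzer that applies the decompounder.

    Assumption: the plugin exposes its token filter as type "decompound".
    The filter and analyzer names are hypothetical placeholders.
    """
    return {
        "index": {
            "analysis": {
                "filter": {
                    # Token filter provided by the plugin (assumed type name).
                    filter_name: {"type": "decompound"}
                },
                "analyzer": {
                    analyzer_name: {
                        "type": "custom",
                        # Built-in Elasticsearch tokenizer.
                        "tokenizer": "standard",
                        # Built-in filters around the decompounder:
                        # lowercase first, then split compounds,
                        # then drop duplicate tokens at the same position.
                        "filter": ["lowercase", filter_name, "unique"],
                    }
                },
            }
        }
    }


def build_mapping(field="text", analyzer_name="decomp", field_type="text"):
    """Build a mapping that assigns the custom analyzer to one field.

    Older Elasticsearch versions use field_type="string" instead of "text".
    """
    return {
        "properties": {
            field: {"type": field_type, "analyzer": analyzer_name}
        }
    }


if __name__ == "__main__":
    body = {
        "settings": build_decompound_settings(),
        "mappings": build_mapping(),
    }
    print(json.dumps(body, indent=2))
```

The printed JSON is meant as the body of an index-creation request. Depending on the Elasticsearch version, the mapping may need to be nested under a document type name.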

The decompounder is based on the ASV toolbox

http://wortschatz.uni-leipzig.de/~cbiemann/software/toolbox/#_Baseforms

so it should somehow be possible to set up a "training environment" to create new parameter files for decompounding. The prebuilt binary files for German decompounding in the plugin are simply copied from the ASV toolbox.