Open cyclomarc opened 9 years ago
I have added example configurations to the README:
https://github.com/jprante/elasticsearch-plugin-bundle/blob/master/README.md
It is not much, but it is at least a start.
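For readers who want the general shape without opening the README, here is a minimal sketch of index settings that wire a decompounding token filter into a custom analyzer. It assumes the plugin registers its filter under the type name `decompound`; please check the README for the exact name and any options in your plugin version. The names `decomp` and `german_decompound` are made up for illustration. The `standard` tokenizer and the `unique` filter are stock Elasticsearch components; `unique` is only there to drop duplicate tokens in case the original word and its parts overlap.

```json
{
  "index": {
    "analysis": {
      "filter": {
        "decomp": {
          "type": "decompound"
        }
      },
      "analyzer": {
        "german_decompound": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "decomp", "unique"]
        }
      }
    }
  }
}
```

A field in the mapping would then reference the analyzer by name, for example with `"analyzer": "german_decompound"` on that field. No word list is configured anywhere in this sketch.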
The decompounder is based on the ASV toolbox
http://wortschatz.uni-leipzig.de/~cbiemann/software/toolbox/#_Baseforms
so it should be possible to set up a "training environment" to create new parameter files for decompounding. The prebuilt binary files for German decompounding in the plugin are simply copied from the ASV toolbox.
I am looking for an example mapping file that shows how to use the decompounder. I am familiar with the ES dictionary_decompounder, but if I understand correctly, the plugin provides a decompounder that does not require a word list. My question is: what syntax should be used in the mapping file (filter, analyzer, tokenizer) so that the decompounder is applied during analysis?
Hope you can help.
Marc