skroutz / elasticsearch-skroutz-greekstemmer

Greek Stemmer for elasticsearch
https://www.skroutz.gr

ElasticsearchIllegalArgumentException[failed to find token filter type [skroutz_stem_greek] for [stem_greek]]; #2

Closed: mbouclas closed this issue 10 years ago

mbouclas commented 10 years ago

Version 1.1, index settings:

    "index":{
        "analysis":{
            "analyzer":{
                "analyzer_startswith":{
                    "tokenizer":"keyword",
                    "filter":"lowercase"
                },
                "prefix-test-analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter" : ["lowercase","stem_greek"]
                }
            },
            "filter" : {
                "mynGram" : {
                    "type" : "nGram",
                    "min_gram" : 2,
                    "max_gram" : 50
                },
                "stem_greek": {
                    "type":"skroutz_stem_greek"
                }
            },
            "tokenizer": {
                "prefix-test-tokenizer": {
                    "type": "path_hierarchy",
                    "delimiter": "."
                }
            }
        }
    }

mbouclas commented 10 years ago

Ignore it, ES needs a restart after plugin installation.

chief commented 10 years ago

ok thanks

mbouclas commented 10 years ago

If the plugin works, should it also work with accents? For example, would a search for φρα match φρά? In my searches it doesn't.

Thanks

chief commented 10 years ago

Input is expected to be casefolded for Greek (including folding of final sigma, ς, to sigma, σ), and with diacritics removed. This can be achieved with GreekLowerCaseFilter.

mbouclas commented 10 years ago

Are you referring to an ES filter or a Java class?

chief commented 10 years ago

ES filter

chief commented 10 years ago

Please keep in mind that we do not test our plugins against newer Elasticsearch versions (>1.0.0); currently we support up to 0.90.3.

mbouclas commented 10 years ago

So what you mean is something like "filter" : ["myGreekLowerCaseFilter","stem_greek"], where myGreekLowerCaseFilter is defined as "myGreekLowerCaseFilter": { "type": "lowercase", "language": "greek" }?
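
Spelled out against the index settings posted at the top of this issue, that would look roughly like the fragment below. This is only a sketch: it assumes the Greek lowercase filter replaces the plain lowercase filter in prefix-test-analyzer, and it leaves out the other analyzer, filter, and tokenizer definitions for brevity.

```json
{
  "index": {
    "analysis": {
      "analyzer": {
        "prefix-test-analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["myGreekLowerCaseFilter", "stem_greek"]
        }
      },
      "filter": {
        "myGreekLowerCaseFilter": {
          "type": "lowercase",
          "language": "greek"
        },
        "stem_greek": {
          "type": "skroutz_stem_greek"
        }
      }
    }
  }
}
```

The order in the "filter" array matters here: the Greek lowercase filter has to run before stem_greek so that the stemmer receives casefolded, accent-free input.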

chief commented 10 years ago

seems good to me.

mbouclas commented 10 years ago

Thanks a lot, it works fine now