Open ThaDafinser opened 7 years ago
For auto_phrase, this is what I have found so far (I could not get it working):
GET _analyze
{
  "tokenizer": "standard",
  "filter": [
    {
      "type": "auto_phrase",
      "phrases": [
        "C:/Data/test.txt"
      ]
    }
  ],
  "text": "what is my income tax refund this year now that my property tax is so high"
}
https://github.com/jprante/elasticsearch-plugin-bundle/blob/68dc19c34c40364e04400f92500b973a6cbae170/src/main/java/org/xbib/elasticsearch/index/analysis/autophrase/AutoPhrasingTokenFilterFactory.java
Hi,
In addition to the original issue, LemmatizeTokenFilter lacks a description too. I would appreciate any info on how to configure it, on the supported languages, and on what is behind this plugin.
To me this plugin looks similar to the baseform plugin. From skimming through the code, my guess is that the lemmatizer replaces the original word, while the baseform filter adds the generated form alongside the original.
Thanx
@nkrot in general you gave the answer.
I updated the gist with an example. Like you said, it just keeps the base form and removes the original word:
GET _analyze
{
  "tokenizer": "standard",
  "filter": [
    {
      "type": "lemmatize",
      "language": "de"
    }
  ],
  "text": "Ich gehe gerne mit meinen neuen Schuhen"
}
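To make the difference discussed above concrete, here is a minimal Python sketch of the two behaviours: "replace the token with its base form" (lemmatize-style) versus "keep the token and add the base form next to it" (baseform-style). This is not the plugin's code. The tiny dictionary, the function names, and the choice to keep unknown tokens unchanged are all assumptions made for illustration only.

```python
# Toy illustration only. TOY_LEMMAS is a hypothetical hand-written dictionary,
# not the plugin's data, and these functions are not the plugin's implementation.
TOY_LEMMAS = {
    "gehe": "gehen",
    "meinen": "mein",
    "neuen": "neu",
    "Schuhen": "Schuh",
}


def lemmatize_style(tokens, lemmas=TOY_LEMMAS):
    """Replace each token with its base form when one is known.

    Assumption: tokens without a dictionary entry are passed through unchanged.
    """
    return [lemmas.get(token, token) for token in tokens]


def baseform_style(tokens, lemmas=TOY_LEMMAS):
    """Keep every original token and add its base form right after it."""
    out = []
    for token in tokens:
        out.append(token)
        base = lemmas.get(token)
        if base is not None and base != token:
            out.append(base)
    return out


tokens = "Ich gehe gerne mit meinen neuen Schuhen".split()
print(lemmatize_style(tokens))
print(baseform_style(tokens))
```

With this toy dictionary, the first call returns only base forms for the known words, while the second returns each known word followed by its base form. What the real filters emit depends on their dictionaries and settings.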
@ThaDafinser , thank you. Do you have any info on
thanx,
Sadly not yet.
The tests contain a lot of examples of how it should work: https://github.com/jprante/elasticsearch-plugin-bundle/blob/93ed7cb33b9c8095c279405467d4301422324655/src/test/java/org/xbib/elasticsearch/index/analysis/lemmatize/LemmatizeTokenFilterTests.java#L113
LemmatizeTokenFilter is still a work in progress, in an experimental stage. It is intended as an alternative to a synonym token filter (https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-synonym-tokenfilter.html), but based on a language-specific dictionary of known compound words.
After going through a lot of examples, code and so on...
I think the best approach would be to create something like this: https://github.com/ThaDafinser/elasticsearch-plugin-bundle/blob/feature/doc/docs/index.md
There are too many things to explain for a "one pager" (or for putting everything in the README), and with this approach the documentation can be created step by step.
As mentioned at the end, it is similar to the structure of the ES reference guide: https://www.elastic.co/guide/en/elasticsearch/reference/5.3/index.html
@jprante what do you think? If you like it, I will add some more pages and create a PR for this one.
Hello,
I have now tried to complete the examples for Kibana; see https://gist.github.com/ThaDafinser/d27b4fa9d144b0083ee7dad37484fdd8
For the examples, I went through the complete plugin list: https://github.com/jprante/elasticsearch-plugin-bundle#a-plugin-bundle-for-elastisearch
For those plugins I couldn't find docs (@jprante, could you help me here, please?)
Other missing examples for now (I could not create a "live" example yet):
_analyze API for icu_collation
Is anything else missing? When they are finished, do you want them in the README or in a separate file?