Closed ilanrivers closed 6 years ago
I'm afraid I don't understand what you're asking. Could you expand on your question, perhaps with JSON examples?
I will try to give a clear example. Let's say you fill an index with three words:
POST /test_es_suggest/type
{
  "word": "Software"
}
POST /test_es_suggest/type
{
  "word": "Developer"
}
POST /test_es_suggest/type
{
  "word": "Programmer"
}
Now I use the Suggest API as follows:
GET /test_es_suggest/_suggest
{
  "Suggest": {
    "term": {
      "field": "word",
      "suggest_mode": "popular",
      "size": 2,
      "prefix_len": 1,
      "analyzer": "default"
    },
    "text": "Softwarer"
  }
}
I receive the following result, which is fine: "software" was misspelled and has now been corrected. The options array is filled with the suggested word:
"Suggest": [
  {
    "text": "softwarer",
    "offset": 0,
    "length": 9,
    "options": [
      {
        "text": "software",
        "score": 0.875,
        "freq": 1
      }
    ]
  }
]
Now if I search for the correctly spelled word:
GET /test_es_suggest/_suggest
{
  "Suggest": {
    "term": {
      "field": "word",
      "suggest_mode": "popular",
      "size": 2,
      "prefix_len": 1,
      "analyzer": "default"
    },
    "text": "Software"
  }
}
The result is as follows:
"Suggest": [
  {
    "text": "software",
    "offset": 0,
    "length": 8,
    "options": []
  }
]
The options array is empty, so I assume there is nothing to suggest. Or was the word simply not found?
If I search for "softwareasdfasdf" I get exactly the same kind of response, and in that case I know the reason is that there is no suitable word to suggest.
"Suggest": [
  {
    "text": "softwareasdfasdf",
    "offset": 0,
    "length": 16,
    "options": []
  }
]
In the example with "Software" I would like to know that the word was correct so there is nothing to be suggested. In my opinion there is a clear difference in the reason both responses show an empty "options" array. One word cannot be found and the other word is already correct.
So my question is, can something be added to the response so I know the difference?
What about just setting the suggest_mode to always?
See https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-term.html
According to the documentation, suggest mode always means: "Suggest any matching suggestions based on terms in the suggest text."
But if I use suggest mode always, I still do not know whether the word was found or not. I still receive an empty options array after changing the mode to always, so it seems this does not guarantee that you receive a suggestion.
GET /test_es_suggest/_suggest
{
  "Suggest": {
    "term": {
      "field": "word",
      "suggest_mode": "always",
      "size": 2,
      "prefix_len": 1,
      "analyzer": "default"
    },
    "text": "softwaredev"
  }
}
Is the question clear now? @clintongormley
Is this issue going to be addressed? It seems that a field could be added to indicate whether the search term is an exact match for a word in the index, and that would solve the problem.
Indeed. It would be very useful to know why no suggestion was returned exactly. :+1:
It would be very helpful for me too to see the difference between a term that is correctly spelled and a term for which there is no suggestion!
Might be simplifying here but my experience is that the term suggester and phrase suggester can be valuable with little custom effort. They are good at suggesting corrections so if the data has 'virtualize' and I type 'virtualise' I'll get a useful suggestion. The suggest_mode 'always' doesn't seem to work as I'd expect so when a user types a term that is good the term suggester can't offer feedback.
I am in the process of working around this by extracting interesting terms from the main index and generating a special index for use with a completion suggester. This is going pretty well, but it seems strange that the term/phrase suggesters can do the harder job of suggesting corrections yet can't provide results that say: yes, 'virtuali' is a term that can be completed with 'virtualize'.
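The workaround described above could be sketched roughly as follows. This is only an illustration, not the commenter's actual setup: the field name `suggest`, the suggestion name `term_completion`, and the helper functions are hypothetical, and the request shapes assume a recent Elasticsearch version in which mappings have no type level and suggestions are sent through `_search`.

```python
from typing import Any, Dict, List


def completion_mapping(field: str = "suggest") -> Dict[str, Any]:
    """Index-creation body with a single completion field (hypothetical field name)."""
    return {"mappings": {"properties": {field: {"type": "completion"}}}}


def completion_docs(terms: List[str], field: str = "suggest") -> List[Dict[str, Any]]:
    """One document per distinct term extracted from the main index."""
    return [{field: {"input": [term]}} for term in sorted(set(terms))]


def completion_query(prefix: str, field: str = "suggest",
                     name: str = "term_completion") -> Dict[str, Any]:
    """Suggest body asking for completions of `prefix`."""
    return {"suggest": {name: {"prefix": prefix, "completion": {"field": field}}}}


if __name__ == "__main__":
    print(completion_mapping())
    print(completion_docs(["virtualize", "virtualize", "software"]))
    print(completion_query("virtuali"))
```

The bodies would still have to be sent to the cluster with whatever client is in use; the sketch only builds the JSON.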
The term and phrase suggesters do not return corrections based on the absence of a term; they use the statistics of the index to decide whether a term should be corrected. The fact that a term appears once in a single shard is not an indication that the term is correctly spelled. There are ways to improve the precision of these suggesters: for instance, real_word_error_likelihood can be set to 1 to indicate that all terms in the dictionary are correctly spelled. I would not advise doing that if you are indexing random pages from the web, but it can be a solution if you don't want to return suggestions for terms that already appear in the dictionary.
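As a rough illustration of the setting mentioned above, a phrase suggester request with real_word_error_likelihood set to 1 might be built like this. The helper name is hypothetical, the field `word` and the suggestion name `Suggest` are borrowed from the earlier example in this thread, and the outer `suggest` wrapper assumes the request goes through `_search` rather than the older `_suggest` endpoint.

```python
from typing import Any, Dict


def phrase_suggest_body(text: str, field: str = "word",
                        likelihood: float = 1.0,
                        name: str = "Suggest") -> Dict[str, Any]:
    """Build a phrase suggester body; `likelihood` maps to real_word_error_likelihood."""
    return {
        "suggest": {
            name: {
                "text": text,
                "phrase": {
                    "field": field,
                    # Value taken from the comment above; tune it for your own corpus.
                    "real_word_error_likelihood": likelihood,
                },
            }
        }
    }


if __name__ == "__main__":
    print(phrase_suggest_body("Software"))
```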
I am going to close this issue because the goal of these suggesters is not to determine if a term (or phrase) is present in the index but rather try to find a better query that would statistically be more relevant for the targeted corpus.
When suggestions are found, you see them in the options of the result. But when a term is spelled correctly, the options are also empty.
So there is no way to really know whether the suggester was unable to find a matching suggestion or the term was spelled correctly to begin with.
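Since the suggester response alone does not separate the two cases, one possible client-side workaround is to pair it with a separate existence check. The sketch below is an assumption, not a built-in feature: `term_exists_query` and `classify` are hypothetical helpers, and the hit count is assumed to come from a separate request (for example the count API) run against the same field.

```python
from typing import Any, Dict


def term_exists_query(field: str, word: str) -> Dict[str, Any]:
    """Body for a count or search request checking whether `word` occurs in `field`."""
    # A match query analyzes its input, so "Software" can match an indexed "software".
    return {"query": {"match": {field: word}}}


def classify(entry: Dict[str, Any], hits: int) -> str:
    """Combine one suggest entry with the hit count from the existence check."""
    if entry.get("options"):
        return "corrected"            # the suggester proposed at least one alternative
    if hits > 0:
        return "already_in_index"     # nothing to suggest because the term exists
    return "no_suggestion"            # term unknown and no candidate found


if __name__ == "__main__":
    print(term_exists_query("word", "Software"))
    print(classify({"text": "software", "options": []}, hits=1))
    print(classify({"text": "softwareasdfasdf", "options": []}, hits=0))
```

This costs one extra request per lookup, and a term being present in the index says nothing about whether it is spelled correctly, as noted earlier in the thread.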