opensearch-project / opensearch-java

Java Client for OpenSearch
Apache License 2.0

Porter Stem Filter not working in some cases #954

Open nileshshroff opened 6 months ago

nileshshroff commented 6 months ago

What is the bug?

In some cases, a search that uses stemmed terms returns no results.

For example, the following text is indexed with the porter_stem filter: "price increases daily for the last month or so and this is not even the breaking news"

Searching for "price increas daili" returns 0 hits.
Searching for "break new" returns 1 hit.

The following steps show how it can be reproduced.

How can one reproduce the bug?

Create an index with a field that uses the porter_stem filter:

PUT /index_with_stemissue
{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 1
    },
    "analysis": {
      "analyzer": {
        "stemanalyzer": {
          "type": "custom",
          "tokenizer": "letter",
          "filter": [ "lowercase", "porter_stem" ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "message_text": {
        "type": "text",
        "analyzer": "stemanalyzer",
        "store": true
      }
    }
  }
}

Run _analyze against the field to see the stemmed tokens:

GET /index_with_stemissue/_analyze
{
  "field": "message_text",
  "text": "price increases daily for the last month or so and this is not even the breaking news"
}
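It may also help to send the already-stemmed search phrase through the same _analyze call and compare its tokens with those of the original text. Below is a minimal Java sketch that does this with the JDK's built-in java.net.http client (Java 11+) rather than opensearch-java, so the result does not depend on the client library. The class name AnalyzeProbe, the localhost URL, and the absence of authentication are assumptions for illustration; an AWS-hosted domain would need its own endpoint and request signing.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class AnalyzeProbe {
    // Assumption: an unsecured local cluster. Replace with your own endpoint.
    static final String BASE_URL = "http://localhost:9200";

    // Escapes backslashes and double quotes so the value is a valid JSON string body.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds the same body as the _analyze request shown in the issue.
    static String analyzeBody(String field, String text) {
        return "{\"field\":\"" + escape(field) + "\",\"text\":\"" + escape(text) + "\"}";
    }

    // _analyze accepts POST as well as GET; POST is used because it carries a body.
    static HttpRequest analyzeRequest(String index, String field, String text) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + "/" + index + "/_analyze"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(analyzeBody(field, text)))
                .build();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        String[] texts = {"price increases daily", "price increas daili"};
        for (String text : texts) {
            HttpResponse<String> response = client.send(
                    analyzeRequest("index_with_stemissue", "message_text", text),
                    HttpResponse.BodyHandlers.ofString());
            // Prints the raw JSON token list returned by the cluster.
            System.out.println(text + " -> " + response.body());
        }
    }
}
```

If the two token lists differ, the mismatch comes from analysis on the server side and not from the Java client.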

Add the document to the index

POST /index_with_stemissue/_doc/
{
  "message_text": "price increases daily for the last month or so and this is not even the breaking news"
}

This query_string search, which uses the stemmed words, does not work. No results are returned:

GET index_with_stemissue/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "message_text:\"price increas daili\""
          }
        }
      ]
    }
  }
}

However, another search that uses stemmed words works fine:

GET index_with_stemissue/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "message_text:\"break new\""
          }
        }
      ]
    }
  }
}
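One possible explanation, not confirmed anywhere in this thread: query_string runs the quoted phrase through the field's analyzer, so search terms that are already stemmed get stemmed a second time. The Java sketch below illustrates the idea with only step 1a of the Porter algorithm (plural stripping: SSES -> SS, IES -> I, SS -> SS, S -> removed). It is not the Lucene implementation, the class name RestemDemo is hypothetical, and it is an assumption that the remaining Porter steps leave these particular tokens unchanged. Running _analyze on the search phrase is the way to check.

```java
import java.util.ArrayList;
import java.util.List;

final class RestemDemo {
    // Porter step 1a only. All other steps of the algorithm are omitted.
    static String step1a(String word) {
        if (word.endsWith("sses")) return word.substring(0, word.length() - 2); // sses -> ss
        if (word.endsWith("ies")) return word.substring(0, word.length() - 2);  // ies -> i
        if (word.endsWith("ss")) return word;                                   // ss -> ss
        if (word.endsWith("s")) return word.substring(0, word.length() - 1);    // s -> (removed)
        return word;
    }

    // Applies step 1a to every term, as a query-time analyzer would do to each search term.
    static List<String> restem(List<String> terms) {
        List<String> out = new ArrayList<>();
        for (String term : terms) {
            out.add(step1a(term));
        }
        return out;
    }

    public static void main(String[] args) {
        // "increas" ends in a single "s", so it is shortened again.
        System.out.println(restem(List.of("price", "increas", "daili"))); // prints [price, increa, daili]
        // Neither term ends in "s", so both survive a second pass.
        System.out.println(restem(List.of("break", "new")));              // prints [break, new]
    }
}
```

Under these assumptions, the first phrase would be searched as "price increa daili" and would no longer line up with an indexed token "increas", while "break new" is unaffected by a second pass.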

What is the expected behavior?

When searching for "price increas daili", 1 result should be returned.

What is your host/environment?

AWS Service OpenSearch version 1.1

wbeckler commented 6 months ago

Is this due to an issue with the opensearch-java client, or should this issue be moved to https://github.com/opensearch-project/OpenSearch?

dblock commented 6 months ago

@nileshshroff Can you reproduce this with curl? Which versions of OpenSearch and the client are you using? Otherwise, can you please post your Java code?