eclipse / jnosql

Eclipse JNoSQL is a framework which has the goal to help Java developers to create Jakarta EE applications with NoSQL.

[BUG] Elasticsearch's term query may return no results when searching text fields #386

Closed dearrudam closed 1 year ago

dearrudam commented 1 year ago

Which JNoSQL project the issue refers to?

JNoSQL Databases

Bug description

By default, Elasticsearch changes the values of text fields as part of analysis. This can make finding exact matches for text field values difficult. More info here: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-term-query.html.

Today, every JNoSQL EQUALS query condition is converted to an Elasticsearch term query condition, so exact-value queries on text fields do not work as expected.
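To illustrate why this happens, here is a minimal Java sketch. It is not Elasticsearch code: the `analyze` method is a rough, assumed stand-in for the default analysis of text fields (split into words, lowercase each one), and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

final class TermVsTextDemo {

    // Rough stand-in for text-field analysis (assumption): split on
    // non-alphanumeric characters and lowercase every token.
    static List<String> analyze(String value) {
        return Arrays.stream(value.split("[^\\p{L}\\p{N}]+"))
                .filter(token -> !token.isEmpty())
                .map(token -> token.toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    // A term query compares the search value, unanalyzed, with the indexed tokens.
    static boolean termMatches(List<String> indexedTokens, String searchValue) {
        return indexedTokens.contains(searchValue);
    }

    public static void main(String[] args) {
        List<String> textTokens = analyze("Maria Lovelace");     // text field
        List<String> keywordTokens = List.of("Maria Lovelace");  // keyword field

        System.out.println(textTokens);                                   // [maria, lovelace]
        System.out.println(termMatches(textTokens, "Maria Lovelace"));    // false
        System.out.println(termMatches(keywordTokens, "Maria Lovelace")); // true
    }
}
```

The text field only holds the tokens `maria` and `lovelace`, so a term lookup for the original string finds nothing, while a keyword field keeps the value intact.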

To solve this, the target fields must be mapped with the keyword type.

If the index has not been created yet, we can provide a JSON file named after the target index on the classpath. The JNoSQL implementation uses this file to configure the target index, so the file should contain the whole mapping information.
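As a rough sketch of that lookup convention (not the actual JNoSQL code), the snippet below resolves a classpath resource named after the index. The class `IndexMappingLoader` and its methods are hypothetical names introduced only for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

final class IndexMappingLoader {

    // Assumed convention: the file is called "<index name>.json".
    static String resourceName(String index) {
        return index + ".json";
    }

    // Returns the mapping JSON if the resource is on the classpath, else empty.
    static Optional<String> load(String index) {
        ClassLoader loader = IndexMappingLoader.class.getClassLoader();
        try (InputStream in = loader.getResourceAsStream(resourceName(index))) {
            if (in == null) {
                return Optional.empty(); // no mapping file provided for this index
            }
            return Optional.of(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Prints the mapping if developers.json is on the classpath.
        System.out.println(load("developers").orElse("no mapping file found"));
    }
}
```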

If an existing index is going to be used, its mapping must already be correct; otherwise, the mapping has to be managed directly through the Elasticsearch API.
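One way to check an existing index is to read its mapping from the Elasticsearch `_mapping` endpoint. The sketch below uses the JDK HTTP client and assumes a local instance at `http://localhost:9200` with security disabled and an index named `developers`; the class name `MappingInspector` is hypothetical.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class MappingInspector {

    // Builds GET <baseUrl>/<index>/_mapping.
    static HttpRequest mappingRequest(String baseUrl, String index) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_mapping"))
                .GET()
                .build();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response = client.send(
                mappingRequest("http://localhost:9200", "developers"), // assumed local setup
                HttpResponse.BodyHandlers.ofString());
        // The body is the mapping JSON; look for "type": "text" vs "type": "keyword".
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```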

The cheapest solution, for now, is to enhance the documentation with these details to help developers who want to use JNoSQL and Elasticsearch together.

However, to avoid mistakes and give better support, the JNoSQL implementation could be improved to figure out whether the field used in an EQUALS condition supports the Elasticsearch term query. With that in mind, here is a proposal:

Proposal

Once JNoSQL detects that the target field doesn't support the Elasticsearch term query condition, the EQUALS query condition will be converted to an Elasticsearch match query condition. Otherwise, the Elasticsearch term query condition will still be used as the default.
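A minimal sketch of that decision, assuming the field types are already known (for example, read from the index mapping). The names `EqualsQueryChooser` and `QueryKind`, and the rule that only `text` fields fall back to match, are assumptions for illustration and not the JNoSQL implementation.

```java
import java.util.Map;

final class EqualsQueryChooser {

    enum QueryKind { TERM, MATCH }

    // fieldTypes: hypothetical map of field name -> mapping type ("text", "keyword", ...).
    static QueryKind choose(Map<String, String> fieldTypes, String field) {
        String type = fieldTypes.get(field);
        // Analyzed text fields do not work well with term queries, so use match there.
        // Unknown fields and every other type keep term as the default.
        return "text".equals(type) ? QueryKind.MATCH : QueryKind.TERM;
    }

    public static void main(String[] args) {
        Map<String, String> fieldTypes = Map.of("name", "text", "@entity", "keyword");
        System.out.println(choose(fieldTypes, "name"));    // MATCH
        System.out.println(choose(fieldTypes, "@entity")); // TERM
        System.out.println(choose(fieldTypes, "age"));     // TERM (not in the mapping)
    }
}
```

Keep in mind that a match query analyzes the search value as well, so it is a full-text match rather than a strict equality check.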

Pros

Cons

Does anyone have thoughts about it?

JNoSQL Version

1.0.0-SNAPSHOT

Steps To Reproduce

  1. Use the elasticsearch module from the branch issue-365 of the forked repo: https://github.com/dearrudam/jnosql-demos-se;

  2. Delete the src/main/resources/developers.json file;

  3. Initialize an Elasticsearch instance with this docker command:

    docker run -p 9200:9200 -p 9300:9300 \
    -e "ES_JAVA_OPTS=-Xms1g -Xmx1g" \
    -e "xpack.security.enabled=false" \
    -e "discovery.type=single-node" \
    elasticsearch:8.7.1
  4. Execute src/main/java/org/jnosql/demo/se/App2.java. You'll see that the people list returned by repository.findByName("Maria Lovelace") is empty. This is the unexpected behavior.

  5. To confirm the issue, delete the index by running the following command:

    curl -X DELETE http://localhost:9200/developers
  6. Add src/main/resources/developers.json back with the required mapping info:

    {
      "mappings": {
        "properties": {
          "@entity": {
            "type": "keyword"
          },
          "names": {
            "type": "keyword"
          }
        }
      }
    }
  7. Run src/main/java/org/jnosql/demo/se/App2.java again. You'll see that the people list returned by repository.findByName("Maria Lovelace") is no longer empty and contains the expected record.

Expected Results

Make the API follow the Elasticsearch recommendation of avoiding term conditions on text fields. This will make the API behave as expected in most cases.

Code example, screenshot, or link to a repository

No response

otaviojava commented 1 year ago

That is nice, @dearrudam. Will you work on this one?

dearrudam commented 1 year ago

@otaviojava yes!!! I'll work on it!