opensearch-project / sql

Query your data using familiar SQL or intuitive Piped Processing Language (PPL)
https://opensearch.org/docs/latest/search-plugins/sql/index/
Apache License 2.0
120 stars 139 forks source link

Sql Like statement is not working with numbers and special characters #2032

Open HabooshHaddad opened 1 year ago

HabooshHaddad commented 1 year ago

What is the bug? sql jdbc is not working with Like operator when the value contains numbers or special characters.

How can one reproduce the bug? Steps to reproduce the behavior:

  1. Create an index with field of type text
  2. Fill the field with json data for example and make sure the values contains a mix of numbers and special characters ( ex. Jordan1234{a:a} )
  3. Run the below sql statement
    select id  
    from index_name
    where data LIKE 'Jordan1%'
  4. Notice that no data is retrieved
  5. In case I run the below statement
    select id  
    from index_name
    where data LIKE 'Jordan%'
  6. I will be getting the desired result and the item is retrieved

What is the expected behavior? When running the first sql statament, data should be retrieved

Yury-Fridlyand commented 1 year ago

Hi What is the mapping for the field you're using in the LIKE statement? How big it that index?

HabooshHaddad commented 1 year ago

Hello Yury, The field mapping is a "Text", and the index is 10 Million records

HabooshHaddad commented 1 year ago

Here is a clearer example I created the below index :

PUT sql_like
{
  "settings": {
    "analysis": {
      "analyzer": {
        "data_analyzer": {
          "type": "pattern",
          "pattern": "\\W+|(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)",
          "lowercase": true
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "field1": {
        "type": "text",
        "analyzer": "data_analyzer"
      }
    }
  }
}

POST sql_like/_doc
{
    "field1": "test1 test2 test3"
}

The below sql will not retrieve data

select id  
from index_name
where data LIKE 'test1%'

whille this sql will retrieve the item

select id  
from index_name
where data LIKE 'test%'

When adding an index without the data analyzer both queries works well, but I need the analyzer.

Yury-Fridlyand commented 1 year ago

Thanks for sharing infor @HabooshHaddad. I'm moving the bug to the right place.

chemeng commented 5 months ago

Are there any updates on this one?

LantaoJin commented 4 months ago

This is not an issue of SQL plugin. It doesn't work via DSL either:

GET sql_like/_search
{
  "query": {
    "wildcard": {
      "field1": {
        "value": "test1*",
        "case_insensitive": true
      }
    }
  }
}

But I think it is relevant to the pattern your analyzer set. If we change \\W+ to \\W*, select id from index_name where data LIKE 'test%' won't work either. So I think the issue could be moved to OpenSearch Core.