NLPchina / elasticsearch-sql

Use SQL to query Elasticsearch
Apache License 2.0
7k stars 1.54k forks source link

SELECT with LIKE converted _ symbol into ? #885

Open tigran-m-dev opened 5 years ago

tigran-m-dev commented 5 years ago

Query request

SELECT filed FROM index  WHERE field_2 LIKE 'a_b'

Explain responce

{
    "from": 0,
    "size": 1000,
    "query": {
        "bool": {
            "filter": [
                {
                    "bool": {
                        "must": [
                            {
                                "wildcard": {
                                    "field_2": {
                                        "wildcard": "a?b",
                                        "boost": 1
                                    }
                                }
                            }
                        ],
                        "adjust_pure_negative": true,
                        "boost": 1
                    }
                }
            ],
            "adjust_pure_negative": true,
            "boost": 1
        }
    },
    "_source": {
        "includes": [
            "filed"
        ],
        "excludes": []
    }
}

if I try to equal

SELECT filed FROM index  WHERE field_2 = 'a_b'

Explain output is ok

{
    "from": 0,
    "size": 1000,
    "query": {
        "bool": {
            "filter": [
                {
                    "bool": {
                        "must": [
                            {
                                "match_phrase": {
                                    "field_2": {
                                        "query": "a_b",
                                        "slop": 0,
                                        "zero_terms_query": "NONE",
                                        "boost": 1
                                    }
                                }
                            }
                        ],
                        "adjust_pure_negative": true,
                        "boost": 1
                    }
                }
            ],
            "adjust_pure_negative": true,
            "boost": 1
        }
    },
    "_source": {
        "includes": [
            "filed"
        ],
        "excludes": []
    }
}
shi-yuan commented 5 years ago

what‘s problem?

tigran-m-dev commented 5 years ago

@shi-yuan Here is the problem. When I am doing select like this

SELECT filed FROM index  WHERE field_2 LIKE 'a_b%'

It's converted to a?b* which means any symbol and not _ so it is converetd _ symbold into ? But in other symbols for example - it is ok if my query is

SELECT filed FROM index  WHERE field_2 LIKE 'a-b%'

It is converted to a-b*. So when you have a time please have a look why _ changed to ? when query with LIKE search Thanks

shi-yuan commented 5 years ago
SELECT filed FROM index  WHERE field_2 LIKE 'a&UNDERSCOREb%'
tigran-m-dev commented 5 years ago

Thank you, it is working.

Can you please help one more thing? How can we escape * and ? symbols? I tried standard backslash, but it is not working. LIKE a\\*b and same way LIKE a\\?b but it is not working and show all results between a and b

shi-yuan commented 5 years ago

you should use query_string:

select filed from index where q=query('field_2:a\\?b\\*')
tigran-m-dev commented 5 years ago

It is not working. Here is example. You have 2 doc

  1. asd
  2. a*d

your search query is

select filed from index where q=query('field_2:a\\*d')

This query result output all 2 docs and not only second one

shi-yuan commented 5 years ago

I test latest version,it's ok,just return a*d

tigran-m-dev commented 5 years ago

I am using elastic search version 6.6.0, plugin version current master (commit c430207) in doc have data

  1. asd
  2. a*d

executing query

select filed from index where q=query('filed:a\\*d')

response object

{
    "took": 285,
    "timed_out": false,
    "_shards": {
        "total": 4,
        "successful": 4,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 2,
        "max_score": 0,
        "hits": [
            {
                "_index": "index",
                "_type": "_doc",
                "_id": "test_doc_1",
                "_score": 0,
                "_source": {
                    "filed": "a*d"
                }
            },
            {
                "_index": "index",
                "_type": "_doc",
                "_id": "test_doc_2",
                "_score": 0,
                "_source": {
                    "filed": "asd"
                }
            }
        ]
    }
}

But if it works it should return only data a*d

shi-yuan commented 5 years ago

My explain:

{
  "from": 0,
  "size": 1000,
  "query": {
    "bool": {
      "filter": [
        {
          "bool": {
            "must": [
              {
                "query_string": {
                  "query": "filed:a\\*d",
                  "fields": [],
                  "type": "best_fields",
                  "default_operator": "or",
                  "max_determinized_states": 10000,
                  "enable_position_increments": true,
                  "fuzziness": "AUTO",
                  "fuzzy_prefix_length": 0,
                  "fuzzy_max_expansions": 50,
                  "phrase_slop": 0,
                  "escape": false,
                  "auto_generate_synonyms_phrase_query": true,
                  "fuzzy_transpositions": true,
                  "boost": 1
                }
              }
            ],
            "adjust_pure_negative": true,
            "boost": 1
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  },
  "_source": {
    "includes": [
      "filed"
    ],
    "excludes": []
  }
}
shi-yuan commented 5 years ago

What's yours?

lebronyang1 commented 4 years ago

我的内容jsoncontent:{"name":"test","page":88,"isNonProfit":true} 我想搜索:name是test; 要如何写? jsoncontent 是一个字符串filed,里面是压缩后的json字符串

shi-yuan commented 4 years ago

jsoncontent是keywor还是text的?