opensearch-project / sql

Query your data using familiar SQL or intuitive Piped Processing Language (PPL)
https://opensearch.org/docs/latest/search-plugins/sql/index/
Apache License 2.0
121 stars 140 forks source link

[BUG] Query with UNION operator fails with NullPointerException #2540

Open StewartWBrown1 opened 9 months ago

StewartWBrown1 commented 9 months ago

What is the bug? Present in OpenSearch 2.12.0, 2.11.0, 1.3.10 The documentation states that Union queries are supported. However, when attempting a simple union query, the request fails with:

{ "error": { "reason": "There was internal problem at backend", "details": "", "type": "NullPointerException" }, "status": 500 }

The query returns the expected result when ?format=csv is added to the request URL.

How can one reproduce the bug? Steps to reproduce the behavior:

  1. Index data into OpenSearch (for this example, I am using simple novels/song data)
  2. perform query involving union: { "query": "SELECT title FROM novels UNION SELECT title FROM songs" }
  3. NullPointerException error returned

What is the expected behavior? Union operator to work as expected. In above example adding the request call to be ''http://localhost:9200/_plugins/_sql?format=csv" returns:

title
Animal Farm
The Great Gatsby
Catcher In The Rye
Bohemian Rhapsody

What is your host/environment? Present in OpenSearch 2.12, 2.11 and 1.3.10

Do you have any screenshots? Error: image

As expected (When adding ?format=csv to request URL): image

Do you have any additional context? Add any other context about the problem.

penghuo commented 8 months ago

could u share the index mapping.

PUT index001/_doc/1
{
  "query_id" : "1"
}

PUT index002/_doc/1
{
  "query_id" : "1"
}

POST /_plugins/_sql?format=csv
{
  "query": "select query_id from index001 UNION select query_id from index002"
}

### result
query_id
1
1
StewartWBrown1 commented 8 months ago

@penghuo error occurs for me following same mapping you have used, but without the '?format=csv' query parameter at the end of the POST call, but my understanding is this shouldn't be necessary and a response should be returned without it as expected.

$ curl -X PUT -k -s -H "Content-Type: application/json" http://localhost:9200/index003/_doc/1 -u stxbro:$PASSWORD -d $'{"query_id":"1"}'
{"_index":"index003","_id":"1","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}
$ curl -X PUT -k -s -H "Content-Type: application/json" http://localhost:9200/index004/_doc/1 -u stxbro:$PASSWORD -d $'{"query_id":"2"}'
{"_index":"index004","_id":"1","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}
$ curl -X POST -k -s -H "Content-Type: application/json" http://localhost:9200/_plugins/_sql -u stxbro:$PASSWORD -d $'{"query": "select query_id from index003 UNION select query_id from index004"}'
{
  "error": {
    "reason": "There was internal problem at backend",
    "details": "",
    "type": "NullPointerException"
  },
  "status": 500
}

However, when adding the '?format=csv' query string a response is returned as expected:

$ curl -X POST -k -s -H "Content-Type: application/json" http://localhost:9200/_plugins/_sql?format=csv -u stxbro:$PASSWORD -d $'{"query": "select query_id from index003 UNION select query_id from index004"}'
query_id
1
2
StewartWBrown1 commented 8 months ago

Just to clarify, I see this only seems to be an issue with the default format (JDBC), and with 'raw' format.

A response is returned when using the 'format query parameter' to state the response format should be JSON or csv.

LantaoJin commented 5 months ago

Currently, the UNION statement is implemented in legacy engine(v1). It's a bug in legacy engine(v1). I am going to fix this issue in legacy engine, although UNION statement should be supported in new engine(v2) from long term perspective. As a fundamental syntax, this is worth fixing as quick as possible.