opensearch-project / sql

Query your data using familiar SQL or intuitive Piped Processing Language (PPL)
https://opensearch.org/docs/latest/search-plugins/sql/index/
Apache License 2.0
116 stars 134 forks source link

[BUG] unsupported operator: if when using sql API #2320

Open sudssf opened 11 months ago

sudssf commented 11 months ago

What is the bug? Opensearch version 2.8.0 we are migrating from elasticsearch to opensearch and planning to use nested sql queries to replace runtime field which were supported in elasticsearch sql API

after following documentation from https://opensearch.org/docs/2.0/search-plugins/sql/sql-ppl-api/#query-api

when I try to run sql query using if operator I get following exception { "error": { "reason": "There was internal problem at backend", "details": "unsupported operator: if", "type": "UnsupportedOperationException" }, "status": 503 }

same query works in query workbench. if I use nested query only group by fields are returned.

How can one reproduce the bug? Steps to reproduce the behavior:

  1. call _sql API to run query in jdbc format
  2. post sql query
  3. go to query workbench and verify query works

steps to replicate in dev console (queries works workbench)

POST _bulk
{ "index" : { "_index" : "test_bulk1", "_id" : "1" } }
{ "a": "a", "id": 1 }

GET test_bulk1/_search

POST /_plugins/_sql?format=jdbc
{
  "query": "SELECT IF(id=1, 1, 0 ) as id FROM test_bulk1    group by a",
  "fetch_size" : 10
}

POST /_plugins/_sql?format=jdbc
{
  "query": "SELECT *FROM (SELECT IF(id=1, 1, 0 ) as id FROM test_bulk1 group by a)as tmp",
  "fetch_size" : 10
}

What is the expected behavior? output from query workbench and SQL needs to be consistent

What is your host/environment?

Do you have any screenshots? N/A Do you have any additional context? N/A

sudssf commented 11 months ago

I found the issue since I want to group by data and also transform before group by , I need to move group by in outer nested query. I will test and reopen issue if I get same error

sudssf commented 11 months ago

I found multiple issues and created feature request to add support for scripted field https://github.com/opensearch-project/sql/issues/2325 . any guidance is appreciated. following is example which does not work

PUT /test_bulk3
{
  "mappings": {
    "properties": {
      "s_label": {
        "type": "keyword"
      },
      "seq":{
        "type": "integer"
      }
    }
  }
}

POST test_bulk3/_bulk
{ "index":{} }
{ "s_label": "a", "seq": 1 }
{ "index":{} }
{ "s_label": "b", "seq": 2 }

POST /_plugins/_sql?format=jdbc
{
  "query": "SELECT * FROM test_bulk3",
  "fetch_size" : 10
}

POST /_plugins/_sql?format=jdbc
{
  "query": "SELECT  IF(s_label = 'a', 1, 0), seq FROM test_bulk3",
  "fetch_size" : 10
}
sudssf commented 11 months ago

looks like if I remove fetch_size parameter I get the results, but no cursor is present if I add fetch_size parameter I get error similar to following

\n  \"error\": {\n    \"reason\": \"Invalid SQL query\",\n    \"details\": \"could not parse sqlBinaryOpExpr need to be identifier/valuable gotclass com.alibaba.druid.sql.ast.expr.SQLBinaryOpExpr with value:...  \"type\": \"SqlParseException\"\n  },\n  \"status\": 400\n}"

if fetch_size is the issue , I wonder if I can paginate using OFFSET,LIMIT at the end of the query?

sudssf commented 11 months ago

adding @Yury-Fridlyand for help as I see there PR created for pagination related changes https://github.com/opensearch-project/sql/pull/1752

Yuri do you know what is limitation of V1 LIMIT OFFSET,PAGE_SIZE approach for pagination?

background: we are migrating from ES 7.17 to Opensearch 2.8.0+ and facing issues with some of the queries which use pagination and runtime fields. I am trying to use nested SQL queries to overcome runtime fields

LantaoJin commented 3 months ago

The failed query in https://github.com/opensearch-project/sql/issues/2320#issue-1948592886 has 2 problems.

  1. SELECT IF(id=1, 1, 0 ) as id FROM test_bulk1 group by a is an invalid query which is missing aggregation function. The non-aggregating expression "id" is based on columns which are not participating in the GROUP BY clause. An valid aggregation query, for example, could be SELECT max(IF(id=1, 1, 0)) as max_id FROM test_bulk1 group by a. Invalid query should throw AnalysisException (https://github.com/opensearch-project/sql/issues/2764), and java.lang.UnsupportedOperationException: unsupported operator: if won't be thrown.
  2. Adding "fetch_size" in the query will trigger Cursor operation, but the query with group by as an aggregation statement hasn't supported PaginatedPlan yet. So it fallback to legacy engine which caused different behaviours. Supporting PaginatedPlan with aggregation could be discussed.