opensearch-project / sql

Query your data using familiar SQL or intuitive Piped Processing Language (PPL)
https://opensearch.org/docs/latest/search-plugins/sql/index/
Apache License 2.0
121 stars 140 forks source link

[BUG] ORDER BY on array values produces incorrect results #3133

Open normanj-bitquill opened 4 weeks ago

normanj-bitquill commented 4 weeks ago

What is the bug? When a field that contains array values is used in an ORDER BY clause, the comparisons are performed using only the first element of the array values.

Consider an index with the following data:

{1, [1, 2]}
{2, [3, 4]}
{3, [1, 5]}
{4, [1, 2]}
{5, [2, 3]}

and this query:

SELECT x, y ORDER BY y;
{1, [1, 2]}
{3, [1, 5]}
{4, [1, 2]}
{5, [2, 3]}
{2, [3, 4]}

the expected results are:

{1, [1, 2]}
{4, [1, 2]}
{3, [1, 5]}
{5, [2, 3]}
{2, [3, 4]}

How can one reproduce the bug? Steps to reproduce the behavior:

  1. Create a new index and load the data above
  2. Run the query above on the index

What is the expected behavior? Comparisons are performed on the array values as a whole. An array comparison is performed element by element until a difference is found.

What is your host/environment?

Do you have any screenshots? N/A

Do you have any additional context? Issue #1300 had a change recently merged in that allows array values to be used in query evaluation and in the result set.

normanj-bitquill commented 3 weeks ago

It looks like the OpenSearch engine is doing the sorting, not the SQL plugin.

Using the index and data from the description:

POST test3/_search
{
  "size": 10,
  "sort": {
    "y": {
      "order": "asc"
    }
  }
}

Produces this response:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 5,
      "relation": "eq"
    },
    "max_score": null,
    "hits": [
      {
        "_index": "test3",
        "_id": "ATXWtZIB72QJDYYHLMzM",
        "_score": null,
        "_source": {
          "x": 1,
          "y": [
            1,
            2
          ]
        },
        "sort": [
          1
        ]
      },
      {
        "_index": "test3",
        "_id": "AzXWtZIB72QJDYYHqczI",
        "_score": null,
        "_source": {
          "x": 3,
          "y": [
            1,
            5
          ]
        },
        "sort": [
          1
        ]
      },
      {
        "_index": "test3",
        "_id": "BDXWtZIB72QJDYYHy8wa",
        "_score": null,
        "_source": {
          "x": 4,
          "y": [
            1,
            2
          ]
        },
        "sort": [
          1
        ]
      },
      {
        "_index": "test3",
        "_id": "Q6i4xJIBxt0sLj_lfLXQ",
        "_score": null,
        "_source": {
          "x": 5,
          "y": [
            2,
            3
          ]
        },
        "sort": [
          2
        ]
      },
      {
        "_index": "test3",
        "_id": "AjXWtZIB72QJDYYHXszf",
        "_score": null,
        "_source": {
          "x": 2,
          "y": [
            3,
            4
          ]
        },
        "sort": [
          3
        ]
      }
    ]
  }
}
andrross commented 1 week ago

[Catch All Triage - 1, 2, 3, 4, 5]