opensearch-project / sql

Query your data using familiar SQL or intuitive Piped Processing Language (PPL)
https://opensearch.org/docs/latest/search-plugins/sql/index/
Apache License 2.0
116 stars 134 forks source link

[BUG] Runtime exception thrown if duplicate name in SELECT items #785

Open dai-chen opened 2 years ago

dai-chen commented 2 years ago

What is the bug? Currently duplicate name in SELECT statement is not supported and runtime exception is thrown at execution time.

How can one reproduce the bug? Steps to reproduce the behavior:

  1. Go to Query Workbench or DevTools
  2. Find any index
  3. Run query with duplicate name in SELECT
  4. See error as below
curl -XPOST "localhost:9200/_plugins/_sql" -H 'Content-Type: application/json' -d '
{
  "query": "SELECT region, region FROM testmv "
}
'
{
  "error": {
    "type": "IllegalArgumentException",
    "reason": "There was internal problem at backend",
    "details": "Multiple entries with same key: region=\"CA\" and region=\"CA\""
  },
  "status": 503
}

What is the expected behavior? Either semantic check performed and semantic exception thrown with clear error message Or support this case as other database if needed.

What is your host/environment?

Do you have any screenshots? N/A

Do you have any additional context? Tested MySQL and SQLServer. Both has same behavior as below:

CREATE TABLE Test(name VARCHAR(100));
INSERT INTO Test(name) VALUES('hello');

SELECT name, name FROM Test;

name | name
-- | --
hello | hello

Exception stacktrace:

Server side error during query execution
java.lang.IllegalArgumentException: Multiple entries with same key: region="CA" and region="CA"
    at com.google.common.collect.ImmutableMap.conflictException(ImmutableMap.java:376) ~[guava-31.0.1-jre.jar:?]
    at com.google.common.collect.ImmutableMap.checkNoConflict(ImmutableMap.java:370) ~[guava-31.0.1-jre.jar:?]
    at com.google.common.collect.RegularImmutableMap.checkNoConflictInKeyBucket(RegularImmutableMap.java:153) ~[guava-31.0.1-jre.jar:?]
    at com.google.common.collect.RegularImmutableMap.fromEntryArray(RegularImmutableMap.java:115) ~[guava-31.0.1-jre.jar:?]
    at com.google.common.collect.ImmutableMap$Builder.buildOrThrow(ImmutableMap.java:574) ~[guava-31.0.1-jre.jar:?]
    at com.google.common.collect.ImmutableMap$Builder.build(ImmutableMap.java:538) ~[guava-31.0.1-jre.jar:?]
    at org.opensearch.sql.planner.physical.ProjectOperator.next(ProjectOperator.java:82) ~[core-2.2.0.0-SNAPSHOT.jar:?]
    at org.opensearch.sql.planner.physical.ProjectOperator.next(ProjectOperator.java:28) ~[core-2.2.0.0-SNAPSHOT.jar:?]
    at org.opensearch.sql.opensearch.executor.OpenSearchExecutionEngine.lambda$execute$0(OpenSearchExecutionEngine.java:40) [opensearch-2.2.0.0-SNAPSHOT.jar:?]
    at org.opensearch.sql.opensearch.client.OpenSearchNodeClient.lambda$withCurrentContext$5(OpenSearchNodeClient.java:197) [opensearch-2.2.0.0-SNAPSHOT.jar:?]
    at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:747) [opensearch-2.2.0.jar:2.2.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
    at java.lang.Thread.run(Thread.java:829) [?:?]
Yury-Fridlyand commented 2 years ago

Just faced this in another context:

opensearchsql> select 1+1, 1+1;
TransportError(500, 'IllegalArgumentException', {'error': {'type': 'IllegalArgumentException', 'reason': 'There was internal problem at backend', 'details': 'Multiple entries with same key: 1+1=2 and 1+1=2'}, 'status': 503})

With the same stacktrace.

dai-chen commented 2 years ago

Just faced this in another context:

opensearchsql> select 1+1, 1+1;
TransportError(500, 'IllegalArgumentException', {'error': {'type': 'IllegalArgumentException', 'reason': 'There was internal problem at backend', 'details': 'Multiple entries with same key: 1+1=2 and 1+1=2'}, 'status': 503})

With the same stacktrace.

Yes, I think it's the same cause here. Not sure what's use case. But I think we'd better to throw semantic exception with clear message.

Yury-Fridlyand commented 2 years ago

I think we should throw no exception.

mysql> select 1+1, 1+1;
+-----+-----+
| 1+1 | 1+1 |
+-----+-----+
|   2 |   2 |
+-----+-----+
1 row in set (0.00 sec)

Exception happens in ImmutableMap.Builder<String, ExprValue>::build() in ProjectOperator::next : https://github.com/opensearch-project/sql/blob/main/core/src/main/java/org/opensearch/sql/planner/physical/ProjectOperator.java#L55-L83 Maybe using a Multimap will help us, but this require changes in 20+ other files. Note, this function also rereferred in https://github.com/opensearch-project/sql/issues/458.

Yury-Fridlyand commented 2 years ago

Just checked a PoC by adding a zero-based counter to the map keys, it works:

mapBuilder.put(expr.getNameOrAlias() + cnt++, exprValue);

It is not returned by user (why? how?), but we need more investigation. Maybe this fix will be easier.

curl -s -XPOST "http://localhost:9200/_plugins/_sql" -H 'Content-Type: application/json' -d "{\"query\": \"SELECT 1+1, 1+1\"}"
{
  "schema": [
    {
      "name": "1+1",
      "type": "integer"
    },
    {
      "name": "1+1",
      "type": "integer"
    }
  ],
  "datarows": [
    [
      2,
      2
    ]
  ],
  "total": 1,
  "size": 1,
  "status": 200
}
dai-chen commented 2 years ago

I think we should throw no exception.

mysql> select 1+1, 1+1;
+-----+-----+
| 1+1 | 1+1 |
+-----+-----+
|   2 |   2 |
+-----+-----+
1 row in set (0.00 sec)

Exception happens in ImmutableMap.Builder<String, ExprValue>::build() in ProjectOperator::next : https://github.com/opensearch-project/sql/blob/main/core/src/main/java/org/opensearch/sql/planner/physical/ProjectOperator.java#L55-L83 Maybe using a Multimap will help us, but this require changes in 20+ other files. Note, this function also rereferred in #458.

Yes, I tested other database and they have no problem with such kind of query. I think we can add the support if this can be quick fixed, otherwise just improve the error.

MaxKsyunz commented 2 years ago

As a work around, aliasing the duplicate expressions works -- select 1+1 as A, 1+1 as B

This will be relevant for JOIN support (#49) because the joined tables can have columns with the same name.

IIRC, SQL Server resolved this by appending _<number> to each column name. We can do the same -- if we do not disambiguate non-unique expression, we are forcing client software to deal with that.