opensearch-project / sql

Query your data using familiar SQL or intuitive Piped Processing Language (PPL)
https://opensearch.org/docs/latest/search-plugins/sql/index/
Apache License 2.0
116 stars 134 forks source link

[Discussion] SQL dialect #855

Open Yury-Fridlyand opened 1 year ago

Yury-Fridlyand commented 1 year ago

OpenSearch SQL extends ANSI SQL. I propose not to create a new dialect, but align with existing one. Some features already follow MySQL syntax and implementation. MySQL was selected because it is one of the most popular. The discussion in this thread should confirm (or not) following MySQL dialect. Any procs, cons, ideas, objectives. To finalize this task we need to update documentation and readme files.

Yury-Fridlyand commented 1 year ago

See also: #852

MaxKsyunz commented 1 year ago

Thank you @Yury-Fridlyand for starting this discussion -- it's important one to have and document.

In my view, SQL plugin should be consistent with other parts of OpenSearch and development ecosystem first and foremost. In particular,

  1. Any query written in SQL and equivalent OpenSearch DSL should produce the same result.
  2. Implementation should make sense in the context of Java.

Violating either of these will be a source of bugs for us and create pitfalls for end-users and application developers.

852 is a good example of this. To support TIME values > 24 hours we will have to add a custom data type instead of using java.sql.Time. It will force JDBC clients to special case such fields as well. That's all error-prone and expensive, what use case does TIME > 24 hours enable?

From another angle, OpenSearch SQL is already meaningfully incompatible[*] with MySQL. Given that, let's focus on expanding the feature set as easily as we can in a consistent manner.

[*] There's a valid OpenSearch SQL query that is invalid MySQL query and vice versa -- there's a valid MySQL query that is invalid OpenSearch SQL.

Yury-Fridlyand commented 1 year ago

We had a discussion outside of GH and confirmed that parser and constructor-like functions (TIME, DATE, DATETIME, TIMESTAMP) should throw an exception on invalid string input, instead of returning NULL. Even though we are trying to align with MySQL, this behavior contradicts MySQL.

image