opensearch-project / sql

Query your data using familiar SQL or intuitive Piped Processing Language (PPL)
https://opensearch.org/docs/latest/search-plugins/sql/index/
Apache License 2.0

[BUG] SQL and PPL responses different types for the same things #1296

Open Yury-Fridlyand opened 1 year ago

Yury-Fridlyand commented 1 year ago

What is the bug?

PPL reports the string data type for all string columns. SQL reports text for text fields and keyword for all other string fields.

text and keyword are not the same thing, and a user should be able to distinguish between them.

How can one reproduce the bug?

Use the sample data from the integration tests, for example the bank index. Check the mapping for gender and for city:

https://github.com/opensearch-project/sql/blob/a4f80663ecdd8cc7403d42cd7fdce06b0ccbd1fd/integ-test/src/test/resources/indexDefinitions/bank_index_mapping.json#L31-L32
https://github.com/opensearch-project/sql/blob/a4f80663ecdd8cc7403d42cd7fdce06b0ccbd1fd/integ-test/src/test/resources/indexDefinitions/bank_index_mapping.json#L19-L20

Then compare the types reported by PPL:

$ curl -s -XPOST http://localhost:9200/_plugins/_ppl -H 'Content-Type: application/json' -d '{"query": "source=bank | fields gender, city"}' | grep '"type"'
      "type": "string"
      "type": "string"

And the types reported by SQL:

$ curl -s -XPOST http://localhost:9200/_plugins/_sql -H 'Content-Type: application/json' -d '{"query": "select gender, city from bank"}' | grep '"type"'
      "type": "text"
      "type": "keyword"
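
The two curl calls above can also be compared programmatically. The following is a minimal sketch, not part of the plugin: it assumes both endpoints return a JSON body with a `schema` array of `{"name", "type"}` objects (which is what the `grep '"type"'` output suggests), and the helper names `fetch`, `schema_types`, `type_mismatches`, and `main` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # same local cluster as in the curl commands


def fetch(endpoint, query):
    """POST a query to /_plugins/<endpoint> and return the parsed JSON body."""
    request = urllib.request.Request(
        f"{BASE_URL}/_plugins/{endpoint}",
        data=json.dumps({"query": query}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def schema_types(body):
    """Map column name -> reported type (assumes a `schema` list of name/type objects)."""
    return {column["name"]: column["type"] for column in body.get("schema", [])}


def type_mismatches(ppl_body, sql_body):
    """Return {column: (ppl_type, sql_type)} for shared columns whose types differ."""
    ppl, sql = schema_types(ppl_body), schema_types(sql_body)
    return {
        name: (ppl[name], sql[name])
        for name in ppl.keys() & sql.keys()
        if ppl[name] != sql[name]
    }


def main():
    """Run the same two queries as the curl commands and print any differences."""
    ppl_body = fetch("_ppl", "source=bank | fields gender, city")
    sql_body = fetch("_sql", "select gender, city from bank")
    mismatches = type_mismatches(ppl_body, sql_body)
    for column, (ppl_type, sql_type) in sorted(mismatches.items()):
        print(f"{column}: PPL={ppl_type} SQL={sql_type}")
```

Calling `main()` against a running cluster with the bank index loaded should list every column for which the two endpoints disagree; `type_mismatches` itself works on any two already-parsed responses.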

What is the expected behavior?

The returned type should be the same for PPL and SQL, and it should match the index mapping for every column defined in the index.
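
One way to express this expectation as a check is sketched below. It is only an illustration under stated assumptions: the mapping is taken to have the shape returned by `GET /<index>/_mapping` (`{index: {"mappings": {"properties": {...}}}}`), the query response is taken to carry a `schema` array of `{"name", "type"}` objects, and the function names `mapping_types` and `columns_not_matching_mapping` are hypothetical.

```python
def mapping_types(mapping_body, index):
    """Map top-level field name -> mapped type from a GET /<index>/_mapping body.

    Fields without an explicit "type" (for example object fields) are skipped.
    """
    properties = mapping_body[index]["mappings"].get("properties", {})
    return {name: spec["type"] for name, spec in properties.items() if "type" in spec}


def columns_not_matching_mapping(query_body, mapping_body, index):
    """Return {column: (reported_type, mapped_type)} where the two disagree.

    Columns that do not appear in the mapping (aliases, expressions) are ignored.
    """
    mapped = mapping_types(mapping_body, index)
    return {
        column["name"]: (column["type"], mapped[column["name"]])
        for column in query_body.get("schema", [])
        if column["name"] in mapped and column["type"] != mapped[column["name"]]
    }
```

With a mapping in which gender is text and city is keyword, a response that reports string for both columns would be flagged for both, while a response that reports text and keyword would produce an empty result.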

What is your host/environment?

2.x @ 662a9383e

Do you have any additional context?

Related to #1038 and to https://github.com/opensearch-project/observability/issues/1392

Yury-Fridlyand commented 1 year ago

Some technical details. On response serialization

acarbonetto commented 1 year ago

Please also make sure that https://github.com/opensearch-project/sql/blob/main/docs/user/general/datatypes.rst#data-types-mapping and https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/general/datatypes.rst#data-types-mapping are consistent.