opensearch-project / sql

Query your data using familiar SQL or intuitive Piped Processing Language (PPL)
https://opensearch.org/docs/latest/search-plugins/sql/index/
Apache License 2.0
117 stars 134 forks source link

Bug Fixes for minor issues with SQL PIT refactor #3045

Closed manasvinibs closed 5 days ago

manasvinibs commented 6 days ago

Description

Issue found during E2E testing:

  1. If join queries fail to pull results due to internal errors in the service. The API fails with a Null Pointer Exception Reproducible if you force the OpenSearch client to throw an exception while executing a join.
    
    Before:

curl -X POST "http://localhost:9200/_plugins/_sql" -H 'Content-Type: application/json' -d'

{ "query": "SELECT t1.field_1, t2.field_3 FROM testpit t1 INNER JOIN testjoin t2 ON t1.field_2 = t2.field_4" } ' { "error": { "reason": "There was internal problem at backend", "details": "Cannot invoke \"java.util.List.size()\" because \"this.results\" is null", "type": "NullPointerException" }, "status": 500 }

After fix:

curl -X POST "http://localhost:9200/_plugins/_sql" -H 'Content-Type: application/json' -d' { "query": "SELECT t1.field_1, t2.field_3 FROM testpit t1 INNER JOIN testjoin t2 ON t1.field_2 = t2.field_4" } ' { "error": { "reason": "There was internal problem at backend", "details": "Error occurred during join query run", "type": "IllegalStateException" }, "status": 500 }

2. Sometimes, a cursor seems to be returned even when one is not needed eg: select a.* from my-index a  returns a cursor but select * from my-index a does not. 

**After fix:** 

curl -X POST "http://localhost:9200/_plugins/_sql" -H 'Content-Type: application/json' -d' { "query": "SELECT a.* FROM testpit a" } '

Response: { "schema": [ { "name": "field_1", "type": "text" }, { "name": "field_2", "type": "integer" } ], "total": 10, "datarows": [ [ "Sample data 1", 1 ], [ "Sample data 2", 2 ], [ "Sample data 3", 3 ], [ "Sample data 4", 4 ], [ "Sample data 5", 5 ], [ "Sample data 6", 6 ], [ "Sample data 7", 7 ], [ "Sample data 8", 8 ], [ "Sample data 9", 9 ], [ "Sample data 10", 10 ] ], "size": 10, "status": 200 }

curl -X POST "http://localhost:9200/_plugins/_sql" -H 'Content-Type: application/json' -d' { "query": "SELECT a.* FROM testpit a" } '

Response: { "schema": [ { "name": "field_1", "type": "text" }, { "name": "field_2", "type": "integer" } ], "total": 10, "datarows": [ [ "Sample data 1", 1 ], [ "Sample data 2", 2 ], [ "Sample data 3", 3 ], [ "Sample data 4", 4 ], [ "Sample data 5", 5 ], [ "Sample data 6", 6 ], [ "Sample data 7", 7 ], [ "Sample data 8", 8 ], [ "Sample data 9", 9 ], [ "Sample data 10", 10 ] ], "size": 10, "status": 200 }

3. The api seems to have an unsupported "size" parameter where it generates a cursor. The cursor is unusable, attempting to use it causes an exception.
Reproducible by specifying a "size" parameter in the request body in place of "fetch_size".

After fix, response doesn't contain incorrect default cursor in the response. It just ignores the unsupported fields in the query which is existing behavior.

curl -X POST "http://localhost:9200/_plugins/_sql" -H 'Content-Type: application/json' -d' { "query": "SELECT field_1 FROM testpit WHERE field_2 > 0", "size": 5 } ' { "schema": [{ "name": "field_1", "type": "text" }], "total": 10, "datarows": [ ["Sample data 1"], ["Sample data 2"], ["Sample data 3"], ["Sample data 4"], ["Sample data 5"], ["Sample data 6"], ["Sample data 7"], ["Sample data 8"], ["Sample data 9"], ["Sample data 10"] ], "size": 10, "status": 200 }

4. From the testing, it appears the v2 engine always appears to always create a PIT context even when one is not needed (could explain 3)

This was addressed as part of refactorign PR only in the revised iterations. Here is the code ref for v2 PIT changes where PIT is created only when Scroll was used before. https://github.com/opensearch-project/sql/blob/main/opensearch/src/main/java/org/opensearch/sql/opensearch/request/OpenSearchRequestBuilder.java#L124


<!-- List any other related issues here -->

### Check List
- [X] New functionality includes testing.
- [X] Commits are signed per the DCO using `--signoff`.
- [X] Public documentation issue/PR [created](https://github.com/opensearch-project/documentation-website/issues/new/choose).

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check [here](https://github.com/opensearch-project/sql/blob/main/CONTRIBUTING.md#developer-certificate-of-origin).