opensearch-project / sql

Query your data using familiar SQL or intuitive Piped Processing Language (PPL)
https://opensearch.org/docs/latest/search-plugins/sql/index/
Apache License 2.0
120 stars 139 forks source link

[FEATURE] Make SQL align with PPL to support evaluate nested fields and update fields. #2933

Open qianheng-aws opened 3 months ago

qianheng-aws commented 3 months ago

Is your feature request related to a problem? Currently in OpenSearch, SQL doesn't support update or evaluate nested fields like PPL. To make them align, we may need to make SQL support these features as well.

e.g.

  1. Update fields PPL supports updating the fields while SQL produces a new field with the same field_name which may leads to error.
    
    # PPL
    POST _plugins/_ppl/_explain
    {
    "query": """
    source=opensearch_dashboards_sample_data_flights | eval FlightTimeMin = 1
    """
    }

response

{ "root": { "name": "ProjectOperator", "description": { "fields": "[FlightNum, Origin, OriginLocation, DestLocation, FlightDelay, DistanceMiles, FlightTimeMin, OriginWeather, dayOfWeek, AvgTicketPrice, Carrier, FlightDelayMin, OriginRegion, DestAirportID, FlightDelayType, timestamp, Dest, FlightTimeHour, Cancelled, DistanceKilometers, OriginCityName, DestWeather, OriginCountry, DestCountry, DestRegion, DestCityName, OriginAirportID]" }, "children": [ { "name": "EvalOperator", "description": { "expressions": { "FlightTimeMin": "1" } }, "children": [ { "name": "OpenSearchIndexScan", "description": { "request": """OpenSearchQueryRequest(indexName=opensearch_dashboards_sample_data_flights, sourceBuilder={"from":0,"size":10000,"timeout":"1m"}, searchDone=false)""" }, "children": [] } ] } ] } }

SQL

POST _plugins/_sql { "query": "SELECT *, 1 as FlightTimeMin FROM opensearch_dashboards_sample_data_flights LIMIT 5" }

response error

"{\n \"error\": {\n \"reason\": \"Invalid SQL query\",\n \"details\": \"Multiple entries with same key: FlightTimeMin=1 and FlightTimeMin=1030.7704\",\n \"type\": \"IllegalArgumentException\"\n },\n \"status\": 400\n}"


3. Evaluate nested fields
PPL supports evaluating fields from left to right while SQL cannot resolve the reference.

PPL

POST _plugins/_ppl/_explain { "query": """ source=opensearch_dashboards_sample_data_flights | eval FlightTimeMinPlusOne = FlightTimeMin + 1, FlightTimeMinPlusTwo = FlightTimeMinPlusOne + 1 | fields FlightTimeMinPlusOne, FlightTimeMinPlusTwo """ }

response

{ "root": { "name": "ProjectOperator", "description": { "fields": "[FlightTimeMinPlusOne, FlightTimeMinPlusTwo]" }, "children": [ { "name": "EvalOperator", "description": { "expressions": { "FlightTimeMinPlusTwo": "+(FlightTimeMinPlusOne, 1)", "FlightTimeMinPlusOne": "+(FlightTimeMin, 1)" } }, "children": [ { "name": "OpenSearchIndexScan", "description": { "request": """OpenSearchQueryRequest(indexName=opensearch_dashboards_sample_data_flights, sourceBuilder={"from":0,"size":10000,"timeout":"1m"}, searchDone=false)""" }, "children": [] } ] } ] } }

SQL

POST _plugins/_sql/_explain { "query": "SELECT (FlightTimeMin + 1) as FlightTimeMinPlusOne, (FlightTimeMinPlusOne + 1) as FlightTimeMinPlusTwo FROM opensearch_dashboards_sample_data_flights LIMIT 5" }

response error

"{\n \"error\": {\n \"reason\": \"Invalid SQL query\",\n \"details\": \"can't resolve Symbol(namespace=FIELD_NAME, name=FlightTimeMinPlusOne) in type env\",\n \"type\": \"SemanticCheckException\"\n },\n \"status\": 400\n}"



**What solution would you like?**
These 2 features are all supported in EvalOperator. But when compiling SQL, it won't use EvalOperator now but only ProjectOperator, which contributes to the differences above. 

To make them align,  there could be 2 optional solutions:
1. Make SQL generates EvalOperator as well like PPL
2. Enhance ProjectOperator to support these 2 features.

**What alternatives have you considered?**
A clear and concise description of any alternative solutions or features you've considered.

**Do you have any additional context?**
Add any other context or screenshots about the feature request here.
dblock commented 2 months ago

[Weekly Catch All Triage - 1]