opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0

[Joins] Join Query DSL #15450

Open harshavamsi opened 2 months ago

harshavamsi commented 2 months ago

Is your feature request related to a problem? Please describe

Following on from #15185, we want to introduce the join DSL format that will be used to construct a join query. It will reuse the existing QueryBuilders in OpenSearch to parse the left and right queries, and we will add new logic to SearchSourceBuilder to support the new join field in the query DSL.

Describe the solution you'd like

The join field will be parsed by a new JoinBuilder in OpenSearch that will take in the following:

Full query DSL

{  
  "query": {  
    "bool": {  
      "filter": [  
        {  
          "range": {  
            "@timestamp": {  
              "gte": "now-1h"  
            }  
          }  
        },  
        {  
          "match": {  
            "message": "error"  
          }  
        }  
      ]  
    }  
  },  
  "fields": ["instance_id", "status_code"],  
  "join": {  
    "right_query": {  
      "index": "instance_details",
      "query": {
        "range": {
          "created_at": {
            "gte": "now-1y"
          }
        }
      },
      "fields": ["instance_id", "region"]
    },
    "type": "inner",
    "algorithm": "hash_join", // optional
    "condition": {
      "left_field": "instance_id",
      "right_field": "instance_id",
      "comparator": "="
    },  
    "fields": ["region", "status_code"],  
    "aggs": {  
      "by_region": {  
        "terms": {  
          "field": "region"  
        },  
        "aggs": {  
          "by_status_code": {  
            "terms": {  
              "field": "status_code"  
            },  
            "aggs": {  
              "status_code_count": {  
                "value_count": {  
                  "field": "status_code"  
                }  
              }  
            }  
          }  
        }  
      }  
    }  
  }  
}
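
To make the intended semantics of `type`, `algorithm`, and `condition` concrete, here is a minimal Python sketch of an inner hash join over two already-fetched result sets. It is only an illustration, not the proposed JoinBuilder implementation: the function name `hash_join`, the sample documents, and the region values are hypothetical. It also assumes that the join-level `fields` list projects the merged document and that left-hand values win on a field-name collision, neither of which is specified above.

```python
from collections import defaultdict
from typing import Any

Doc = dict[str, Any]


def hash_join(left_docs: list[Doc], right_docs: list[Doc], join: Doc) -> list[Doc]:
    """Illustrative inner hash join driven by a `join` block shaped like the DSL above."""
    if join.get("type", "inner") != "inner":
        raise ValueError("this sketch only handles inner joins")
    cond = join["condition"]
    if cond.get("comparator", "=") != "=":
        raise ValueError("a hash join needs an equality comparator")
    left_field, right_field = cond["left_field"], cond["right_field"]

    # Build phase: index the right-hand documents by their join key.
    buckets: dict[Any, list[Doc]] = defaultdict(list)
    for doc in right_docs:
        if right_field in doc:
            buckets[doc[right_field]].append(doc)

    # Probe phase: look up each left-hand document and merge every match.
    wanted = join.get("fields")  # assumed to project the joined document
    joined: list[Doc] = []
    for left in left_docs:
        if left_field not in left:
            continue  # inner join: documents without a key never match
        for right in buckets.get(left[left_field], []):
            merged = {**right, **left}  # assumption: left values win on collision
            if wanted is not None:
                merged = {k: merged[k] for k in wanted if k in merged}
            joined.append(merged)
    return joined


# Hypothetical hits from the left query and from "right_query".
left = [
    {"instance_id": "i-1", "status_code": 500},
    {"instance_id": "i-2", "status_code": 503},
    {"instance_id": "i-9", "status_code": 500},  # no matching instance
]
right = [
    {"instance_id": "i-1", "region": "us-east-1"},
    {"instance_id": "i-2", "region": "eu-west-1"},
]
join = {
    "type": "inner",
    "algorithm": "hash_join",
    "condition": {
        "left_field": "instance_id",
        "right_field": "instance_id",
        "comparator": "=",
    },
    "fields": ["region", "status_code"],
}
rows = hash_join(left, right, join)
```

With this sample data, `i-9` is dropped because it has no partner on the right side, and each remaining row carries only `region` and `status_code`. The `aggs` block in the DSL would then run over rows of this shape.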

Related component

Search:Query Capabilities

Describe alternatives you've considered

No response

Additional context

No response

smacrakis commented 2 months ago

Small comment: why do we speak of the left and right queries? In SQL, the left and right objects are normally called "tables". The result sets to join may be defined by table names or by subqueries. The usual equivalent of "table" in OpenSearch is "index", but that term is so overloaded that it's best avoided. Wouldn't it be clearest for people who are familiar with SQL to use the standard SQL terminology, namely tables?

bowenlan-amzn commented 1 month ago

I am working on this now as part of the join request/response workflow.