opensearch-project / opensearch-spark

Spark Accelerator framework; it enables secondary indices to remote data stores.
Apache License 2.0

Extract source table names from mv query #854

Closed by seankao-az 3 weeks ago

seankao-az commented 3 weeks ago

Description

Extract the source table names from the materialized view (mv) query and store them in the index metadata under _meta.properties.sourceTables.

The extraction itself was already introduced in #297, but it was used only for validation and the table names were not stored.
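To illustrate the idea of pulling table names out of an mv query, here is a minimal Python sketch. It is not the code in this PR: it uses a rough regex pass instead of Spark's query parser, and the helper name extract_source_tables is hypothetical. It will misfire on constructs such as CTE names or EXTRACT(YEAR FROM ts), so treat it only as a picture of the input and output.

```python
import re
from typing import List

# Hypothetical helper, not part of opensearch-spark.
# Matches a dotted identifier (optionally backtick-quoted) that follows FROM or JOIN.
_TABLE_REF = re.compile(
    r"\b(?:FROM|JOIN)\s+((?:`[^`]+`|\w+)(?:\.(?:`[^`]+`|\w+))*)",
    re.IGNORECASE,
)


def extract_source_tables(query: str) -> List[str]:
    """Return table names referenced after FROM/JOIN, in order, without duplicates."""
    tables: List[str] = []
    for match in _TABLE_REF.finditer(query):
        name = match.group(1).replace("`", "")  # drop backtick quoting
        if name not in tables:
            tables.append(name)
    return tables


if __name__ == "__main__":
    print(extract_source_tables(
        "SELECT status FROM myglue_test.default.http_logs"
    ))
```

A parser-based approach avoids the false positives noted above, which is why a regex is only suitable as an illustration here.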

Example

CREATE MATERIALIZED VIEW myglue_test.default.mv_test_source_tables
AS SELECT status FROM myglue_test.default.http_logs
WITH (
  auto_refresh = true,
  checkpoint_location = 's3://test/'
)

GET flint_myglue_test_default_mv_test_source_tables/_mappings
{
  "flint_myglue_test_default_mv_test_source_tables": {
    "mappings": {
      "_meta": {
        "latestId": "ZmxpbnRfbXlnbHVlX3Rlc3RfZGVmYXVsdF9tdl90ZXN0X3NvdXJjZV90YWJsZXM=",
        "kind": "mv",
        "indexedColumns": [
          {
            "columnType": "int",
            "columnName": "status"
          }
        ],
        "name": "myglue_test.default.mv_test_source_tables",
        "options": {
          "auto_refresh": "true",
          "refresh_interval": "5 minutes",
          "scheduler_mode": "external",
          "incremental_refresh": "false",
          "checkpoint_location": "s3://test/"
        },
        "source": "SELECT status FROM myglue_test.default.http_logs",
        "version": "0.6.0",
        "properties": {
          "sourceTables": [
            "myglue_test.default.http_logs"
          ],
          "env": {
            "SERVERLESS_EMR_VIRTUAL_CLUSTER_ID": "****",
            "SERVERLESS_EMR_JOB_ID": "****"
          }
        }
      },
      "properties": {
        "status": {
          "type": "integer"
        }
      }
    }
  }
}
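A consumer of this metadata could read the stored table names back from the _mappings response. The sketch below assumes the response body is already available as a JSON string (trimmed here to the relevant fields); the function name source_tables_from_mappings is hypothetical, and it falls back to an empty list for indices created before this field existed.

```python
import json
from typing import List

# Trimmed copy of the _mappings response shown above.
RESPONSE_BODY = """
{
  "flint_myglue_test_default_mv_test_source_tables": {
    "mappings": {
      "_meta": {
        "kind": "mv",
        "properties": {
          "sourceTables": ["myglue_test.default.http_logs"]
        }
      }
    }
  }
}
"""


def source_tables_from_mappings(body: str, index_name: str) -> List[str]:
    """Read _meta.properties.sourceTables for one index; empty list if absent."""
    mappings = json.loads(body)[index_name]["mappings"]
    properties = mappings.get("_meta", {}).get("properties", {})
    return list(properties.get("sourceTables", []))


if __name__ == "__main__":
    print(source_tables_from_mappings(
        RESPONSE_BODY, "flint_myglue_test_default_mv_test_source_tables"
    ))
```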

Related Issues

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following the Developer Certificate of Origin and signing off your commits, please check here.