opensearch-project / opensearch-spark

Spark Accelerator framework; it enables secondary indices on remote data stores.
Apache License 2.0

[FEATURE] Extra options should offer a more explicit method for users to specify the table identifier. #873

Open penghuo opened 2 weeks ago

penghuo commented 2 weeks ago

Is your feature request related to a problem? If a CREATE MATERIALIZED VIEW statement reads from the db.default.tbl-001 table and the user wants to pass start_timestamp as an option, the table name must be enclosed in backticks in extra_options. Without the backticks, the options are not passed correctly. The root cause is that when UnresolvedRelation.getTableName is called, any identifier part containing special characters is automatically quoted, so the table is looked up as db.default.`tbl-001` and an unquoted key does not match.

CREATE MATERIALIZED VIEW `db`.`default`.`mv` AS SELECT * FROM `db`.`default`.`tbl-001` WITH ( auto_refresh = true, refresh_interval = '5 Minute', extra_options = '{"db.default.`tbl-001`": {"start_timestamp": "1729493689000"}}' )
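
To make the mismatch concrete, here is a minimal Python sketch of the quoting behavior described above. It is an approximation for illustration only, not the project's or Spark's actual code: `quote_if_needed` and `table_name` are hypothetical helpers, and the rule "quote any part that is not purely letters, digits, and underscores" is an assumption about how special characters are detected.

```python
import json
import re


def quote_if_needed(part: str) -> str:
    """Hypothetical rule: keep plain identifier parts, backtick-quote the rest."""
    if re.fullmatch(r"[A-Za-z0-9_]+", part) and not part.isdigit():
        return part
    # Escape embedded backticks by doubling them, then wrap in backticks.
    return "`" + part.replace("`", "``") + "`"


def table_name(parts):
    """Join identifier parts with dots, quoting only the parts that need it."""
    return ".".join(quote_if_needed(p) for p in parts)


# Name derived from the FROM clause `db`.`default`.`tbl-001`
derived_key = table_name(["db", "default", "tbl-001"])

# A user who writes the key without backticks gets no match ...
unquoted = json.loads('{"db.default.tbl-001": {"start_timestamp": "1729493689000"}}')
missed = unquoted.get(derived_key)

# ... while the partially quoted key from the statement above matches.
quoted = json.loads('{"db.default.`tbl-001`": {"start_timestamp": "1729493689000"}}')
found = quoted.get(derived_key)

print(derived_key, missed, found)
```

Under these assumptions, only the part containing the hyphen is quoted, which is why the user currently has to write exactly that mixed form in extra_options.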

What solution would you like? A more explicit method is needed for users to specify the table identifier. The expected behavior is that whatever the user specifies in the FROM clause should be used directly in extra_options. For example, in the case above, the user should specify the identifier as written in the FROM clause.

extra_options = '{"`db`.`default`.`tbl-001`": {"start_timestamp": "1729493689000"}}'
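
One possible way to get this behavior, sketched in Python under stated assumptions, is to compare identifiers by their parsed parts rather than by their quoted string form, so the key written as in the FROM clause matches regardless of which parts are backtick-quoted. `parse_identifier` and `find_table_options` are hypothetical names introduced here for illustration; this is not the project's implementation.

```python
import json


def parse_identifier(text: str) -> tuple:
    """Split a dotted identifier into parts, honoring backtick quoting."""
    parts, buf, quoted, i = [], [], False, 0
    while i < len(text):
        ch = text[i]
        if quoted:
            if ch == "`":
                # A doubled backtick inside quotes is an escaped backtick.
                if i + 1 < len(text) and text[i + 1] == "`":
                    buf.append("`")
                    i += 2
                    continue
                quoted = False
            else:
                buf.append(ch)
        elif ch == "`":
            quoted = True
        elif ch == ".":
            parts.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
        i += 1
    if quoted:
        raise ValueError(f"unterminated backtick in identifier: {text!r}")
    parts.append("".join(buf))
    return tuple(parts)


def find_table_options(extra_options: dict, from_identifier: str):
    """Return the options whose key names the same table as from_identifier."""
    target = parse_identifier(from_identifier)
    for key, options in extra_options.items():
        if parse_identifier(key) == target:
            return options
    return None


extra_options = json.loads(
    '{"`db`.`default`.`tbl-001`": {"start_timestamp": "1729493689000"}}'
)
options = find_table_options(extra_options, "db.default.`tbl-001`")
print(options)
```

Matching on parsed parts is a design choice that also accepts the currently required form, so existing extra_options values would keep working.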

What alternatives have you considered? n/a
