opensearch-project / opensearch-spark

Spark Accelerator framework; it enables secondary indices to remote data stores.
Apache License 2.0

[FEATURE] Enhance Alter index statement to support schema evolution #387

Open dai-chen opened 1 week ago

dai-chen commented 1 week ago

Is your feature request related to a problem?

Currently, the ALTER INDEX statement in Flint only supports changing index options, such as the index refresh mode (see the documentation: ALTER INDEX Options). For table formats like Iceberg, which support schema evolution, users have no way to add more indexed columns without re-creating the index.

What solution would you like?

Introduce support for schema changes in the ALTER INDEX statement, similar to what the ALTER TABLE statement provides. This would include:

  1. ALTER SKIPPING INDEX ON <tableName> ADD COLUMN ...
  2. ALTER INDEX <indexName> ON <tableName> ADD COLUMN ...

What alternatives have you considered?

The only current alternative is for users to re-create the index whenever they need to add more indexed columns.

Do you have any additional context?

Note that this change affects the correctness of the query rewriter, because columns newly added to a Flint index will be missing from previously indexed data:
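
As a rough illustration of that concern, here is a toy Python model of a skipping index. It is not Flint's rewriter or its index layout: the per-file `(min, max)` entries, the file names, the `status` column, and both functions are hypothetical, chosen only to show why a file indexed before a column was added cannot simply be skipped.

```python
from typing import Dict, List, Tuple

# Toy skipping index (hypothetical layout, not Flint's actual format):
# one entry per source file, mapping an indexed column to the (min, max)
# of that column's values in the file.
IndexEntry = Dict[str, Tuple[int, int]]

index: Dict[str, IndexEntry] = {
    # Indexed before the hypothetical ADD COLUMN on "status": no stats for it.
    "file_old.parquet": {"year": (2023, 2023)},
    # Indexed afterwards: stats exist for both columns.
    "file_new.parquet": {"year": (2024, 2024), "status": (200, 204)},
}


def naive_files_to_scan(column: str, value: int) -> List[str]:
    """Keep a file only if its recorded range contains the value.

    A file with no entry for the column is skipped, which is unsafe:
    the file may still contain matching rows.
    """
    result = []
    for name, entry in index.items():
        stats = entry.get(column)
        if stats is not None and stats[0] <= value <= stats[1]:
            result.append(name)
    return result


def safe_files_to_scan(column: str, value: int) -> List[str]:
    """Skip a file only if its recorded range excludes the value.

    A file with no entry for the column is always kept, because the
    index says nothing about it.
    """
    result = []
    for name, entry in index.items():
        stats = entry.get(column)
        if stats is None or stats[0] <= value <= stats[1]:
            result.append(name)
    return result
```

For a filter such as `status = 500`, the naive version scans no files at all, while the safe version still scans `file_old.parquet`. If that file does hold rows with status 500, only the safe version returns them.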