opensearch-project / opensearch-spark

Spark Accelerator framework; it enables secondary indices to remote data stores.
Apache License 2.0
14 stars, 20 forks

[FEATURE] Support expression in indexed column #204

Open dai-chen opened 8 months ago

dai-chen commented 8 months ago

Is your feature request related to a problem?

Currently, only a column name is accepted as an indexed column for both skipping and covering indexes. In certain cases, users may want to transform the column value before indexing.

What solution would you like?

Support expression in indexed column:

CREATE SKIPPING INDEX ON test
( expr(col) VALUE_SET ... )

dai-chen commented 8 months ago

This may be useful for certain use cases such as IP addresses: https://github.com/opensearch-project/opensearch-spark/issues/203. For such specific fields, users can apply a cast or transform before indexing them, for instance via a value set or bloom filter.
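To illustrate the idea, here is a minimal sketch in plain Python of a value set built over a transformed column. It is not the Flint implementation: the `subnet24` transform, the helper names, and the in-memory "files" are all hypothetical, and the sketch assumes the query filters on the same expression that was indexed.

```python
import ipaddress


def subnet24(ip: str) -> str:
    """Hypothetical transform: map an IPv4 address string to its /24 network."""
    return str(ipaddress.ip_network(f"{ip}/24", strict=False))


def build_value_sets(files, transform):
    """For each file, collect the set of transformed column values."""
    return {name: {transform(v) for v in values} for name, values in files.items()}


def files_to_scan(value_sets, transform, literal):
    """Keep only files whose value set contains the transformed query literal."""
    key = transform(literal)
    return sorted(name for name, values in value_sets.items() if key in values)


# Hypothetical source files, each holding raw values of an IP address column.
files = {
    "part-0": ["10.0.0.1", "10.0.0.7"],
    "part-1": ["192.168.1.5"],
    "part-2": ["10.0.1.9", "192.168.1.77"],
}

index = build_value_sets(files, subnet24)

# Equivalent in spirit to filtering on subnet24(ip) = subnet24('192.168.1.200').
print(files_to_scan(index, subnet24, "192.168.1.200"))  # ['part-1', 'part-2']
```

Indexing the transformed value keeps each value set small (one entry per subnet rather than one per address), at the cost that only queries written against the same expression can use the index for skipping.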

dai-chen commented 1 month ago

Covering indexes may also have the same requirement.