apache / hudi

Upserts, Deletes And Incremental Processing on Big Data.
https://hudi.apache.org/
Apache License 2.0

[HUDI-8534] Ensure index creation is idempotent in face of failures #12308

Open lokeshj1703 opened 11 hours ago

lokeshj1703 commented 11 hours ago

Change Logs

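We need to ensure that a user can run CREATE INDEX with the same index name repeatedly, failing any number of times, and still succeed eventually once the syntax and parameters are correct. In the session below, the second attempt passes bloom_filters, yet the error still reports the value from the first attempt (bloom_filter).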

spark-sql (default)> create index idx_bloom on hudi_table using bloom_filter(city) options(func='lower');

24/11/16 09:58:44 ERROR SparkSQLDriver: Failed in [create index idx_bloom on hudi_table using bloom_filter(city) options(func='lower')]
java.lang.IllegalArgumentException: The value of hoodie.functional.index.type should be one of COLUMN_STATS,BLOOM_FILTERS,SECONDARY_INDEX, but was bloom_filter

spark-sql (default)> create index idx_bloom on hudi_table using bloom_filters(city) options(func='lower');
24/11/16 09:59:11 ERROR SparkSQLDriver: Failed in [create index idx_bloom on hudi_table using bloom_filters(city) options(func='lower')]
java.lang.IllegalArgumentException: The value of hoodie.functional.index.type should be one of COLUMN_STATS,BLOOM_FILTERS,SECONDARY_INDEX, but was bloom_filter

This PR ensures that the metadata for an index is deleted if its creation fails. When creation fails, we call drop on the corresponding index, which removes any metadata that was written for it.
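The cleanup-on-failure pattern described above can be illustrated with a small, self-contained sketch. This is not Hudi code: `IndexCatalog`, `create_index`, `drop_index`, and the `cleanup_on_failure` flag are hypothetical names, and the sketch assumes that an index definition is persisted before the index is built and that an existing definition with the same name is reused. That assumption is one plausible reading of the log above, not something the PR description confirms.

```python
VALID_INDEX_TYPES = {"COLUMN_STATS", "BLOOM_FILTERS", "SECONDARY_INDEX"}


class IndexCatalog:
    """Hypothetical in-memory stand-in for a table's index metadata."""

    def __init__(self):
        self.definitions = {}  # index name -> index type as persisted

    def drop_index(self, name):
        # Dropping is a no-op when nothing was persisted,
        # so it is safe to call after any failure.
        self.definitions.pop(name, None)

    def _build(self, name):
        # Building reads the persisted definition (assumption),
        # so a stale entry also breaks later, corrected attempts.
        persisted = self.definitions[name]
        if persisted.upper() not in VALID_INDEX_TYPES:
            raise ValueError(
                f"index type should be one of {sorted(VALID_INDEX_TYPES)}, "
                f"but was {persisted}"
            )

    def create_index(self, name, index_type, cleanup_on_failure=True):
        # Persist the definition first, keeping any existing one (assumption).
        self.definitions.setdefault(name, index_type)
        try:
            self._build(name)
        except Exception:
            if cleanup_on_failure:
                # Remove whatever the failed attempt left behind.
                self.drop_index(name)
            raise


def attempt(catalog, name, index_type, **kwargs):
    """Run one CREATE INDEX attempt and report whether it succeeded."""
    try:
        catalog.create_index(name, index_type, **kwargs)
        return "ok"
    except ValueError:
        return "failed"


# Without cleanup, the stale definition makes the corrected retry fail too.
stale = IndexCatalog()
results_without = [
    attempt(stale, "idx_bloom", t, cleanup_on_failure=False)
    for t in ("bloom_filter", "bloom_filters")
]

# With cleanup, the failed attempt leaves nothing behind and the retry succeeds.
clean = IndexCatalog()
results_with = [
    attempt(clean, "idx_bloom", t)
    for t in ("bloom_filter", "bloom_filters")
]

print(results_without)
print(results_with)
```

In this sketch the drop is deliberately tolerant of a missing definition, so it can run regardless of how far the failed creation got before raising.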

Impact

NA

Risk level (write none, low, medium or high below)

low

Documentation Update

NA

Contributor's checklist

hudi-bot commented 10 hours ago

CI report:
