opensearch-project / opensearch-spark

Spark Accelerator framework; it enables secondary indices to remote data stores.
Apache License 2.0

Support alter from manual to auto without specifying scheduler mode #930

Closed noCharger closed 2 days ago

noCharger commented 2 days ago

Description

Support alter from manual to auto without specifying scheduler mode. In detail:

Step 1: Create a manual refresh index (full or incremental)

Step 2: Alter it by setting auto_refresh=true. By default, it switches to auto refresh mode and uses the external scheduler when spark.flint.job.externalScheduler.enabled = true.
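The default described in Step 2 can be modeled with a small Python sketch. This is not the project's implementation: the function names are hypothetical, and both the fallback to "internal" when the external scheduler is disabled and the "false" default for an unset config key are assumptions made for illustration.

```python
from typing import Mapping, Optional

# Config key quoted in the PR description.
EXTERNAL_SCHEDULER_CONF = "spark.flint.job.externalScheduler.enabled"


def default_scheduler_mode(spark_conf: Mapping[str, str]) -> str:
    """Pick the scheduler mode used when the ALTER statement does not name one."""
    # Assumption: an unset key is treated as "false".
    enabled = spark_conf.get(EXTERNAL_SCHEDULER_CONF, "false").strip().lower() == "true"
    # Assumption: "internal" is the fallback when the external scheduler is off.
    return "external" if enabled else "internal"


def scheduler_mode_after_alter(
    requested: Mapping[str, object], spark_conf: Mapping[str, str]
) -> Optional[str]:
    """Return the scheduler mode after altering to auto refresh, or None if still manual."""
    if requested.get("auto_refresh") is not True:
        return None  # manual refresh: no scheduler involved
    explicit = requested.get("scheduler_mode")
    if explicit is not None:
        return str(explicit)  # an explicit mode always wins over the default
    return default_scheduler_mode(spark_conf)
```

With the external scheduler enabled, altering with only auto_refresh set to true resolves to "external" in this model, which matches the behavior the PR describes.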

Check List

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.

noCharger commented 2 days ago

What happens in these 2 scenarios?

Scenario 1

  1. create index with "auto_refresh": true and "scheduler_mode": "external"
  2. alter index with "auto_refresh": false
  3. alter index with "auto_refresh": true

Scenario 2

  1. create index with "auto_refresh": true and "scheduler_mode": "internal"
  2. alter index with "auto_refresh": false
  3. alter index with "auto_refresh": true

Is it intended that in both scenarios the resulting index should be an auto refresh index using the external scheduler? Does the condition check in isSchedulerModeChanged cover this case? I believe originalOptions will still have "scheduler_mode": "internal".

If scheduler_mode already exists in the original index options, it should be copied to the updated index and behave as expected; hence Scenario 2 should be internal instead. The case this PR covers is when the original index options do not have a scheduler_mode field.
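To make the expected outcomes concrete, here is a minimal Python sketch that replays both scenarios plus the manual-to-auto case. It only models the option merging described in this reply and is not the actual Flint code: `alter_index_options` and `replay` are hypothetical names, and the fallback to "internal" when the external scheduler is disabled is an assumption.

```python
from typing import Dict, List

Options = Dict[str, object]


def alter_index_options(
    original: Options, updates: Options, external_scheduler_enabled: bool
) -> Options:
    """Merge ALTER options into the original options, defaulting the mode if absent."""
    # Original options carry over unless the ALTER statement overrides them.
    merged = {**original, **updates}
    # Fill in a default only when auto refresh is on and no mode was ever recorded.
    if merged.get("auto_refresh") is True and "scheduler_mode" not in merged:
        # Assumption: "internal" is the fallback when the external scheduler is off.
        merged["scheduler_mode"] = "external" if external_scheduler_enabled else "internal"
    return merged


def replay(
    created_with: Options, alters: List[Options], external_scheduler_enabled: bool = True
) -> Options:
    """Apply a sequence of ALTER statements to the options an index was created with."""
    options = dict(created_with)
    for updates in alters:
        options = alter_index_options(options, updates, external_scheduler_enabled)
    return options


# Steps 2 and 3 of both scenarios: turn auto refresh off, then on again.
toggle = [{"auto_refresh": False}, {"auto_refresh": True}]

scenario_1 = replay({"auto_refresh": True, "scheduler_mode": "external"}, toggle)
scenario_2 = replay({"auto_refresh": True, "scheduler_mode": "internal"}, toggle)
# The case this PR targets: the original options have no scheduler_mode field.
manual_to_auto = replay({"auto_refresh": False}, [{"auto_refresh": True}])
```

Under these assumptions, `scenario_1` keeps "external", `scenario_2` keeps "internal", and `manual_to_auto` picks up "external" only because no scheduler_mode was recorded on the original index.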