opensearch-project / opensearch-spark

Spark Accelerator framework; it enables secondary indices to remote data stores.
Apache License 2.0

[BUG] Flint index stuck in refreshing state when refresh job exits early with exception #368

Closed by dai-chen 2 weeks ago

dai-chen commented 3 weeks ago

What is the bug?

The PR https://github.com/opensearch-project/opensearch-spark/issues/361 addressed updating the Flint index state when a streaming job terminates with an exception. However, an edge case remains: the streaming job can exit before FlintJob makes the new awaitMonitor call. In that case, the index state transition logic in the awaitMonitor API never runs, and the index stays in the REFRESHING state.

How can one reproduce the bug?

Steps to reproduce the behavior in an integration test (IT):

  1. Create a Flint index with auto refresh enabled
  2. Stop the streaming job behind it through the Spark API
  3. Call the awaitMonitor API
  4. Check that the index state remains REFRESHING
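
The steps above can be illustrated with a toy model of the race. This is only a sketch under stated assumptions, not the Flint implementation: `ToyStreamingJob`, `ToyMonitor`, and the index name `flint_test_index` are hypothetical, and the assumption that the monitor only looks at jobs that are still active when it is called is a simplification made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ToyStreamingJob:
    """Hypothetical stand-in for a streaming job; not a Spark class."""
    error: Optional[Exception] = None

    def await_termination(self) -> None:
        # In the toy model, "waiting" simply surfaces the job's failure, if any.
        if self.error is not None:
            raise self.error


@dataclass
class ToyMonitor:
    """Hypothetical monitor that can only see jobs still active when called."""
    index_states: Dict[str, str] = field(default_factory=dict)
    active_jobs: Dict[str, ToyStreamingJob] = field(default_factory=dict)

    def create_index(self, name: str) -> ToyStreamingJob:
        # Step 1: an auto-refresh index starts out REFRESHING with a job behind it.
        self.index_states[name] = "REFRESHING"
        self.active_jobs[name] = ToyStreamingJob()
        return self.active_jobs[name]

    def stop_job(self, name: str) -> None:
        # Step 2: a terminated job disappears from the active set.
        self.active_jobs.pop(name)

    def await_monitor(self, name: str) -> None:
        # Step 3: the state transition lives on the "job found" path only.
        job = self.active_jobs.get(name)
        if job is None:
            return  # job already exited: transition logic is skipped
        try:
            job.await_termination()
        except Exception:
            self.index_states[name] = "FAILED"


monitor = ToyMonitor()
monitor.create_index("flint_test_index")
monitor.stop_job("flint_test_index")
monitor.await_monitor("flint_test_index")
# Step 4: the state was never updated.
print(monitor.index_states["flint_test_index"])  # prints REFRESHING
```

In this model the index only reaches FAILED when the job fails while the monitor is waiting on it; a job that is gone before the call leaves the state untouched.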

What is the expected behavior?

The Flint index state should be updated to FAILED even if the streaming job exits early and the awaitMonitor API is only called afterwards.
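
One way to picture this expected behavior is a monitor that treats "no job found, but the index still says REFRESHING" as a failure. The sketch below is an assumption about how such a check could look, not the actual fix: the function `await_monitor`, the plain dictionaries, and the use of a callable as the "wait for the job" handle are all hypothetical.

```python
from typing import Callable, Dict

REFRESHING = "REFRESHING"
FAILED = "FAILED"


def await_monitor(
    index_states: Dict[str, str],
    active_jobs: Dict[str, Callable[[], None]],
    index_name: str,
) -> None:
    """Hypothetical monitor entry point that also handles an early job exit."""
    wait_for_job = active_jobs.get(index_name)
    if wait_for_job is None:
        # The job exited before this call. If the index still claims to be
        # refreshing, its state is stale and is moved to FAILED.
        if index_states.get(index_name) == REFRESHING:
            index_states[index_name] = FAILED
        return
    try:
        wait_for_job()  # stands in for blocking until the job terminates
    except Exception:
        # The job failed while being awaited: same transition as before.
        index_states[index_name] = FAILED


states = {"flint_test_index": REFRESHING}
await_monitor(states, {}, "flint_test_index")  # no active job is left
print(states["flint_test_index"])  # prints FAILED
```

The guard on the current state keeps the check from overwriting indexes that are not in the REFRESHING state when no job is found.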

Do you have any screenshots?

N/A

Do you have any additional context?

N/A