`EnrichPolicyMaintenanceService` can delete new enrichment index before it's promoted

Elasticsearch Version

7.17.4

Installed Plugins

No response

Java Version

bundled

OS Version

RHEL 7

Problem Description

During a subsequent enrichment policy execution, EnrichPolicyMaintenanceService can flag the new enrichment index as stale while it's being indexed to. This can lead to the new index being created a second time (due to cause [auto(bulk api)]), which will also get deleted during the next enrich.cleanup_period.

Steps to Reproduce

In the below logs, a second execution of my_policy was executing during a maintenance cycle. .enrich-my_policy-1654635601016 was still receiving data so it hadn't been promoted, resulting in it getting deleted:

"Jun 7, 2022 @ 17:00:43.113","[2022-06-07T21:00:38,232][DEBUG][org.elasticsearch.xpack.enrich.EnrichPolicyMaintenanceService] [instance-0000000002] Checking if should remove enrich index [.enrich-my_policy-1654635601016]"
"Jun 7, 2022 @ 17:00:43.113","[2022-06-07T21:00:38,232][DEBUG][org.elasticsearch.xpack.enrich.EnrichPolicyMaintenanceService] [instance-0000000002] Enrich index [.enrich-my_policy-1654635601016] is not marked as a live index since it has no alias information"
"Jun 7, 2022 @ 17:00:43.113","[2022-06-07T21:00:38,233][INFO ][org.elasticsearch.cluster.metadata.MetadataDeleteIndexService] [instance-0000000002] [.enrich-my_policy-1654635601016/EcveiRTjT9WsK0h8E5xROg] deleting index"
"Jun 7, 2022 @ 17:00:43.113","[2022-06-07T21:00:38,568][DEBUG][org.elasticsearch.xpack.enrich.EnrichPolicyMaintenanceService] [instance-0000000002] Completed deletion of stale enrich indices [[.enrich-my_policy-1654635601016]]"

A second .enrich-my_policy-1654635601016 was created due to the on-going reindex process:

"Jun 7, 2022 @ 17:00:43.113","[2022-06-07T21:00:38,570][INFO ][org.elasticsearch.cluster.metadata.MetadataCreateIndexService] [instance-0000000002] [.enrich-my_policy-1654635601016] creating index, cause [auto(bulk api)], templates [], shards [1]/[1]"

Reindex continues and the index is promoted:

"Jun 7, 2022 @ 17:03:12.264","[2022-06-07T21:03:09,891][DEBUG][org.elasticsearch.xpack.enrich.EnrichPolicyRunner] [instance-0000000012] Policy [my_policy]: Promoting new enrich index [.enrich-my_policy-1654635601016] to alias [.enrich-my_policy]"
"Jun 7, 2022 @ 17:03:12.264","[2022-06-07T21:03:09,973][INFO ][org.elasticsearch.xpack.enrich.EnrichPolicyRunner] [instance-0000000012] Policy [my_policy]: Policy execution complete"

The second .enrich-my_policy-1654635601016 gets deleted because it it doesn't have a policy name in the index settings:

"Jun 7, 2022 @ 17:15:43.383","[2022-06-07T21:15:38,235][DEBUG][org.elasticsearch.xpack.enrich.EnrichPolicyMaintenanceService] [instance-0000000002] Enrich index [.enrich-my_policy-1654635601016] does not correspond to any existing policy. Found policy name [null]"
"Jun 7, 2022 @ 17:15:43.384","[2022-06-07T21:15:38,236][INFO ][org.elasticsearch.cluster.metadata.MetadataDeleteIndexService] [instance-0000000002] [.enrich-my_policy-1654529338711/FkyJRz_lS6eiK0TzIvj24A] deleting index"

Logs (if relevant)

"Jun 7, 2022 @ 17:00:04.247","[2022-06-07T21:00:01,013][INFO ][org.elasticsearch.xpack.enrich.EnrichPolicyRunner] [instance-0000000012] Policy [my_policy]: Running enrich policy"
"Jun 7, 2022 @ 17:00:04.247","[2022-06-07T21:00:01,013][DEBUG][org.elasticsearch.xpack.enrich.EnrichPolicyRunner] [instance-0000000012] Policy [my_policy]: Checking source indices [[my_source_index]]"
"Jun 7, 2022 @ 17:00:04.247","[2022-06-07T21:00:01,015][DEBUG][org.elasticsearch.xpack.enrich.EnrichPolicyRunner] [instance-0000000012] Policy [my_policy]: Validating [[my_source_index]] source mappings"
"Jun 7, 2022 @ 17:00:04.247","[2022-06-07T21:00:01,016][DEBUG][org.elasticsearch.xpack.enrich.EnrichPolicyRunner] [instance-0000000012] Policy [my_policy]: Creating new enrich index [.enrich-my_policy-1654635601016]"
"Jun 7, 2022 @ 17:00:04.247","[2022-06-07T21:00:01,135][DEBUG][org.elasticsearch.xpack.enrich.EnrichPolicyRunner] [instance-0000000012] Policy [my_policy]: Transferring source data to new enrich index [.enrich-my_policy-1654635601016]"
"Jun 7, 2022 @ 17:00:08.111","[2022-06-07T21:00:01,020][INFO ][org.elasticsearch.cluster.metadata.MetadataCreateIndexService] [instance-0000000002] [.enrich-my_policy-1654635601016] creating index, cause [api], templates [], shards [1]/[0]"
"Jun 7, 2022 @ 17:00:08.111","[2022-06-07T21:00:01,092][INFO ][org.elasticsearch.cluster.routing.allocation.AllocationService] [instance-0000000002] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.enrich-my_policy-1654635601016][0]]])."
"Jun 7, 2022 @ 17:00:43.113","[2022-06-07T21:00:42,323][INFO ][org.elasticsearch.cluster.routing.allocation.AllocationService] [instance-0000000002] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.enrich-my_policy-1654635601016][0]]])."
"Jun 7, 2022 @ 17:00:43.113","[2022-06-07T21:00:38,232][DEBUG][org.elasticsearch.xpack.enrich.EnrichPolicyMaintenanceService] [instance-0000000002] Checking if should remove enrich index [.enrich-my_policy-1654529338711]"
"Jun 7, 2022 @ 17:00:43.113","[2022-06-07T21:00:38,232][DEBUG][org.elasticsearch.xpack.enrich.EnrichPolicyMaintenanceService] [instance-0000000002] Enrich index [.enrich-my_policy-1654529338711] was spared since it is associated with the valid policy [my_policy] and references alias [.enrich-my_policy]"
"Jun 7, 2022 @ 17:00:43.113","[2022-06-07T21:00:38,232][DEBUG][org.elasticsearch.xpack.enrich.EnrichPolicyMaintenanceService] [instance-0000000002] Checking if should remove enrich index [.enrich-my_policy-1654635601016]"
"Jun 7, 2022 @ 17:00:43.113","[2022-06-07T21:00:38,232][DEBUG][org.elasticsearch.xpack.enrich.EnrichPolicyMaintenanceService] [instance-0000000002] Enrich index [.enrich-my_policy-1654635601016] is not marked as a live index since it has no alias information"
"Jun 7, 2022 @ 17:00:43.113","[2022-06-07T21:00:38,233][INFO ][org.elasticsearch.cluster.metadata.MetadataDeleteIndexService] [instance-0000000002] [.enrich-my_policy-1654635601016/EcveiRTjT9WsK0h8E5xROg] deleting index"
"Jun 7, 2022 @ 17:00:43.113","[2022-06-07T21:00:38,568][DEBUG][org.elasticsearch.xpack.enrich.EnrichPolicyMaintenanceService] [instance-0000000002] Completed deletion of stale enrich indices [[.enrich-my_policy-1654635601016]]"
"Jun 7, 2022 @ 17:00:43.113","[2022-06-07T21:00:38,570][INFO ][org.elasticsearch.cluster.metadata.MetadataCreateIndexService] [instance-0000000002] [.enrich-my_policy-1654635601016] creating index, cause [auto(bulk api)], templates [], shards [1]/[1]"
"Jun 7, 2022 @ 17:00:43.113","[2022-06-07T21:00:39,398][INFO ][org.elasticsearch.cluster.metadata.MetadataMappingService] [instance-0000000002] [.enrich-my_policy-1654635601016/stBKskt-RHex2kEUKe4wdg] create_mapping [_doc]"
"Jun 7, 2022 @ 17:00:43.113","[2022-06-07T21:00:39,743][INFO ][org.elasticsearch.cluster.metadata.MetadataMappingService] [instance-0000000002] [.enrich-my_policy-1654635601016/stBKskt-RHex2kEUKe4wdg] update_mapping [_doc]"
"Jun 7, 2022 @ 17:00:43.113","[2022-06-07T21:00:40,753][INFO ][org.elasticsearch.cluster.metadata.MetadataMappingService] [instance-0000000002] [.enrich-my_policy-1654635601016/stBKskt-RHex2kEUKe4wdg] update_mapping [_doc]"
"Jun 7, 2022 @ 17:02:22.261","[2022-06-07T21:02:17,214][INFO ][org.elasticsearch.xpack.enrich.EnrichPolicyRunner] [instance-0000000012] Policy [my_policy]: Transferred [####] documents to enrich index [.enrich-my_policy-1654635601016]"
"Jun 7, 2022 @ 17:02:22.261","[2022-06-07T21:02:17,214][DEBUG][org.elasticsearch.xpack.enrich.EnrichPolicyRunner] [instance-0000000012] Policy [my_policy]: Force merging newly created enrich index [.enrich-my_policy-1654635601016] (Attempt 1/3)"
"Jun 7, 2022 @ 17:02:47.262","[2022-06-07T21:02:39,808][DEBUG][org.elasticsearch.xpack.enrich.EnrichPolicyRunner] [instance-0000000012] Policy [my_policy]: Refreshing enrich index [.enrich-my_policy-1654635601016]"
"Jun 7, 2022 @ 17:02:47.262","[2022-06-07T21:02:39,815][DEBUG][org.elasticsearch.xpack.enrich.EnrichPolicyRunner] [instance-0000000012] Policy [my_policy]: Setting new enrich index [.enrich-my_policy-1654635601016] to be read only"
"Jun 7, 2022 @ 17:02:48.118","[2022-06-07T21:02:39,820][INFO ][org.elasticsearch.cluster.routing.allocation.AllocationService] [instance-0000000002] updating number_of_replicas to [n] for indices [.enrich-my_policy-1654635601016]"
"Jun 7, 2022 @ 17:03:12.264","[2022-06-07T21:03:09,891][DEBUG][org.elasticsearch.xpack.enrich.EnrichPolicyRunner] [instance-0000000012] Policy [my_policy]: Promoting new enrich index [.enrich-my_policy-1654635601016] to alias [.enrich-my_policy]"
"Jun 7, 2022 @ 17:03:12.264","[2022-06-07T21:03:09,973][INFO ][org.elasticsearch.xpack.enrich.EnrichPolicyRunner] [instance-0000000012] Policy [my_policy]: Policy execution complete"
"Jun 7, 2022 @ 17:03:12.264","[2022-06-07T21:03:09,973][DEBUG][org.elasticsearch.xpack.enrich.action.InternalExecutePolicyAction] [instance-0000000012] successfully executed policy [my_policy]"
"Jun 7, 2022 @ 17:03:23.120","[2022-06-07T21:03:14,287][INFO ][org.elasticsearch.cluster.routing.allocation.AllocationService] [instance-0000000002] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.enrich-my_policy-1654635601016][0]]])."
"Jun 7, 2022 @ 17:15:43.383","[2022-06-07T21:15:38,235][DEBUG][org.elasticsearch.xpack.enrich.EnrichPolicyMaintenanceService] [instance-0000000002] Checking if should remove enrich index [.enrich-my_policy-1654529338711]"
"Jun 7, 2022 @ 17:15:43.383","[2022-06-07T21:15:38,235][DEBUG][org.elasticsearch.xpack.enrich.EnrichPolicyMaintenanceService] [instance-0000000002] Enrich index [.enrich-my_policy-1654529338711] is not marked as a live index since it has no alias information"
"Jun 7, 2022 @ 17:15:43.383","[2022-06-07T21:15:38,235][DEBUG][org.elasticsearch.xpack.enrich.EnrichPolicyMaintenanceService] [instance-0000000002] Checking if should remove enrich index [.enrich-my_policy-1654635601016]"
"Jun 7, 2022 @ 17:15:43.383","[2022-06-07T21:15:38,235][DEBUG][org.elasticsearch.xpack.enrich.EnrichPolicyMaintenanceService] [instance-0000000002] Enrich index [.enrich-my_policy-1654635601016] does not correspond to any existing policy. Found policy name [null]"
"Jun 7, 2022 @ 17:15:43.384","[2022-06-07T21:15:38,236][INFO ][org.elasticsearch.cluster.metadata.MetadataDeleteIndexService] [instance-0000000002] [.enrich-my_policy-1654529338711/FkyJRz_lS6eiK0TzIvj24A] deleting index"
"Jun 7, 2022 @ 17:15:43.384","[2022-06-07T21:15:38,236][INFO ][org.elasticsearch.cluster.metadata.MetadataDeleteIndexService] [instance-0000000002] [.enrich-my_policy-1654635601016/stBKskt-RHex2kEUKe4wdg] deleting index"
"Jun 7, 2022 @ 17:15:43.384","[2022-06-07T21:15:38,428][DEBUG][org.elasticsearch.xpack.enrich.EnrichPolicyMaintenanceService] [instance-0000000002] Completed deletion of stale enrich indices [[.enrich-my_policy-1654529338711, .enrich-my_policy-1654635601016]]

elastic / elasticsearch