elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Provide API to identify dangling searchable snapshots #113168

Open romain-chanu opened 16 hours ago

romain-chanu commented 16 hours ago

Description

What is a dangling searchable snapshot?

It is a searchable snapshot stored in a snapshot repository and no longer referenced/used by an Elasticsearch cluster. This can happen in the following situations:

1) Users have deleted searchable snapshot indices and/or data streams (containing searchable snapshot indices) via the delete index API or the delete data stream API.

If users manually delete an index or data stream before the ILM delete phase runs, ILM will not delete the underlying searchable snapshot. Users then need to use the Delete snapshots API to remove the searchable snapshot from the snapshot repository when it is no longer needed.

2) Users have configured the respective ILM policy with a delete phase, but with the `delete_searchable_snapshot` option of the delete action set to false (cf. the ILM Delete action documentation). Users then need to use the Delete snapshots API to remove the searchable snapshot from the snapshot repository when it is no longer needed.
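As an illustration of the second situation, a policy can be checked for this setting before it leaves snapshots behind. This is a minimal sketch, assuming each policy entry has the JSON shape returned by the get lifecycle policy API (`GET _ilm/policy`) and that `delete_searchable_snapshot` defaults to true when omitted; the function name `keeps_snapshot_on_delete` is hypothetical.

```python
from typing import Any


def keeps_snapshot_on_delete(policy_entry: dict[str, Any]) -> bool:
    """Return True if the policy's delete phase keeps the searchable snapshot.

    `policy_entry` is assumed to be one value from a GET _ilm/policy response,
    i.e. a dict with a top-level "policy" key.
    """
    phases = policy_entry.get("policy", {}).get("phases", {})
    delete_action = phases.get("delete", {}).get("actions", {}).get("delete")
    if delete_action is None:
        # No delete action: ILM never deletes the index, so nothing dangles.
        return False
    # The option defaults to true, so only an explicit false keeps the snapshot.
    return delete_action.get("delete_searchable_snapshot", True) is False
```

Policies for which this returns True are the ones whose snapshots need manual cleanup via the Delete snapshots API.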

How to determine if a searchable snapshot is dangling?

As of the time of writing, Elasticsearch does not provide an API to retrieve such information. Manual checks are needed, which can be very tedious and error-prone.
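One way the manual check could look is sketched below. It assumes two responses have already been fetched: the get snapshots API response (with a `repository` field on each snapshot entry) and the get index settings API response requested with `flat_settings=true`. Mounted indices are recognised by their `index.store.snapshot.repository_name` and `index.store.snapshot.snapshot_name` settings. The function names are hypothetical, and a snapshot that is not mounted is only a candidate: it may equally be an ordinary backup.

```python
from typing import Any


def mounted_snapshots(flat_settings: dict[str, Any]) -> set[tuple[str, str]]:
    """Collect (repository, snapshot) pairs referenced by mounted indices."""
    mounted = set()
    for body in flat_settings.values():
        settings = body.get("settings", {})
        repo = settings.get("index.store.snapshot.repository_name")
        snap = settings.get("index.store.snapshot.snapshot_name")
        if repo and snap:
            mounted.add((repo, snap))
    return mounted


def unmounted_snapshots(
    snapshots_response: dict[str, Any], flat_settings: dict[str, Any]
) -> list[tuple[str, str]]:
    """List (repository, snapshot) pairs that no index currently mounts."""
    mounted = mounted_snapshots(flat_settings)
    return [
        (s["repository"], s["snapshot"])
        for s in snapshots_response.get("snapshots", [])
        if (s["repository"], s["snapshot"]) not in mounted
    ]
```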

Motivation

elasticsearchmachine commented 16 hours ago

Pinging @elastic/es-distributed (Team:Distributed)

DaveCTurner commented 15 hours ago

I'm not really sure that we can reliably identify such snapshots: there's nothing particularly special about these snapshots vs any other snapshots the user might have taken.

If the user is only using SLM and ILM to take snapshots then you can identify all non-SLM snapshots with `GET _snapshot/_all/_all?slm_policy_filter=_none`, but of course this will include both mounted and dangling snapshots. Would it be enough to add another filter to the get-snapshots API to exclude mounted snapshots?
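Until such a filter exists, its effect could be approximated on the client side. The sketch below builds the request path quoted above and then drops mounted snapshots from the result; it assumes each snapshot entry in the response carries `repository` and `snapshot` fields, and that the set of mounted (repository, snapshot) pairs has been gathered separately, for example from index settings. Both function names are hypothetical.

```python
from typing import Any
from urllib.parse import urlencode


def non_slm_snapshots_path(repository: str = "_all", snapshot: str = "_all") -> str:
    """Build the get-snapshots path that matches only non-SLM snapshots."""
    query = urlencode({"slm_policy_filter": "_none"})
    return f"/_snapshot/{repository}/{snapshot}?{query}"


def exclude_mounted(
    snapshots: list[dict[str, Any]], mounted: set[tuple[str, str]]
) -> list[dict[str, Any]]:
    """Client-side stand-in for the proposed 'exclude mounted' filter."""
    return [
        s for s in snapshots
        if (s["repository"], s["snapshot"]) not in mounted
    ]
```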