elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

[Fleet] Delete old indexes on migration of package transforms #162889

Open kcreddy opened 1 year ago

kcreddy commented 1 year ago

While working on a feature for expiring indicators of compromise (IOCs) in threat intel (TI) packages, we took an approach that uses a latest transform. Essentially, the transform (example transform for ti_recordedfuture) creates destination indices (*_latest*) that users rely on to set up detection rules. The destination index name is versioned with a suffix such as -1, -2, and so on, so that we can support any breaking change made to the index mapping.

The issue is that when the destination index version is eventually bumped (let's say from -1 to -2), the current behaviour doesn't delete the old destination index, which is now orphaned because it has no link to the updated transform. This creates duplicates for readers using wildcards (logs-*_latest-*) and can even lead to false detection alerts when rules use those wildcards.

We need a mechanism to automatically purge the old destination index so that readers using wildcards (logs-*_latest-*) are not getting results from the old index.
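As a rough illustration of the purge step, here is a minimal sketch of how stale destination indices could be identified. It is not Fleet's implementation: the function names and the index names (`logs-ti_example_latest-1` and so on) are hypothetical, and it assumes the version is always the numeric suffix after the last hyphen.

```typescript
// Hypothetical helpers, not part of Fleet or any Elastic API.
interface VersionedIndex {
  base: string;
  version: number;
}

// Splits "logs-ti_example_latest-2" into base "logs-ti_example_latest" and version 2.
// Returns null when the name has no numeric suffix after its last hyphen.
function parseVersionedIndex(name: string): VersionedIndex | null {
  const dash = name.lastIndexOf("-");
  if (dash <= 0) {
    return null;
  }
  const suffix = name.slice(dash + 1);
  if (!/^\d+$/.test(suffix)) {
    return null;
  }
  return { base: name.slice(0, dash), version: Number(suffix) };
}

// Returns the indices that share the current destination's base name
// but carry a lower version suffix, i.e. candidates for deletion.
function findStaleDestinationIndices(existing: string[], current: string): string[] {
  const target = parseVersionedIndex(current);
  if (target === null) {
    return [];
  }
  return existing.filter((name) => {
    const parsed = parseVersionedIndex(name);
    return parsed !== null && parsed.base === target.base && parsed.version < target.version;
  });
}

// Usage: only the older version of the same destination index is selected.
const staleIndices = findStaleDestinationIndices(
  ["logs-ti_example_latest-1", "logs-ti_example_latest-2", "logs-ti_other_latest-1"],
  "logs-ti_example_latest-2"
);
// staleIndices is ["logs-ti_example_latest-1"]
```

The caller would then delete the returned indices once the updated transform has been installed, so wildcard readers only see the current version.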

juliaElastic commented 1 year ago

hey @jsoriano could you take a look at this to advise?

jsoriano commented 1 year ago

The issue is that when the destination index version is eventually bumped (let's say from -1 to -2), the current behaviour doesn't delete the old destination index, which is now orphaned because it has no link to the updated transform.

@kcreddy when does this bumping up happen? Does it happen when a new version of the package is installed?

kcreddy commented 1 year ago

when does this bumping up happen? Does it happen when a new version of the package is installed?

@jsoriano, yes, it happens when a new package version is installed. For example, we have this PR for fixing the destination index mapping, and we would like to increment the index version. https://github.com/elastic/integrations/pull/6514/files#diff-9c0b6d05505edf894e674f59e9af2d5721e621526b356cb29c8f8dbbbc603814

jsoriano commented 1 year ago

when does this bumping up happen? Does it happen when a new version of the package is installed?

@jsoriano, yes, it happens when a new package version is installed.

I guess that then EPM could take care of the migration to the new version and the deletion of old indexes, as part of the package installation process. @juliaElastic wdyt?

juliaElastic commented 1 year ago

@jsoriano You mean the package installation logic in Kibana, right? I agree, we could enhance that to delete the old index. Can we transfer this issue to the kibana repo then?

jsoriano commented 1 year ago

Yes, in the package installation logic. Transferred issue to kibana. Thanks!

elasticmachine commented 1 year ago

Pinging @elastic/fleet (Team:Fleet)

kcreddy commented 8 months ago

Along with the version being incremented, the destination index name should also be dynamic, based on the namespace configuration. Right now, the destination index name is hardcoded, which leads to issues when the integration is used with multiple namespaces.

There was a use case where a user was trying to set up the ti_misp integration with multiple namespaces. In that case, the transform fails to index documents into the destination index because data_stream.namespace is a constant_keyword field and must have the same value across all documents in the destination index. Since multiple namespaces are set up and all try to ingest into the same destination index, the index operation fails, which in turn fails the transform.

Here is an error:

Failed to index documents into destination index due to permanent error: [org.elasticsearch.xpack.transform.transforms.BulkIndexingException: Bulk index experienced [1300] failures and at least 1 irrecoverable [org.elasticsearch.xpack.transform.transforms.TransformException: Destination index mappings are incompatible with the transform configuration.; org.elasticsearch.index.mapper.DocumentParsingException: [1:297] failed to parse field [data_stream.namespace] of type [constant_keyword] in document with id 'dF8wqqJmx1vVZdvyQIV7SiCZLQAAAAAA'. Preview of field's value: 'default'; java.lang.IllegalArgumentException: [constant_keyword] field [data_stream.namespace] only accepts values that are equal to the value defined in the mappings [testnamesp], but got [default]].; org.elasticsearch.xpack.transform.transforms.TransformException: Destination index mappings are incompatible with the transform configuration.; org.elasticsearch.index.mapper.DocumentParsingException: [1:297] failed to parse field [data_stream.namespace] of type [constant_keyword] in document with id 'dF8wqqJmx1vVZdvyQIV7SiCZLQAAAAAA'. Preview of field's value: 'default'; java.lang.IllegalArgumentException: [constant_keyword] field [data_stream.namespace] only accepts values that are equal to the value defined in the mappings [testnamesp], but got [default]]

Steps to reproduce:

  1. Install the MISP integration with the MISP Attributes data stream, with the namespace set to testnamesp.
  2. The transform runs successfully and indexes documents into the destination index.
  3. Set up another MISP integration policy for the MISP Attributes data stream, but with a different namespace, say default.
  4. This time the transform fails to index documents into the destination index, with the error above.
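
The failure in these steps, and one possible way around it, can be sketched as follows. This is a toy model under stated assumptions: `ConstantKeyword` only mimics the behaviour described by the error (the first value fixes what the field accepts), and `destinationIndexName` with its `base-namespace-version` layout is a hypothetical naming scheme, not something Fleet or the integrations currently implement.

```typescript
// Toy model of a constant_keyword field: the first value seen fixes the
// allowed value, and any different value is rejected afterwards.
class ConstantKeyword {
  private value: string | null = null;

  accept(candidate: string): boolean {
    if (this.value === null) {
      this.value = candidate;
      return true;
    }
    return this.value === candidate;
  }
}

// Hypothetical naming scheme: one destination index per namespace,
// with the version kept as the trailing suffix.
function destinationIndexName(base: string, namespace: string, version: number): string {
  return `${base}-${namespace}-${version}`;
}

// With a single hardcoded destination index, the second namespace is rejected.
const sharedField = new ConstantKeyword();
const firstAccepted = sharedField.accept("testnamesp"); // true
const secondAccepted = sharedField.accept("default"); // false

// With namespace-aware names, each namespace writes to its own index.
const indexForTest = destinationIndexName("logs-ti_example_latest", "testnamesp", 1);
const indexForDefault = destinationIndexName("logs-ti_example_latest", "default", 1);
```

Keeping the version as the last segment means a version bump still only changes the trailing suffix, whichever namespace the index belongs to.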

jen-huang commented 7 months ago

@kcreddy @andrewkroh This is pretty low on our list of priorities at the moment and we don't have much domain knowledge about transforms. I think the Security team contributed the work for installing transforms initially. We are happy to provide guidance and reviews if this is a high priority change that you would like to try to tackle.