[Fleet] Add support for customizing integration data streams at more levels of granularity

joshdover commented 1 year ago

We currently support customizing data stream settings and mappings via the @custom component templates that Fleet creates during package installation. Today, these are only supported on a per-data stream basis. This limits the ability to update settings for a group of related data streams and makes the process much more tedious.

Integration users want to be able to define mappings and index settings that get applied to different groups of indices, at different levels of granularity:

global (eg *)
per-type (eg. logs-*)
per-package (eg. *-nginx.*-*)
per-dataset (eg. logs-nginx.access-*) - we support this today
per-dataset-per-namespace (eg. logs-nginx.access-foo)
per-namespace (eg. *-*-foo)
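
As a rough illustration of the levels listed above, the following Python sketch derives one index pattern per level for a single data stream and checks that each pattern matches it. The helper name `granularity_patterns` is hypothetical, the package name is assumed to be the part of the dataset before the first dot, and `fnmatchcase` only approximates how Elasticsearch evaluates wildcard patterns.

```python
from fnmatch import fnmatchcase


def granularity_patterns(ds_type, dataset, namespace):
    """Return one index pattern per level, in the order listed in the issue."""
    # Assumption: the package name is the dataset prefix before the first dot.
    package = dataset.split(".", 1)[0]
    return {
        "global": "*",
        "per-type": f"{ds_type}-*",
        "per-package": f"*-{package}.*-*",
        "per-dataset": f"{ds_type}-{dataset}-*",
        "per-dataset-per-namespace": f"{ds_type}-{dataset}-{namespace}",
        "per-namespace": f"*-*-{namespace}",
    }


patterns = granularity_patterns("logs", "nginx.access", "foo")
data_stream = "logs-nginx.access-foo"

# Every level should match the data stream it was derived from.
matches = {level: fnmatchcase(data_stream, p) for level, p in patterns.items()}
```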

One reason we haven't supported more levels of granularity is that we don't want to create 100s of unused component templates that clutter the UI and/or confuse users. In https://github.com/elastic/elasticsearch/issues/92426 and the related PR https://github.com/elastic/elasticsearch/pull/92436, Elasticsearch will add the ability for an index template to reference component templates that may not yet exist. This would allow us to create the custom component templates only on demand, when a user wants to apply a setting or mapping to a given group of data streams.

For instance, this would allow us to create index templates like this during package installation:

PUT _index_template/logs-nginx.access-foo?ignore_missing=true
{
  "index_patterns": ["logs-nginx.access-*"],
  "template": {

  },
  "priority": 250,
  "composed_of": [
    "logs-nginx.access@package",
    "global@custom",
    "logs@custom",
    "global-foo@custom",
    "logs-nginx@custom",
    "logs-nginx.access@custom",
    ".fleet_globals-1",
    ".fleet_agent_id_verification-1"
  ],
  "ignore_missing_component_templates": [
    "global@custom",
    "logs@custom",
    "global-foo@custom",
    "logs-nginx@custom",
    "logs-nginx.access@custom"
  ]
}
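
A minimal sketch of how such a template body could be assembled, assuming the naming convention in the example above. The function `build_index_template` is hypothetical and is not Fleet's actual implementation; it also assumes the package name is the dataset prefix before the first dot.

```python
# Component templates that appear at the end of the example's composed_of list.
FLEET_MANAGED = [".fleet_globals-1", ".fleet_agent_id_verification-1"]


def build_index_template(ds_type, dataset, namespace=None, priority=250):
    """Build an index template body following the example's naming convention."""
    # Assumption: the package name is the dataset prefix before the first dot.
    package = dataset.split(".", 1)[0]

    # Optional @custom templates, in the same order as the example.
    custom = ["global@custom", f"{ds_type}@custom"]
    if namespace:
        custom.append(f"global-{namespace}@custom")
    custom += [f"{ds_type}-{package}@custom", f"{ds_type}-{dataset}@custom"]

    return {
        "index_patterns": [f"{ds_type}-{dataset}-*"],
        "template": {},
        "priority": priority,
        "composed_of": [f"{ds_type}-{dataset}@package", *custom, *FLEET_MANAGED],
        # Only the @custom templates may be absent when the template is created.
        "ignore_missing_component_templates": custom,
    }


template = build_index_template("logs", "nginx.access", namespace="foo")
```

Keeping the optional names in one list means `composed_of` and `ignore_missing_component_templates` cannot drift apart.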

Users would then be able to manually create new component templates that match the naming convention, then perform a rollover, to customize a group of data streams. We could first support this via documentation, and later add UI features on top of this to make this easier.
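
The manual workflow described above could look roughly like the sketch below, which only builds the two requests (create the component template, then roll over the data stream) without sending them. The helper `customization_requests` and the `index.refresh_interval` value are illustrative assumptions; the request paths follow the Elasticsearch component template and rollover APIs.

```python
import json


def customization_requests(template_name, data_stream, settings):
    """Return (method, path, body) tuples for the two manual steps."""
    return [
        # Step 1: create the component template the index template already references.
        ("PUT", f"_component_template/{template_name}", {"template": {"settings": settings}}),
        # Step 2: roll over so a new backing index picks up the change.
        ("POST", f"{data_stream}/_rollover", None),
    ]


steps = customization_requests(
    "logs@custom",
    "logs-nginx.access-foo",
    {"index.refresh_interval": "30s"},  # hypothetical setting, for illustration only
)

for method, path, body in steps:
    print(method, path)
    if body is not None:
        print(json.dumps(body, indent=2))
```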

Note that implementing this would require that we leverage the package installation format versioning https://github.com/elastic/kibana/issues/121099 to reinstall all index templates on the next stack upgrade.

Some related discussions happening at the moment concern users who want to use their own component template for 3 integrations (not all). With the above, we can already offer a much better solution, since only 3 custom templates would have to be updated. What if we offer just 3 options for extension by default, ignoring global? These would be our recommended extension paths. In addition, we would allow users to add their own component template through a Fleet API. Fleet would add it to the end of the list, add it to ignore_missing, and manage it. When a package is upgraded, the added custom template would still exist because it was added through Fleet. If it were added manually, it would not stay there.

The above would allow us to have an escape hatch for expert users without breaking our experience. In case of issues, it could be easily detected that a component was added and potentially be removed again.
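
A sketch of what that escape hatch might do to an index template body, assuming the behavior described above (append to the end of the list and mark as ignorable if missing). Both helper functions are hypothetical, no such Fleet API is confirmed by this thread, and `custom-ilm-component` is only an example name.

```python
def add_user_template(index_template, name):
    """Return a copy with `name` appended to composed_of and the ignore-missing list."""
    updated = dict(index_template)
    composed = list(index_template.get("composed_of", []))
    ignored = list(index_template.get("ignore_missing_component_templates", []))
    if name not in composed:
        composed.append(name)
    if name not in ignored:
        ignored.append(name)
    updated["composed_of"] = composed
    updated["ignore_missing_component_templates"] = ignored
    return updated


def remove_user_template(index_template, name):
    """Return a copy with `name` removed again, for example while troubleshooting."""
    updated = dict(index_template)
    for key in ("composed_of", "ignore_missing_component_templates"):
        updated[key] = [n for n in index_template.get(key, []) if n != name]
    return updated


base = {
    "composed_of": ["logs-nginx.access@package", "logs-nginx.access@custom"],
    "ignore_missing_component_templates": ["logs-nginx.access@custom"],
}
with_user = add_user_template(base, "custom-ilm-component")
restored = remove_user_template(with_user, "custom-ilm-component")
```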

P1llus commented 1 year ago

I think one thing users would want is to reuse custom component templates they already have, which do not follow a specific naming convention. For example, let's call one custom-ilm-component, which includes certain ILM settings they want to override for most (but not all) of their integrations.

When you add the integration to a specific policy, you usually get an overview of the custom ingest pipelines and custom component templates that are being generated for this integration.

It would be super to have a similar UI field to add custom components that already exist, maybe with a drop-down of available custom components (listing only the ones not managed by Fleet, to avoid offering too many choices?).

ruflin commented 1 year ago

https://github.com/elastic/elasticsearch/pull/92436 just got merged, which should create a foundation for this general feature.

felixbarny commented 1 year ago

I've created an Elasticsearch issue for this: https://github.com/elastic/elasticsearch/issues/97664. I'd like to propose closing this issue in favor of the Elasticsearch issue as I think this feature shouldn't be exclusive to Fleet. More on the reasoning about that in the issue.

joshdover commented 1 year ago

We still need to make changes in Fleet to use the new component template names if/when they get support in ES. Let's keep this one open.

joshdover commented 10 months ago

Assigning @strawgate while they work on prioritization and discussions with the Elasticsearch team on scoping out a solution.

mbudge commented 9 months ago

Can you put global@custom at the top so we can override @package?

"global@custom", "logs-nginx.access@package", "logs@custom", "global-foo@custom", "logs-nginx@custom", "logs-nginx.access@custom", ".fleet_globals-1", ".fleet_agent_id_verification-1"

We want to add the lowercase normaliser to case-sensitive keyword fields that users often search, because they miss results when they use the wrong case.

One of the main fields is host.name, which can be hostname0001 or HOSTNAME0001 or Hostname0001.

We want to apply the lowercase normaliser across all the logs data streams so users don't need to worry about this.

Elastic already has a steep learning curve so it will make the platform easier to use.

herrBez commented 6 months ago

Hi there,

Would it make sense to split the issue and distinguish the "namespace" case (which is more difficult to support) from the global@custom, type@custom, type-dataset@custom case? It would also be more similar to what we do with ingest pipelines starting with release 8.12.

crocswithsocks commented 3 months ago

Very much looking forward to this feature being added to fleet managed index templates. In my use case I need to enable the _size mapping on every index in the cluster. This is a painful task as it requires modifying every integration's @custom component templates to include this setting. Utilizing the global@custom component template would greatly decrease the amount of time spent modifying each integration's component templates!

bgebelek commented 3 months ago

I agree with @crocswithsocks. Having this feature be added will make modifying templates much easier.

mbudge commented 3 months ago

Please put logs@custom/global@custom above the integration @package so we can add the lowercase normaliser to the following fields as a minimum

host.name, user.name, user.target.name

There are about 20-30 more ECS fields I want to add the lowercase normaliser to.

Our users are continuously missing security logs because it's impossible to find all the hits using KQL when the case is mixed. I know ESQL has a lowercase function, but most users still use KQL in different parts of the platform.

I've added the lowercase processor to the logs@custom ingest pipeline, but the best way to do this is to add the lowercase normaliser to the mappings.
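
For reference, a mapping-based version of this could be generated along the lines of the sketch below, which builds a component template body that sets Elasticsearch's built-in `lowercase` normalizer (spelled `normalizer` in the mapping parameter) on the three fields named above. The helper name is hypothetical, and since a later component template may replace an earlier field definition rather than merge with it, other parameters set by the package (such as `ignore_above`) may need to be repeated.

```python
def lowercase_normalizer_mappings(fields):
    """Build a component template body mapping each dotted field as a lowercased keyword."""
    properties = {}
    for field in fields:
        *parents, leaf = field.split(".")
        node = properties
        # Walk or create the nested object mappings for each parent segment.
        for parent in parents:
            node = node.setdefault(parent, {"properties": {}})["properties"]
        node[leaf] = {"type": "keyword", "normalizer": "lowercase"}
    return {"template": {"mappings": {"properties": properties}}}


body = lowercase_normalizer_mappings(["host.name", "user.name", "user.target.name"])
```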

nimarezainia commented 1 month ago

https://github.com/elastic/kibana/issues/190730

elastic / kibana

[Fleet] Add support for customizing integration data streams at more levels of granularity #149484
