Closed paulb-elastic closed 2 years ago
Looks like Fleet may be keen to allow configuring ILM policies per data stream, based on this comment. https://github.com/elastic/kibana/blob/main/x-pack/plugins/fleet/server/services/epm/packages/_install_package.ts#L124 I also seem to remember having this conversation a while back.
@jen-huang Has moving forward with ILM policies per data stream been discussed at all on the Fleet side?
Hi @dominiqueclarke, it is possible* today to set different policies for different datasets (e.g. browser.network, http) by making use of the component templates that Fleet already installs. It is not possible to set them at the namespace level (e.g. http-jenNamespace). We have a proposal of how to achieve namespace-level policies, which involves creating even more component templates, but that effort is currently deferred.
This issue has more details of what we have today that enables dataset-level customization vs what we would need to achieve namespace-level customization: https://github.com/elastic/kibana/issues/121118
*as soon as https://github.com/elastic/kibana/issues/121184 is fixed
As Jen mentioned, it is possible today to create different ILM policies tied to specific datasets within the integration package spec. Unfortunately, we are blocked on namespace-level customizations as mentioned above.
Draft POC: https://github.com/elastic/integrations/pull/2744
This draft creates a separate ILM policy for each data set: browser, browser.screenshot, browser.network, HTTP, ICMP, and TCP. We can move forward with defining a default policy for each data set once the requirements for that policy are defined by @drewpost. Our users could then customize these default assets if desired. https://github.com/elastic/observability-docs/issues/1578
Sample data stream with segmented ILM policy
@dominiqueclarke - We already have the retention period defined by the data type requirements. However, we didn't go into the depth of hot/cold storage tiers, as this is an option that the implementation gave us. Is that storage tier definition all you need (alongside the retention periods) to define OOTB settings?
@drewpost Sorry for the delay. That is correct.
cc: @drewpost @paulb-elastic @andrewvc
Segmenting by data set is possible today in the Integration Package spec. Defaults for each data set can be specified, resulting in the creation of new ILM policies for each data set and component templates for each data set pointing to the specified ILM policies.
Segmenting by namespace is currently in the investigation and definition phase for Fleet, with work expected to begin in a future release. Once implemented, Fleet will generate an additional component template, <type>-<dataset>-<namespace>@custom, to allow user-defined customization per namespace. This feature will build upon the existing feature set allowing for segmenting by data set. More information: https://github.com/elastic/kibana/issues/121118
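As a rough illustration of the naming described above, the sketch below builds the two kinds of @custom component template names. The function names are hypothetical and not part of Fleet. The namespace-level pattern is the one quoted from the proposal; the dataset-level pattern is an assumption that it follows the same convention without the namespace suffix.

```python
def dataset_custom_template(data_type: str, dataset: str) -> str:
    """Assumed dataset-level name: <type>-<dataset>@custom."""
    return f"{data_type}-{dataset}@custom"


def namespace_custom_template(data_type: str, dataset: str, namespace: str) -> str:
    """Proposed namespace-level name: <type>-<dataset>-<namespace>@custom."""
    return f"{data_type}-{dataset}-{namespace}@custom"


# Example with the Synthetics browser.network data set.
print(dataset_custom_template("synthetics", "browser.network"))
print(namespace_custom_template("synthetics", "browser.network", "my_namespace"))
```

The namespace-level name only adds a suffix to the dataset-level one, which is consistent with the statement that the namespace work builds on the existing component template hierarchy.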
Defaults per data set can be specified in the Elastic Synthetics Integration package as early as 8.2.0. Establishing defaults per data set will not conflict with the enhancements coming down the line, as that work will build upon the existing component template hierarchy used to generate index templates. @drewpost to provide the desired defaults for each data stream and data set (HTTP, ICMP, TCP, browser, browser.network, and browser.screenshot). @paulb-elastic to decide when to prioritize this work and whether we can move forward with including defaults as early as 8.2.0.
Synthetics will require the ability to generate namespace-specific component templates and index templates on the fly. Uptime's UI Monitor Management and the Synthetics Service leverage the Fleet-based data stream architecture but save monitors as saved objects instead of Fleet integration policies. Because monitors are not stored as Fleet integration policies, Fleet will not be notified by default when a user creates a new monitor with a non-default namespace.
To allow Uptime to use the namespace segmentation feature, Fleet should expose a method on its plugin contract to generate component and index templates for a given package and namespace. The use case for Synthetics is defined here: https://github.com/elastic/kibana/issues/121118#issuecomment-1066845288
Once exposed, Uptime will need to ensure that proper component and index templates are installed when a new monitor is saved. If the namespace of the monitor is anything but default, Synthetics will invoke the Fleet service to generate the corresponding component and index templates.
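The save-time check described above could look roughly like the following sketch. Everything here is hypothetical: Fleet does not expose such a method today, so generate_templates is a stand-in callback for whatever the plugin contract would eventually provide, and the package name "synthetics" is an assumption.

```python
from typing import Callable

DEFAULT_NAMESPACE = "default"


def ensure_templates_for_monitor(
    namespace: str,
    generate_templates: Callable[[str, str], None],
    package_name: str = "synthetics",  # assumed package name
) -> bool:
    """Ask Fleet for namespace-specific templates unless the namespace is the default.

    Returns True if the (hypothetical) Fleet callback was invoked.
    """
    if namespace == DEFAULT_NAMESPACE:
        # The default namespace is already covered by the installed package assets.
        return False
    generate_templates(package_name, namespace)
    return True


# Stand-in for the Fleet service: record the calls instead of creating templates.
calls = []
ensure_templates_for_monitor("default", lambda pkg, ns: calls.append((pkg, ns)))
ensure_templates_for_monitor("my_namespace", lambda pkg, ns: calls.append((pkg, ns)))
print(calls)
```

Passing the Fleet call in as a callback keeps the monitor-save logic testable without a running Fleet plugin.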
@dominiqueclarke thanks for digging up all those answers. It seems to me that we can create a new issue to encapsulate our ultimate plans to create a lifecycle policy for namespaces, and between that issue and https://github.com/elastic/uptime/issues/462 we can close this one out.
Does that sound right?
@andrewvc Yep, @paulb-elastic actually already created an issue off the back of this spike https://github.com/elastic/uptime/issues/462
Thank you @dominiqueclarke for finding out how to proceed. Closing this as discussed ^^.
All monitors added in Monitor Management use data streams to write results back to ES. There is a separate data stream for each monitor type (ICMP, HTTP or TCP), with browser monitors being further split by the type of data we store (network, screenshot etc.).
In addition, the namespace that's been defined when setting up the monitor (which will be default by default) is appended to the name of the data stream.
This can be visualised, for example, with this set of monitors:
All these have been left on the default namespace of default, except for the Test Browser in my_namespace monitor, which has been given a namespace of my_namespace:
In Index Management, we can see all the data streams that we use for all of these monitors (all begin with synthetics-...):
As you can see, there is one for each type of monitor, within each namespace, and browser monitors are further split into ...browser..., ...browser.network... and ...browser.screenshot....
However, all of these separate data streams use the same synthetics Index Lifecycle Policy:
As a result, every type of monitor, and each category of the browser results, are subject to the same retention period:
This means users cannot configure retention periods granularly, based on the type of monitor or the type of data.
For example, a typical use case may be to keep browser result data for 13 months (to allow year-on-year comparison), network data for 3 months, and screenshots for 1 month. This allows the user to balance how much storage they are consuming for these results, based on the value of that data being available.
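To make the example above concrete, here is a minimal sketch of what three separate ILM policy bodies could look like. The policy names, the mapping of "browser result data", "network data" and "screenshots" to the browser, browser.network and browser.screenshot data sets, the 30-day month approximation (ILM durations have no month unit) and the rollover setting are all assumptions for illustration, not defaults agreed in this issue.

```python
import json

DAYS_PER_MONTH = 30  # approximation: ILM durations are expressed in days, not months

# Retention from the example use case, keyed by assumed data set name.
RETENTION_MONTHS = {
    "browser": 13,
    "browser.network": 3,
    "browser.screenshot": 1,
}


def retention_policy(months: int) -> dict:
    """Build an ILM policy body that deletes data after roughly `months` months."""
    return {
        "policy": {
            "phases": {
                # Assumed rollover setting; with rollover in the hot phase, the
                # delete phase's min_age is measured from the rollover time.
                "hot": {"actions": {"rollover": {"max_age": "30d"}}},
                "delete": {
                    "min_age": f"{months * DAYS_PER_MONTH}d",
                    "actions": {"delete": {}},
                },
            }
        }
    }


# Hypothetical policy names, one per data set.
policies = {
    f"synthetics-{dataset}": retention_policy(months)
    for dataset, months in RETENTION_MONTHS.items()
}
print(json.dumps(policies["synthetics-browser.screenshot"], indent=2))
```

Each body has the shape accepted by the Elasticsearch put lifecycle policy API (PUT _ilm/policy/<name>), so users could adjust any one of them without affecting the others.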
Spike Expectations
This spike is to investigate if we can automatically configure a separate Index Lifecycle Policy for each of the data streams. It’s reasonable to imagine a 1:1 set up between each data stream to a separate Index Lifecycle Policy, even if they all begin with the same, default configuration. This then allows users to further configure these based on their needs and to control the amount of storage being consumed.
One consideration is that the data stream does not exist until a monitor is created in a given namespace. So, in the above example, there is no data stream called synthetics-browser.network-my_namespace until a browser monitor is created in Monitor Management and saved in the my_namespace namespace. The first result will begin writing to the new synthetics-browser.network-my_namespace data stream, which will be subject to the existing synthetics Index Lifecycle Policy.
This spike needs to look into how we would be able to create these Index Lifecycle Policies on demand, and if there are any other implications of this.
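The data streams that come into existence for a given monitor can be sketched as follows. The synthetics-<dataset>-<namespace> pattern is taken from the names quoted above; the mapping of monitor types to data sets and the function name are assumptions for illustration.

```python
# Assumed mapping of monitor type to the data sets it writes to.
DATASETS_BY_MONITOR_TYPE = {
    "http": ["http"],
    "icmp": ["icmp"],
    "tcp": ["tcp"],
    "browser": ["browser", "browser.network", "browser.screenshot"],
}


def data_streams_for(monitor_type: str, namespace: str = "default") -> list:
    """Return the data stream names a monitor of this type would write to."""
    return [
        f"synthetics-{dataset}-{namespace}"
        for dataset in DATASETS_BY_MONITOR_TYPE[monitor_type]
    ]


# Saving the first browser monitor in my_namespace implies three new data streams.
print(data_streams_for("browser", "my_namespace"))
```

A 1:1 setup would need one Index Lifecycle Policy per name returned here, created no later than the first write to the corresponding data stream.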
You could imagine users making use of the namespace setting to further configure different data streams (and, by extension, the Index Lifecycle Policies) for monitors that should have different retention periods based on their business value, or a namespace (and associated less valuable monitor results) used to move data through warm/cold/frozen/delete phases quicker.