Open mikecote opened 4 years ago
Pinging @elastic/kibana-stack-services (Team:Stack Services)
Relates to "Make rules sortable, filterable, and aggregatable" in #50222.
This would be good. I've been discussing with users how to report on their rules, e.g. in a dashboard, canvas, etc., which would need to aggregate based on alert parameters. You can just about do something with scripted fields at the moment, but it's not ideal.
Also see this issue: Request for alerting internal tags structure. The gist is to add a new "tags" property to alerts that would be separate from the existing `tags` property, which is used by customers. Let's call this new property `internalTags`. It's meant to be used by alert implementors to add their own tags, for whatever purpose they want, without conflicting with the tags customers use or having the customers see them in the UI.
Would having this be good enough to solve the requirements? I think it depends on how/what you want to search on.
I don't really want to go down the path of adding mappings for alertType-specific data, seems like the migration problem would be ... messy. And not sure what the other options are.
For some other efforts outside of alerting, where we use both external and internal tags within a saved object structure, we decided on a leading underscore to designate that the internal tags should remain internal: `tags` for the external tags and `_tags` for the internal ones. fwiw, that Saved Object has nothing to do with alerting, but I figured I should mention it.
I've only just begun looking into this, but I want to summarise what we currently know, as it doesn't look like this will be easy to support and I want to make sure we have a good understanding of the context.
Before we talk about the need, let's just describe what we currently have. Backing each alert we create a Saved Object with the following shape:
```ts
interface RawAlert extends SavedObjectAttributes {
  enabled: boolean;
  name: string;
  tags: string[];
  alertTypeId: string;
  consumer: string;
  schedule: SavedObjectAttributes;
  actions: RawAlertAction[];
  params: SavedObjectAttributes;
  scheduledTaskId?: string;
  createdBy: string | null;
  updatedBy: string | null;
  createdAt: string;
  apiKey: string | null;
  apiKeyOwner: string | null;
  throttle: string | null;
  muteAll: boolean;
  mutedInstanceIds: string[];
}
```
The fields we want to make searchable as part of this issue are `params` and `actions`, which both implement the `SavedObjectAttributes` type, meaning they are string-based key-value records which can be deeply nested.
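As context, `SavedObjectAttributes` is roughly a recursive string-keyed record; here is a sketch (not the exact Kibana definition, which differs in details):

```typescript
// Sketch only: a recursive key-value record along the lines of
// SavedObjectAttributes; values can nest arbitrarily deep.
type Attribute =
  | string
  | number
  | boolean
  | null
  | Attribute[]
  | { [key: string]: Attribute };

type Attributes = { [key: string]: Attribute };

// A hypothetical params payload for an alert type:
const params: Attributes = {
  index: 'foo',
  threshold: { value: 5, comparator: '>' },
};
```

Because the value shape is open-ended like this, no single static mapping can cover every alert type's `params`.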
On the face of it, querying by these internal objects is straightforward, but as this Saved Object needs to support all types of alerts and actions, we have a challenge: each alert type can have a different shape for these fields. To support these multiple shapes we tell Elasticsearch not to create a mapping for these objects, which means that querying by their shape isn't actually possible.
That said, we'd like to be able to query for specific alerts based on values that are stored in these fields.
So, the reason we can't currently support querying against these fields is clear, but there are a few approaches we could take to make these requirements possible. None of them are straightforward, so some discussion is needed to understand the cost-value ratio.
One assumption that I'm making for all of these is that we want to rely on ES for this and not do any of these operations in memory as that would make it inefficient and hard to support pagination.
The first approach is to create a Saved Object type for each AlertType: whenever a new AlertType is created, we generate a brand new type of SavedObject with its own mapping.
There are a few challenges with this approach:
There's also a clear limitation to this approach:
This will still not allow us to query based on the `actions` field, as these would still have different shapes across the different SavedObject types and would have to remain with a disabled mapping.
@mikecote has already told me that there is a danger here of a mapping explosion which I need to investigate further.
Instead of splitting the SavedObject types between the AlertTypes, we can take an approach similar to what SavedObjects itself does. While we would have one single SavedObject type of `alert` (as we do now), we can store each alert type's `params` and each action type's `config` under a corresponding key based on their unique ID, and keep the mapping set to `dynamic` so that we can query against the internal shapes.
For example, this would mean that given an alert of type "example.always-firing" with an action of type ".index" you would store the data like so:
```json
{
  "alert" : {
    "consumer" : "alerting",
    "alertTypeId" : "example.always-firing",
    "params" : {
      "example.always-firing" : {
        "numOfInstances" : 5
      }
    },
    "actions" : [
      {
        "actionTypeId" : ".index",
        "params" : {
          ".index" : {
            "documents" : [
              {
                "val" : "{{alertInstanceId}}"
              }
            ]
          }
        },
        "actionRef" : "action_0",
        "group" : "default"
      }
    ]
    // ... other fields
  },
  "type" : "alert",
  "references" : [
    {
      "name" : "action_0",
      "id" : "9aab14cd-87e1-43ca-98f3-1caa9723ce98",
      "type" : "action"
    }
  ],
  "updated_at" : "2020-05-11T14:48:15.017Z"
}
```
Instead of what we currently do which is this:
```json
{
  "alert" : {
    "consumer" : "alerting",
    "alertTypeId" : "example.always-firing",
    "params" : {
      "numOfInstances" : 5
    },
    "actions" : [
      {
        "actionTypeId" : ".index",
        "params" : {
          "documents" : [
            {
              "val" : "{{alertInstanceId}}"
            }
          ]
        }
        // ... other fields
      }
    ]
    // ... other fields
  },
  "type" : "alert",
  "references" : [
    {
      "name" : "action_0",
      "id" : "9aab14cd-87e1-43ca-98f3-1caa9723ce98",
      "type" : "action"
    }
  ],
  "updated_at" : "2020-05-11T14:48:15.017Z"
}
```
You may note the addition of the "example.always-firing" and ".index" keys under the `params` fields, which would allow us to enable Dynamic Mapping on these fields as their shapes will no longer clash across types.
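A hedged sketch of what the resulting mapping fragment for `params` might look like under this approach (the keys come from the example above; the exact settings are an assumption, not a tested mapping):

```json
{
  "alert": {
    "properties": {
      "params": {
        "dynamic": true,
        "properties": {
          "example.always-firing": {
            "dynamic": true,
            "properties": {}
          }
        }
      }
    }
  }
}
```

Each alert type's key gets its own dynamically mapped sub-object, so fields from one type never collide with fields of the same name in another.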
But this too has challenges: alert types may have fields under `params` that they do not want to create mappings for, as these might change often (such as the `documents` field in the `.index` action, which will be different on every single instance!).
Another option is to standardise the shape of params across all AlertTypes, such that each AlertType will specify the exact shape and types of their params, and we'll merge these shapes together into one static shape which will be used to define the mappings of these `params`.
This is the simplest solution in terms of the mapping, but introduces a whole set of challenges in the framework:
All of the above options will require changes in the SavedObjectsClient, as you can only sort/filter by a root field at the moment; supporting "deep" fields inside of objects isn't currently possible and doing so will require some further research.
As you can see, none of these options are straightforward and there isn't a clear winner.
This all requires a lot more investigation. Playing around with the code locally, I think that the second option (Enable dynamic mapping + create a deep object) produces the most maintainable result, but it has potential issues that still need investigation, which is what I'll likely be looking into next.
If anyone has thoughts or concerns on these options (or perhaps a 4th option we can investigate) I'm all ears. :)
I'm worried about any solution that introduces new mappings for alertType-specific data, due to all the challenges pointed out ^^^.
It's not completely clear that we need an ES solution here; from the SIEM issue https://github.com/elastic/kibana/issues/50222:
Either that or a plain API (even if slower like a table scan) to abstract us away so we can natively to the actions/alerting objects would make it to where we don't have write our own hand rolled solutions.
I read "table scan" as "do as much of a query as you can with ES, then do the remaining filtering/mapping/aggs on the results in JS". That will work if the number of alerts is "reasonable", which I don't know if it is.
I think we should also find out if having an "internal tags" via https://github.com/elastic/kibana/issues/58417 would be good enough for now. This would presumably be a parallel of the current tags structure, but not editable (or probably viewable) in the UI, only programmatically. Not nearly the same as ES fields, but may be useful enough for common needs.
There was also mention of scripted fields in the discussion above, but I'm not sure how we might use them.
@FrankHassanabad Would being able to query/filter/sort by a set of internal tags be enough for you? It looks like making the params/config actually searchable would be a significant piece of work that we'd need to be cautious about before picking up.
FYI, scripted fields seems to work and maybe at least better than doing it in js.
Matthew Adams
FYI, scripted fields seems to work and maybe at least better than doing it in js.
Sorry but could you be more specific? Given the limitations we have on making these fields searchable, how would you go about using scripted fields to achieve what SIEM are looking for?
table scan
I lifted that term from relational DBs, where the definition is:
"(also known as a sequential scan) is a scan made on a database where each row of the table is read in a sequential (serial) order and the columns encountered are checked for the validity of a condition. Full table scans are usually the slowest method of scanning a table due to the heavy amount of I/O reads required from the disk which consists of multiple seeks as well as costly disk to memory transfers."
Ref: https://en.wikipedia.org/wiki/Full_table_scan
I don't know if there is an ES-adapted term, but basically any time I have to read all of the Saved Objects into memory, either through buffering or streaming, I count that as a table scan, and since it's done through network calls from ES -> Kibana it adds to the expense.
"do as much of a query as you can with ES, then do the remaining filtering/mapping/aggs on the results in JS"
Yeah, that's our ultimate goal. We are hoping to avoid any operation that causes us to iterate over all the alerts in memory from ES -> Kibana.
Would being able to query/filter/sort by a set of internal tags be enough for you?
Can we get aggs as well with arrays in Elasticsearch? The use case is that sometimes we need to do unique counts of items and display them. If this PR is merged we might have it? https://github.com/elastic/kibana/pull/64002
If we were able to query/filter/sort/aggs I think that covers all of our use cases to avoiding table-like scans.
As an aside:
If you still are going to allow mapping I would suggest combining two parts of your approaches above.
This part:
While we would have one single SavedObject type of alert (as we do now) we can store each alert type's params and each action type's config under a corresponding key based on their unique ID and keep the mapping set to dynamic so that we can query against the internal shapes.
and this part:
Another option is to standardise the shape of params across all AlertTypes such that each AlertType will specify the exact shape and types of their params and we'll merge these shapes together into one static shape which will be used to define the mappings of these params.
So that it becomes this:
While we would have one single SavedObject type of alert (as we do now) we can store each alert type's params and each action type's config under a corresponding key based on their unique ID. Each AlertType will specify the exact shape and types of their params and we'll merge these shapes together into one static shape which will be used to define the mappings of these params.
Then you have exact mappings and avoid mapping explosions and conflicts and the Saved Objects migration system takes over when we update our mappings. Then we are responsible as a team for migrating our mappings altogether.
For what it's worth: on the community forums and community Slack, users are opening up their `.kibana` saved objects by granting permissions to the index and then creating dashboards over rules which include the params:
https://elasticstack.slack.com/archives/CNRTGB9A4/p1589221644238900
I know that might not be exactly what we want this quickly, but it is something we have to keep in mind: users are granting each other privileges to their saved objects index so they can write their own dashboards against static SIEM rules.
Since things are "beta" I think they would be ok with updates, but they might become frustrated if their dashboards are no longer possible. On the flip side, if this brings features they couldn't have before, such as sorting/filtering/querying/aggs, they will be delighted even if we have to advise them on how to update their existing dashboards.
Opening up saved object indexes like this might be unsupported or discouraged, but I want to point it out as it's already happened.
FYI, scripted fields seems to work and maybe at least better than doing it in js.
Sorry but could you be more specific? Given the limitations we have on making these fields searchable, how would you go about using scripted fields to achieve what SIEM are looking for?
If, for example, you create a filtered alias to pick just the rules from `.kibana`, then add an index pattern in Kibana with this scripted field using Painless:

```painless
def mitre = [];
for (value in params['_source']['alert']['params']['threat']) {
  mitre.add(value['tactic']['id']);
}
return mitre;
```

You can then aggregate on the tactic ID for reporting.
Matthew
Are sorting and aggregations absolutely necessary? If not, https://github.com/elastic/elasticsearch/issues/33003 is a potential solution worth evaluating.
New possible solution mentioned in option 4 of #67290. That issue explores options to solve Elasticsearch merging objects on update when the mapping has enabled: false.
This may not be a good approach but worth mentioning. Probably doesn't solve sorting or aggregations which would make https://github.com/elastic/elasticsearch/issues/33003 more a solution worth evaluating as @kobelb mentioned.
Change object structure to array
I noticed the alert's `actions[x].params` attribute doesn't have this problem. There is a possibility that storing alert params and action configs in an array structure would solve this problem and possibly also make them searchable. The params value structure could be something like the following:

```json
[
  { "name": "index", "value": "foo" },
  { "name": "timeField", "value": "@timestamp" }
]
```
This would allow a consistent mapping of `name` and `value`, where `value` can be `enabled: false` but won't be impacted by this issue (due to being within an array). This would also require a saved object migration.
The step further that would be required to make the values searchable would be to split the values into different mapped fields. This would require orchestration between the field name and the value field.
To query this data, it would look something like this:
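The original query snippet isn't shown here; as a hedged sketch, assuming the array is mapped as a `nested` field under `alert.params`, finding alerts whose `index` param is `foo` might look like:

```json
{
  "query": {
    "nested": {
      "path": "alert.params",
      "query": {
        "bool": {
          "filter": [
            { "term": { "alert.params.name": "index" } },
            { "term": { "alert.params.value": "foo" } }
          ]
        }
      }
    }
  }
}
```

The `nested` query keeps each `{name, value}` pair matched as a unit, which is what prevents the cross-pair false matches a plain `object` mapping of the array would allow.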
Are sorting and aggregations absolutely necessary?
Yes. We have params such as `risk_score`, and a lot of users internal and external to our organization are asking why we cannot sort our tables or queries based on these important items.
This would be an awesome thing for uptime alerts as well; we would be able to determine which alerts are enabled or not.
we will be able to determine which alerts are enabled or not.
You can filter on that today, no? Using KQL
we will be able to determine which alerts are enabled or not.
You can filter on that today, no? Using KQL
@mikecote first we need to determine how we can find an alert via params. If params aren't searchable, we can't find an alert. What are the alternative ways to find an alert? Can we add a custom tag to an alert while creating it? Or maybe specify an alert name in the flyout, and don't let the user change it.
@mikecote @gmmorris any update on this? Any plans for 7.10 or 7.11 for this?
@shahzad31 we haven't agreed on a path forward for this yet so we don't have plans to work on this yet. We are planning in the meantime to add support for internal tags in 7.11 (https://github.com/elastic/kibana/issues/58417).
@mikecote we are beginning to get a lot of feedback from internal Elastic rule creators as well as external users that our 200+ prebuilt detection engine rules are painful to sort and find in the Rules page. As we are expecting to continue to ship more prebuilt rules in each release, I'd like to discuss the possibility of putting this issue on the 7.11 priority list of the Kibana Alerting team.
cc: @arisonl @peluja1012 @spong @MikePaquette
@dontcallmesherryli We put it in our 7.12 tentative plan for now. We will keep the implementation discussion going until then to have an idea what approach to take.
This came up again when I talked to the o11y folks this week. I keep hoping that our "system tags" issue may end up solving this. @sqren noted that this would then end up being painful, trying to keep params AND tags up-to-date; it's at least more boilerplate-y code that alerts clients would have to manage.
One thought on that would be to provide some kind of "helper" function instead of code, which given a params object would return a set of "field tags" or such. Here's an example for index threshold alert type, imagining you might want to eventually be able to search on the index used in the alert:
```ts
function fieldTags(params: IndexThresholdParams) {
  return {
    index: params.index,
  };
}
```
We would call this function (defined in the alert type, but it's optional of course), and then add these to the "system" tags with the resulting data.
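To make the call pattern concrete, here's a hedged sketch of how the framework might consume such an optional helper (the `AlertTypeSketch` interface and `systemTagsFor` function are hypothetical names, not the actual framework API):

```typescript
// Hypothetical framework-side merge: call the alert type's optional
// fieldTags() helper and fold the result into "system" tags as
// "key:value" strings.
interface AlertTypeSketch<P> {
  fieldTags?: (params: P) => Record<string, string>;
}

function systemTagsFor<P>(alertType: AlertTypeSketch<P>, params: P): string[] {
  const fields = alertType.fieldTags?.(params) ?? {};
  return Object.entries(fields).map(([key, value]) => `${key}:${value}`);
}
```

For the index threshold example above, `systemTagsFor({ fieldTags: (p) => ({ index: p.index }) }, { index: 'foo' })` would yield `['index:foo']`, and an alert type without a helper simply contributes no field tags.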
Another issue with this: my thinking on the "system tags" is that they certainly would not be updatable via HTTP, only via the in-process alerts client, and it's possible we may not want these readable either. But I think we'd want these "field tags" to be readable/searchable via HTTP, so we can make use of them in Kibana.
cc @arisonl
We should do an investigation to see if https://www.elastic.co/guide/en/elasticsearch/reference/7.x/flattened.html can be used as a solution. Adding the research label.
Another option worth investigating is runtime fields https://www.elastic.co/guide/en/elasticsearch/reference/master/runtime.html.
We should do an investigation to see if https://www.elastic.co/guide/en/elasticsearch/reference/7.x/flattened.html can be used as a solution. Adding the research label.
I did a small spike on this (see: https://github.com/elastic/kibana/compare/master...mikecote:alerting/searchable-params-poc) and it seems using the flattened type could solve most of the problems with a bit of work on the validation code. The first limitation I see is, by design, treating every value as a string which brings limitations to comparators (and sorting) on numeric values.
In the spike, I can call the alerting APIs with the following:
```sh
# KQL filter for a value
curl -XGET "https://elastic:changeme@localhost:5601/rgp/api/alerts/_find?filter=alert.attributes.params.timeWindowSize:100" -k

# Sort, where it uses string comparators (meaning "99" > "100")
curl -XGET "https://elastic:changeme@localhost:5601/rgp/api/alerts/_find?sort_field=params.timeWindowSize&sort_order=asc" -k
```
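The string-comparator caveat can be reproduced in isolation; this small snippet (not from the spike) contrasts lexicographic and numeric ordering:

```typescript
// Keyword-style (lexicographic) ordering compares numbers as strings,
// which is why "99" sorts after "100" under the flattened approach.
const lexicographic = ['99', '100', '5'].sort();

// A real numeric mapping would instead compare values as numbers.
const numeric = [99, 100, 5].sort((a, b) => a - b);
```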
Nice work @mikecote , we should make a clear list of what flattened can support and what it can't, then validate with some teams (SIEM, O11y...) that they can live without the things it can't support.
Agree, this is great!
The first limitation I see is, by design, treating every value as a string which brings limitations to comparators (and sorting) on numeric values.
Having a simple string-based key-value pair lookup mechanism will solve all of our current use cases and 99% of our future use cases. So that limitation is definitely tolerable.
we should make a clear list of what flattened can support and what it can't, then validate with some teams (SIEM, O11y...) that they can live without the things it can't support.
The two teams I recall requesting this are the APM and Security Detections teams.
Looks like @sqren is happy from the APM side and I'm curious what @spong on the detections team thinks of the limitations?
The limitations are documented here: https://www.elastic.co/guide/en/elasticsearch/reference/master/flattened.html#supported-operations.
Supported operations
Because of the similarities in the way values are indexed, flattened fields share much of the same mapping and search functionality as keyword fields.
Currently, flattened object fields can be used with the following query types:
- term, terms, and terms_set
- prefix
- range
- match and multi_match
- query_string and simple_query_string
- exists
When querying, it is not possible to refer to field keys using wildcards, as in { "term": {"labels.time*": 1541457010}}. Note that all queries, including range, treat the values as string keywords. Highlighting is not supported on flattened fields.
It is possible to sort on a flattened object field, as well as perform simple keyword-style aggregations such as terms. As with queries, there is no special support for numerics; all values in the JSON object are treated as keywords. When sorting, this implies that values are compared lexicographically.
Flattened object fields currently cannot be stored. It is not possible to specify the store parameter in the mapping.
@spong @sqren as I'm working on this, one thing worth mentioning is that the params would be treated like string `keyword`s. This means that partial searches, etc. won't work using the flattened approach. Is this still an acceptable solution for Security and Observability? Based on @sqren's comment, it seems like string-based key-value lookup will work for Observability.
I'll be working on a PR here but wanted to double check before going further.
I'm renaming the issue from "Make alert params and action config searchable" to "Make alert params searchable" since the actions plugin doesn't have a find API (only a get-all API) at this time.
one thing worth mentioning is the params would be treated like string keyword
sgtm 👍
@spong @sqren as I'm working on this, one thing worth mentioning is the params would be treated like string keyword. This means that doing partial searches, etc won't work using the flattened approach. Is this still an acceptable solution for Security and Observability?
So I think this is going to be a little problematic on the Security side with the fields we're looking to expose via search within Rules Management. For instance, one of the most requested searchable fields is MITRE ATT&CK Threats/Techniques, so for the below rule, users would expect to be able to search just for `T1555` or `Password Stores`, for example, and get back any rule with the below technique:
Here's our alertParams, of which I think the following will be desired to be searched against, and which would be problematic without partial matching (or case insensitivity): for example, `netcat process` or `data exfiltration`.
I see runtime fields referenced in a comment above, but I'm guessing we haven't looked at them in a lot of detail. And I imagine it gets complicated pretty quickly: we'd actually need mapping definitions for the fields, which we don't have today; we'd somehow have to augment the KQL-generated query to add the runtime bits; etc.
Hey @mikecote and alerting team! Thank you for taking over this problem.
I recently started looking at what we (Security Solution, Detections folks) can do to improve UX of the detection rules management table.
Sorry for somewhat delayed comment from our side, it took a while to collect some requirements and check the proposed solution using the flattened field type. Here's our findings and observations.
We now have 461 built-in rules in the solution, we expect this number to grow probably by a few hundred per each release, plus there are custom rules, so we'd like to provide our users the most flexible filtering, searching and sorting experience, so they could quickly find and enable/disable/bulk edit only relevant rules.
We'd like to be able to build the following improvements for the rules management table:
Simple dedicated filters for the most important fields, for example:
Simple sorting for columns in the rules management table - Ticket 1, Ticket 2
Full text search, ranked by `_score`
Filters and search over hierarchical structures like MITRE ATT&CK Threats/Techniques - Comment
Filtering with free-form KQL query - Ticket
Bulk edits for all rules, or all the rules matching the current filter
We're planning to start small and build simple filters first (severity etc), but eventually I think we need all of those features since it would improve UX for our users quite substantially.
I created a new detection rule and specified some of the typical fields. Just to be on the same page, let's check the data that has been indexed into `.kibana`, and the mappings for this data.
So the proposed implementation using the flattened field type boils down to the following, right? Please correct me if I misunderstood this part.
change this:
```json
{
  "mappings": {
    "dynamic" : "strict",
    "properties": {
      "alert" : {
        "properties" : {
          "params" : {
            "type" : "object",
            "enabled" : false
          }
        }
      }
    }
  }
}
```
to this:
```json
{
  "mappings": {
    "dynamic" : "strict",
    "properties": {
      "alert" : {
        "properties" : {
          "params" : {
            "type" : "flattened"
          }
        }
      }
    }
  }
}
```
I've checked various scenarios relevant to our needs and compared the behaviour of the flattened field with normal fields with specified mappings. Here are my findings.
TL;DR: I don't think flattened field type would provide us the ability to build the UX we're planning to build.
Although these features are supported:
See my tests below.
Here I'm not using the flattened field, just outlining some real-world-like rule data and real-world-like queries we'd like to run against it as an example.
```console
# Don't forget to
DELETE delme-test-rule-filtering-*
```
Are there other feasible solutions for filtering/sorting/searching?
Previously, we were considering loading all the rules in the browser app, and implementing custom filtering, searching and sorting in JS in memory. Not an ideal solution to say the least.
Would it be feasible to store rule parameters in a separate, dedicated index with mappings specific to detection rules?
We already have some objects related to detection rules that are stored in separate indices:
```console
# Detection rule statuses
GET .kibana/_search
{
  "query": {
    "bool": {
      "filter": [
        { "match": { "type": "siem-detection-engine-rule-status" } }
      ]
    }
  }
}

# Detection rule actions
GET .kibana/_search
{
  "query": {
    "bool": {
      "filter": [
        { "match": { "type": "siem-detection-engine-rule-actions" } }
      ]
    }
  }
}
```
@banderror, thank you for the very detailed analysis 💯
I've done a few POCs with the requirements of searching, sorting and filtering on any type of field (string, number, etc.) and below are my findings.
`flattened` type 👎
We cannot use the `flattened` type because it analyzes the data as `keyword`, which doesn't support searching or sorting on numeric values. I've removed this as an option.
We cannot use runtime fields because they only support the following data types: `boolean`, `date`, `double`, `geo_point`, `ip`, `keyword`, and `long`. They don't support `text`, which prevents searching. Runtime fields also don't work if `search.allow_expensive_queries` is disabled. I've removed this as an option.
I'm going to explore creating a deep object with static mapping, similar to @gmmorris' "Enable dynamic mapping + create a deep object" option here. If a prototype works out well, alert types could opt in to using a static mapping for some / all of their parameters.
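A hedged sketch of what such an opt-in declaration could look like (the interface and the index-threshold field names are illustrative assumptions, not the actual framework API):

```typescript
// Hypothetical shape for an alert type declaring static mappings for
// the params it wants searchable; unlisted params would stay under the
// unmapped (enabled: false) object.
interface ParamsMappingSketch {
  [field: string]: { type: 'keyword' | 'long' | 'text' | 'date' };
}

const indexThresholdParams: ParamsMappingSketch = {
  index: { type: 'keyword' },
  threshold: { type: 'long' },
};
```

The framework could merge each opted-in alert type's declaration into the `.kibana` mapping under its own sub-property, keyed by alert type id.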
A couple of other ideas:
re-post (again, sorry) of "internal tags" - https://github.com/elastic/kibana/issues/58417
The idea of this would be that we add a new field `internal_tags`, which would be a nested array of `{key: string, text: string, value: number}`. We'd do the usual keyword/text thing with the `text` field (to support both keyword and text searches); to handle numeric values, there's also a `value` field.
This won't be useful in a lot of places in Kibana, since nested fields aren't supported everywhere (e.g., various visualizations), but we'd generally just be using it internally, as part of the search that we end up building. I believe KQL does not support nested fields either, so that may be a big downside. Or a complication, if we have to hack something into the existing SO find APIs.
Solutions using alerting could then add whatever internal tags they want. And presumably make use of search capabilities in the alertsClient, that would support this. But we likely wouldn't want to expose these tags to end users, at least for search (just because we'd have to document what values were being stored there), and certainly not to update. We could support adding these on create and update from the alertsClient though.
create a parallel searchable version of the params
This is kinda similar to the "internal tags" idea, and would use the same sort of "nested" field of key/text/number. But in this case, we "flatten" all the params into single strings (`{a: {b: 1}}` becomes `a.b: 1`) and then create a nested element for each key/value pair.
Again, we probably wouldn't end up exposing this field as something customers could use as a search target, at least for now; they probably can't anyway, due to the lack of KQL support for nested fields.
I've been working on a POC that creates a deep object with static mappings. I made the index threshold alert opt in to using static mappings instead of storing its parameters in the `enabled: false` mapping.
It seems to work well, with a couple of things worth noting so far:
- Using the alert type id as the sub-property name, it can't have `.` in it. We would have to convert `.` to something else (ex: `siem.signals` to `siem_signals`). We'd also need to add protection to prevent overlapping names.
- The extra mappings count toward the field limit of the `.kibana` index. I spoke with @rudolf, and it seems ok to bump the limit for now.
- Alert types that don't opt in keep storing their parameters under the `enabled: false` mapping.
@elastic/kibana-alerting-services there is a lot to go through in the above. I am thinking a synchronous demo/discussion may be necessary for this topic?
@pmuellr yeah, I think we'll have to keep those ideas on the table as well for sure.
This approach in Mike's POC ^^^ seems workable. Is making this an opt-in for alert types something we want to leave in long-term? Or will we make it mandatory at some point? I'm thinking it would be nice to not have two paths, at least long-term.
This approach in Mike's POC ^^^ seems workable. Is making this an opt-in for alert types something we want to leave in long-term? Or will we make it mandatory at some point. Thinking it would be nice to not have two paths, at least long-term.
I think if we go opt-in for now, making it mandatory long-term would be great. Similar to validation as it is opt-in right now. Maybe 8.0 if we go this path?
I've also worked on a POC that splits the `alert` saved object type by alert type. This is the same as the "Create a Saved Object type for each AlertType" approach that @gmmorris mentioned. This approach allows having full parameter mappings for each alert type.
Here are the good things I discovered during the POC:
- Filtering on mapped params works, e.g. `rule_type_index_threshold.attributes.params.index:foo`.
- We can use `searchFields` and `search` from the find API to do the searching.
- We're planning to rename `alert` to `rule` already, so splitting by alert type could be done simultaneously.
Here are the bad things I discovered during the POC:
- The APIs would need to receive the `alertTypeId` in every API call. Knowing some new APIs are coming (rename alert -> rule), we have the opportunity to require this `alertTypeId` to be part of the URL as well.
I want to explore creating a separate saved object for the params next and see how that goes. There are still other options outside of these, but it is good to know each one's pros/cons to make sure we're making the right investment for the long term.
> I want to explore creating a separate saved object for the params next and see how that goes.
I took a quick look at this approach. It would rely on using the `join` field type and `has_child` queries. Since the requirements are to use the same field across both saved object types, it doesn't seem like a good approach, since saved object attributes are stored in different fields in Elasticsearch by design. There may be alternatives within this approach, but they don't seem worth investigating at this time.
After looking at all the options that are on the table, I want to propose going with the "deep object with static mappings" approach mentioned here.
This approach checks the boxes for filtering, searching, and sorting by doing something like the following on my POC:
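For illustration (the field names below are hypothetical, not taken from the POC), filtering and sorting with this approach would use KQL paths into the statically mapped params, roughly of this shape:

```
# Filter on a statically mapped param via the find API's KQL filter:
filter=alert.attributes.params.index_threshold.aggType: count

# Sort on a statically mapped param:
sortField=params.index_threshold.threshold
```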
I have raised a few down points, but most of them are not an issue if I think long-term. Below is how I see each being resolved:
- This approach would require pre-save / post-read logic in the alerts client to nest/unnest the parameters.
This would be a good time to split the data access logic out of the alerts client and create a separate file. This is where the pre-save and post-read logic can live, and it would also keep the alerts client smaller as more code moves over.
- Using the alert type id as the sub-property name, it can't have `.` in it. We would have to convert `.` to something else (ex: `siem.signals` to `siem_signals`). Also, add protection to prevent overlapping names.
However we decide to map these parameters, we will have to distinguish them by alert type id. Hence, I think long term, we may want to prevent using `.` in the ids, but in the meantime, come up with a common function that can convert an alertTypeId to something consumable by Elasticsearch.
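A minimal sketch of such a conversion function, with the overlap protection mentioned above (helper names are mine, not from the Kibana codebase):

```typescript
// Hypothetical helper: convert an alertTypeId into a name that is safe to use
// as an Elasticsearch mapping sub-property. Elasticsearch treats `.` as a path
// separator, so `siem.signals` would otherwise create a nested `siem` object.
function toEsSafeId(alertTypeId: string): string {
  return alertTypeId.replace(/\./g, '_');
}

// Guard against two alert type ids collapsing to the same mapping name,
// e.g. `siem.signals` and `siem_signals`.
function assertNoOverlap(alertTypeIds: string[]): void {
  const seen = new Map<string, string>();
  for (const id of alertTypeIds) {
    const safe = toEsSafeId(id);
    const existing = seen.get(safe);
    if (existing !== undefined && existing !== id) {
      throw new Error(`Alert type ids "${existing}" and "${id}" both map to "${safe}"`);
    }
    seen.set(safe, id);
  }
}
```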
- We may have to bump up the field mappings limit in the `.kibana` index. I spoke with @rudolf, and it seems ok to bump the limit for now.
This approach is not as bad as the saved object type per alert type, as only the parameters are added to the mappings, and only for the alert types that opt in for now. In the future, and maybe as we rename the `alert` saved object type to `rule`, we can move them into their own `.kibana_rules` index.
- A migration would be required to opt-in an alert type to have their parameters searchable.
This problem is short-term until we make all the alert types require mappings for their parameters. I wouldn't make a design decision based on this. If we allow alert types to define migrations (https://github.com/elastic/kibana/issues/50216), a simple no-op migration would move their params from the default `enabled: false` spot to their mapped version.
- This could create migration errors at the mappings level. Alert types would have to make sure their mappings work for what they allow storing in parameters.
This problem will exist for any approach that requires mapping. There is a high likelihood that errors here will happen, and we should work with the core team to see what we can do to ensure Kibana can still start when this migration fails. Alert types that use `config-schema` are already one step ahead at making sure the data matches the mappings, and we can make validation mandatory for all alert types first.
- Alert types that don't opt in for custom mappings get their parameters stored in a generic field mapped `enabled: false`.
This problem is short-term as well. I think long term, we should force all alert types to define their mappings. Here as well, I wouldn't make a design decision based on this.
- Searching, sorting and filtering these parameters would require API users to understand the underlying structure in which these parameters are stored. Changing this down the line could break their requests. There may be a way to impose some limitations here and handle the underlying structure ourselves.
Here we could do something like the saved objects API does, where it requires `attributes` in the filter but then replaces it with the saved object type. We could do something similar for `params.value` instead of `params.{alertTypeId}.value`, though this may prevent filtering across different alert types unless we also expand such filters into a series of `OR` statements.
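A minimal sketch of that expansion, under the assumption that users filter on `params.{field}` and the framework knows the registered alert type ids (function and field names are hypothetical):

```typescript
// Hypothetical sketch: rewrite a user-facing filter on `params.{field}` into a
// KQL OR across the per-alert-type sub-properties, so API users never need to
// know the `params.{alertTypeId}.{field}` storage layout.
function expandParamsFilter(
  field: string, // e.g. "params.aggType"
  value: string,
  alertTypeIds: string[]
): string {
  // Strip the user-facing "params." prefix to get the param field name.
  const rest = field.replace(/^params\./, '');
  return alertTypeIds
    .map((id) => `alert.attributes.params.${id.replace(/\./g, '_')}.${rest}: ${value}`)
    .join(' OR ');
}
```

The trade-off is query size: the expansion grows linearly with the number of alert types, which is another reason static mappings per type were under scrutiny here.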
- The saved objects service would need to support dynamic mappings (see: https://github.com/elastic/kibana/issues/88988). In my POC, I have an approach the core team is happy with.
I have something worked out with the core team that would solve this problem and seems in line with what other teams ask for. This approach will also work for us to support migrations per alert type.
I will set up a design meeting to ensure the @elastic/kibana-alerting-services stands by this proposal before starting implementation. After, I will circle back with the Security and Observability teams who requested this to confirm the approach solves what they're asking for.
I spoke with @romseygeek from the ES Search team and came to the same conclusion as my proposal: having a type associated with each param so we can build mappings is the best way forward for us. 👍
After a design discussion and further iterations on a POC, @ymao1 and I have something ready for approval. The underlying fundamentals have changed. If you're interested, feel free to read on. Otherwise, @spong @sqren, we'll be in touch soon to get your 👍 before making a PR to master.
@elastic/kibana-alerting-services it was hard to find time to do a follow-up session, so I'm opting for async to keep momentum on this issue. The details are below.
- The mappings define `params` with `dynamic: false` to allow Elasticsearch to store the remaining fields without mapping them.
- We add `ignore_malformed: true` to all param mappings to prevent write failures if there are mismatches from other alert types that don't use searchable params.

```
"params": {
  "dynamic": false,
  "type": "object",
  "properties": {
    // From .index-threshold
    "aggType": {
      "type": "keyword"
    },
    // From siem.rules
    "riskScore": {
      "type": "number",
      "ignore_malformed": true
    }
  }
},
```
The approach we have settled on is "Static mapping + flattened object" from @gmmorris's proposal here: https://github.com/elastic/kibana/issues/50213#issuecomment-626775254. We've done many iterations to make alert parameters searchable, sortable and filterable across different alert types. We felt tying our underlying data structure to the user's KQL filters didn't seem right, and it also didn't work well to sort/search/filter across alert types.
There are some decent upsides to going with this approach:

- Search, sort, and filter via `params.{something}` across alert types.
- Alert types can keep `.` in their id.

Relatively, there are not as many downsides to going with this approach:
- We may have to bump the field mappings limit on the `.kibana` index.
- We need `ignore_malformed` everywhere in the mappings to prevent failures on save if ever there is a mismatch with any alert.

What do we do when two AlertTypes use the same name for a field? We will throw an error if the mappings differ. If they are the same, it would be ok. This allows us to move forward while having this limitation for now.
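A minimal sketch of that merge rule (types and names are simplified, not actual Kibana code): identical field definitions merge cleanly, conflicting ones throw so the problem surfaces at setup/CI time rather than at query time.

```typescript
// Hypothetical sketch of merging per-alert-type param mappings into the
// shared `params.properties` object.
type FieldMapping = Record<string, unknown>;

function mergeParamMappings(
  merged: Record<string, FieldMapping>,
  incoming: Record<string, FieldMapping>
): Record<string, FieldMapping> {
  const result = { ...merged };
  for (const [field, mapping] of Object.entries(incoming)) {
    const existing = result[field];
    // JSON comparison is key-order-sensitive; fine for a sketch.
    if (existing && JSON.stringify(existing) !== JSON.stringify(mapping)) {
      throw new Error(`Conflicting mappings for params field "${field}"`);
    }
    result[field] = mapping;
  }
  return result;
}
```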
How do we handle migrations within a specific AlertType? And what about across all types? Migrations will behave the same as we have now since the data structure didn't change. Migration by alert type is coming soon.
What happens if the shape is wrong? Do we validate ourselves? Rely on plugins?
The usage of `ignore_malformed: true` and `dynamic: false` will protect us in this case. We will also limit the number of fields each alert type can map so we're not merging full schemas.
If this means dynamically merging mappings and types on the "way into the framework" and then exposing them as "portions" of this type on the "way out of the framework" back to the solution, are we introducing a lot of complexity that will be hard to maintain in the future? Since the data structure didn't change, things going in and out will remain the same, though only a portion of the fields will be mapped and searchable.
In the future, it seems best to create an Elasticsearch index per alert type. This will allow cross-index queries, sorting, and filtering properly while letting Elasticsearch handle issues when mappings differ between indices at search time. The queries would be done the same way as the approach indicated above, and it feels like the approach we should do when writing alerts as data (index per alert type). This approach brings an implementation similar to where we want to be while also addressing the problem for alerts as data.
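If we did go index-per-alert-type, the query side could stay a single cross-index search (index name and fields here are illustrative; `unmapped_type` is the Elasticsearch sort option that tolerates indices where the field isn't mapped):

```
GET .kibana_rules_*/_search
{
  "query": { "match": { "params.aggType": "count" } },
  "sort": [{ "params.threshold": { "unmapped_type": "long" } }]
}
```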
I had a quick chat with @kobelb about the approach; one protection worth investigating is when an alert type doesn't use searchable params but ends up being searchable anyway because another alert type with similar fields defined the mappings. It could be worth adding some protections to the _find API.
> What do we do when two AlertTypes use the same name for a field? We will throw an error if the mappings differ. If they are the same, it would be ok. This allows us to move forward while having this limitation for now.
Are we catching this during a build step (CI) or is there a risk that these issues will go undiscovered until end users notify us?
> In the future, it seems best to create an Elasticsearch index per alert type
+1
@sqren it would be done during a build step to notify ASAP when something is wrong.
From https://github.com/elastic/kibana/issues/50222:
The main focus of this issue is making alert params (more than action config) searchable, sortable and filterable. If there's extra work necessary to support this in actions, we can create a follow-up issue.