elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana
Other
19.48k stars 8.05k forks source link

Make alert params searchable #50213

Open mikecote opened 4 years ago

mikecote commented 4 years ago

From https://github.com/elastic/kibana/issues/50222:

Use Case: As a rule user, I need to sort, filter, and sometimes aggregate on rule types. For example, I need to sort my rule types on severity, or I need to filter them by severity.

Technical solution: The current alerting/actions does not allow mapping down to the level of alerting/actions parameters. Therefore we cannot use the saved objects API of kql mixed with "order by". If that were changed and we were allowed mapping abilities to the alerting/actions parameters that would solve this. Either that or a plain API (even if slower like a table scan) to abstract us away so we can natively to the actions/alerting objects would make it to where we don't have write our own hand rolled solutions.

The main focus of this issue is making alert params (more than action config) searchable, sortable and filterable if there's extra work necessary to support this in actions, we can create a follow up issue.

elasticmachine commented 4 years ago

Pinging @elastic/kibana-stack-services (Team:Stack Services)

mikecote commented 4 years ago

Relates to "Make rules sortable, filterable, and aggregatable" in #50222.

m-adams commented 4 years ago

This would be good. I've been discussing with users how to be able to report on their rules e.g. in a dashboard, canvas etc which would need to aggregate based on alert parameters. You can just about do something with scripted fields at the moment but it's not ideal.

pmuellr commented 4 years ago

Also see this issue: Request for alerting internal tags structure. The gist is to add a new "tags" property to alerts, but would be separate from the existing tags property, which is used by customers. Let's call this new tags property internalTags. It's meant to be used by alert implementors to add their own tags, for whatever purpose they want, without conflicting with the tags customers use or having the customers see them in the UI.

Would having this be good enough to solve the requirements? I think it depends on how/what you want to search on.

I don't really want to go down the path of adding mappings for alertType-specific data, seems like the migration problem would be ... messy. And not sure what the other options are.

FrankHassanabad commented 4 years ago

For some other efforts outside of alerting where we use tags both external and internal within a saved object structure we decided on using a leading underscore to designate that the internal tags should remain internal.

tags
_tags

fwiw. That Saved Object has nothing to do with alerting but figured I should mention it.

gmmorris commented 4 years ago

I've only just began looking into this but I want to summarise what we currently know as it doesn't look like this will be easy to support and I want to make sure we have a good understanding of the context.

Current State

Before we talk about the need, lets just describe what we currently have. Backing each alert we create a Saved Object with the following shape:

interface RawAlert extends SavedObjectAttributes {
  enabled: boolean;
  name: string;
  tags: string[];
  alertTypeId: string;
  consumer: string;
  schedule: SavedObjectAttributes;
  actions: RawAlertAction[];
  params: SavedObjectAttributes;
  scheduledTaskId?: string;
  createdBy: string | null;
  updatedBy: string | null;
  createdAt: string;
  apiKey: string | null;
  apiKeyOwner: string | null;
  throttle: string | null;
  muteAll: boolean;
  mutedInstanceIds: string[];
}

The fields we want to make searchable as part of this issue are params and action which both implement the SavedObjectAttributes type which means these are string based key-value records which can be deeply nested.

In the face of it querying by these internals objects is straight forward, but as this Saved Object needs to support all types of alerts and actions, we have a challenge as each alert type can have different shapes to these fields. To support these multiple shapes we tell Elasticsearch not to create a mapping for these objects - which means that querying by their shape isn't actually possible.

The Need

That said, we'd like to be able to query for specific Alerts based values that are stored in these fields, so that we can:

  1. Sort by them
  2. Filter by them
  3. Aggregate over alerts using these fields in some manner

Possible Solutions

So, the reason we can't currently support querying against these fields is clear, but there are a few approaches we could take to make these requirements possible- none of which are straight forward, so some discussion is needed to understand the cost-value ratio.

One assumption that I'm making for all of these is that we want to rely on ES for this and not do any of these operations in memory as that would make it inefficient and hard to support pagination.

Create a Saved Object type for each AlertType

This approach would mean that whenever a new AlertType is created we generate a brand new type of SavedObject with its own mapping.

There are a few challenges with this approach:

  1. We would have to handle an unknown number of SavedObjects types, all of whom need to be administered via the Alerting Framework.
  2. We will likely have to make changes to the SavedObjects Client APIs as they currently assume that you're only ever operating at the level of a single SO type, not multiple types.

There's also a clear limitation to this approach: This will still not allow us to query based on the action field as these would still be different shapes across the different SavedObject types and would have top remain with a disabled mapping.

@mikecote has already told me that there is a danger here of a mapping explosion which I need to investigate further.

Enable dynamic mapping + create a deep object

Instead of splitting the SavedObject types between the AlertTypes, we can take an approach similar to what SavedObjects itself does. While we would have one single SavedObject type of alert (as we do now) we can store each alert type's params and each action type's config under a corresponding key based on their unique ID and keep the mapping set to dynamic so that we can query against the internal shapes.

For example, this would mean that given an alert of type "example.always-firing" with an action of type ".index" you would store the data like so:

{
          "alert" : {
            "consumer" : "alerting",
            "alertTypeId" : "example.always-firing",
            "params" : { 
                "example.always-firing" : { 
                  "numOfInstances" : 5
               },
            },
            "actions" : [
              {
                "actionTypeId" : ".index",
                "params" : {
                    ".index" : {
                      "documents" : [
                        {
                          "val" : "{{alertInstanceId}}"
                        }
                      ]
                    },
                },
                "actionRef" : "action_0",
                "group" : "default"
              }
            ],
            /// ... other fields
          },
          "type" : "alert",
          "references" : [
            {
              "name" : "action_0",
              "id" : "9aab14cd-87e1-43ca-98f3-1caa9723ce98",
              "type" : "action"
            }
          ],
          "updated_at" : "2020-05-11T14:48:15.017Z"
        }

Instead of what we currently do which is this:

{
          "alert" : {
            "consumer" : "alerting",
            "alertTypeId" : "example.always-firing",
            "params" : { 
              "numOfInstances" : 5
            },
            "actions" : [
              {
                "actionTypeId" : ".index",
                "params" : {
                  "documents" : [
                    {
                      "val" : "{{alertInstanceId}}"
                    }
                  ]
                },
                /// ... other fields
              }
            ],
            /// ... other fields
          },
          "type" : "alert",
          "references" : [
            {
              "name" : "action_0",
              "id" : "9aab14cd-87e1-43ca-98f3-1caa9723ce98",
              "type" : "action"
            }
          ],
          "updated_at" : "2020-05-11T14:48:15.017Z"
        }

You may note how we have the addition of the "example.always-firing" and ".index" keys as appropriate under the params fields which would allow us to enable Dynamic Mapping in these fields as their shape will no longer slash across types.

But this too has challenges:

  1. Dynamic mapping uses the first type it encounters, so supporting fields that might change type between instances will be very tricky and might require AlertTypes to preemptively provide us with mappings of their own at setup time which we would then need to manually merge.
  2. It's not yet clear how migrations might work in such a model and needs further investigation.
  3. Each AlertType will likely have to provide us with some mappings constraints of their own though, as they might have some fields in their params that they do not want to create mappings for as they might change often (such as the document field in the .index action which will be different on every single instance!
  4. Here too there's a danger of a mappings explosion that needs to be investigated.

Static mapping + flattened object

Another option is to standardise the shape of params across all AlertTypes such that each AlertType will specify the exact shape and types of their params and we'll merge these shapes together into one static shape which will be used to define the mappings of these params.

This is the simplest solution in terms of the mapping, but introduces a whole set of challenges in the framework:

  1. What do we do what two AlertTypes use the same name for a field?
  2. How do we handle migrations within a specific AlertType? And what about across all types?
  3. What happens if the shape is wrong? Do we validate ourselves? Rely on plugins?
  4. If this means dynamically merging mappings and types on the "way into the framework" and then exposing it as "portions" of this type on the "way out of the framework" back to the solution - are we introducing a lot of complexity that will be hard to maintain in the future?

Challenges across the board

All of the above options will require changes in the SavedObjectsClient as you can only sort/filter by a rood field at the moment, and supporting "deep" fields inside of "objects" isn't currently supported and doing so will require some further research.

Next steps

As you can see, none of these options are straight forward and there isn't a clear winner.

This all requires a lot more investigation, and playing around with the code locally I think that the second option (Enable dynamic mapping + create a deep object) produces the most maintainable option, but it has potential issues that still need investigation which is what I'll likely be looking into next.

If anyone has thoughts or concerns on these options (or perhaps a 4th option we can investigate) I'm all ears. :)

pmuellr commented 4 years ago

I'm worried about any solution that introduces new mappings for alertType-specific data, due to all the challenges pointed out ^^^.

It's not completely clear that we need an ES solution here; from the SIEM issue https://github.com/elastic/kibana/issues/50222:

Either that or a plain API (even if slower like a table scan) to abstract us away so we can natively to the actions/alerting objects would make it to where we don't have write our own hand rolled solutions.

I read "table scan" as "do as much of a query as you can with ES, then do the remaining filtering/mapping/aggs on the results in JS". That will work if the number of alerts is "reasonable", which I don't know if it is.

I think we should also find out if having an "internal tags" via https://github.com/elastic/kibana/issues/58417 would be good enough for now. This would presumably be a parallel of the current tags structure, but not editable (or probably viewable) in the UI, only programmatically. Not nearly the same as ES fields, but may be useful enough for common needs.

There was also mention of scripted fields in the discussion above, but I'm not sure how we might use them.

gmmorris commented 4 years ago

@FrankHassanabad Would being able to query/filter/sort by a set of internal tags be enough for you? It looks like making the params/config actually searchable would a significant piece of work that we'd need to be cautious before picking up.

m-adams commented 4 years ago

FYI, scripted fields seems to work and maybe at least better than doing it in js.

On Wed, May 13, 2020 at 1:49 PM Gidi Meir Morris notifications@github.com wrote:

@FrankHassanabad https://github.com/FrankHassanabad Would being able to query/filter/sort by a set of internal tags be enough for you? It looks like making the params/config actually searchable would a significant piece of work that we'd need to be cautious before picking up.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/elastic/kibana/issues/50213#issuecomment-627960745, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABYDJMPTGPFNH7JXRPMRVC3RRKJOXANCNFSM4JL2VCDQ .

--

Matthew Adams

{

“title”: "Senior Solution Architect”,

“location”: “5 Southampton St.Covent Garden, London WC2E 7HA”,

“url”: “elastic.co”,

}

Search. Observe. Protect.

gmmorris commented 4 years ago

FYI, scripted fields seems to work and maybe at least better than doing it in js.

Sorry but could you be more specific? Given the limitations we have on making these fields searchable, how would you go about using scripted fields to achieve what SIEM are looking for?

FrankHassanabad commented 4 years ago

table scan

I lifted that term from relation DB's where the definition is:

(also known as a sequential scan) is a scan made on a database where each row of the table is read in a sequential (serial) order and the columns encountered are checked for the validity of a condition. Full table scans are usually the slowest method of scanning a table due to the heavy amount of I/O reads required from the disk which consists of multiple seeks as well as costly disk to memory transfers.[1]

[1] Ref: https://en.wikipedia.org/wiki/Full_table_scan

Don't know if there is a ES adapted term but basically anytime I have to read all of the Saved Objects into memory either through buffering or streaming I count that as a table scan and since it's done through network calls from ES -> Kibana it is adds to the expense.

"do as much of a query as you can with ES, then do the remaining filtering/mapping/aggs on the results in JS"

Yeah that's our ultimate goal. We are hoping to avoid any operation that causes us to iterate over all the alerts in memory from ES -> Kibana due to:

Would being able to query/filter/sort by a set of internal tags be enough for you?

Can we get aggs as well with arrays in Elastic Search? The use case is that sometimes we need to do unique counts of items and display them. If this PR is merged we might have it? https://github.com/elastic/kibana/pull/64002

If we were able to query/filter/sort/aggs I think that covers all of our use cases to avoiding table-like scans.

As an aside:

If you still are going to allow mapping I would suggest combining two parts of your approaches above.

This part:

While we would have one single SavedObject type of alert (as we do now) we can store each alert type's params and each action type's config under a corresponding key based on their unique ID and keep the mapping set to dynamic so that we can query against the internal shapes.

and this part:

Another option is to standardise the shape of params across all AlertTypes such that each AlertType will specify the exact shape and types of their params and we'll merge these shapes together into one static shape which will be used to define the mappings of these params.

So that it becomes this:

While we would have one single SavedObject type of alert (as we do now) we can store each alert type's params and each action type's config under a corresponding key based on their unique ID. Each AlertType will specify the exact shape and types of their params and we'll merge these shapes together into one static shape which will be used to define the mappings of these params.

Then you have exact mappings and avoid mapping explosions and conflicts and the Saved Objects migration system takes over when we update our mappings. Then we are responsible as a team for migrating our mappings altogether.

For what is worth also ... On the community forums and community slack, users are opening up their .kibana saved objects by granting permissions to it and then creating dashboards our rules which includes the params: https://elasticstack.slack.com/archives/CNRTGB9A4/p1589221644238900

I know that might not be exactly what we may want this quickly but it is something we have to keep in mind that users are granting each other privileges to their saved objects index so they can write their own dashboards against static SIEM rules.

Since things are "beta" I think they would be ok with updates but might become frustrated if their dashboards are no longer possible. On the flip side, if this brings more features they couldn't have before such as sorting/filtering/querying/aggs then they will be very delighted even if we have to advise them on how to update there existing dashboards.

This might be an unsupported or discouraged thing that users are opening up saved object indexes? However I want to point it out as it's already happened.

m-adams commented 4 years ago

FYI, scripted fields seems to work and maybe at least better than doing it in js.

Sorry but could you be more specific? Given the limitations we have on making these fields searchable, how would you go about using scripted fields to achieve what SIEM are looking for?

If, for example, you create a filtered alias to just pick the rules from .kibana then add an index pattern in kibana with this scripted field using painless

def mitre = []; for (value in params['_source']['alert']['params']['threat']) {mitre.add(value['tactic']['id'])} return mitre

You can then aggregate on the tactic ID for reporting. Matthew

kobelb commented 4 years ago

Are sorting and aggregations absolutely necessary? If not, https://github.com/elastic/elasticsearch/issues/33003 is a potential solution worth evaluating.

mikecote commented 4 years ago

New possible solution mentioned in option 4 of #67290. That issue explores options to solve Elasticsearch merging objects on update when the mapping has enabled: false.

This may not be a good approach but worth mentioning. Probably doesn't solve sorting or aggregations which would make https://github.com/elastic/elasticsearch/issues/33003 more a solution worth evaluating as @kobelb mentioned.

Change object structure to array

I noticed the alert's actions[x].params attribute doesn't have this problem. There is a possibility that storing alert params and action configs into an array structure would solve this problem and possibly also make them searchable.

The params value structure could be something like the following:

[
 {
   "name": "index",
   "value": "foo"
 },
 {
   "name": "timeField",
   "value": "@timestamp"
 }
]

This would allow a consistent mapping of name and value where value can be enabled: false but won't be impacted by this issue (due to being within an array). This would also require a saved object migration.

The step further that would be required to make the values searchable would be do split the values into different mapped fields. This would require orchestration between field name and value field.

Screen Shot 2020-05-25 at 1 47 15 PM

To query this data, it would look something like this:

Screen Shot 2020-05-25 at 2 00 19 PM
FrankHassanabad commented 4 years ago

Are sorting and aggregations absolutely necessary?

Yes. We have params such as risk_score and a lot of users internal and external to our organization are asking why we cannot sort our tables or queries based on these important items.

shahzad31 commented 4 years ago

This would be awesome thing for uptime alerts as well, we will be able to determine which alerts are enabled or not.

mikecote commented 4 years ago

we will be able to determine which alerts are enabled or not.

You can filter on that today, no? Using KQL

shahzad31 commented 4 years ago

we will be able to determine which alerts are enabled or not.

You can filter on that today, no? Using KQL

@mikecote first we need to determine how can we find an alert via params. if params aren't searchable, we cant find an alert. What are the alternative ways to find an alert, can we add custom tag to an alert while creating? or maybe specify an alert name in the flyout, and dont let user change it.

shahzad31 commented 3 years ago

@mikecote @gmmorris any update on this? Any plans for 7.10 or 7.11 for this?

mikecote commented 3 years ago

@shahzad31 we haven't agreed on a path forward for this yet so we don't have plans to work on this yet. We are planning in the meantime to add support for internal tags in 7.11 (https://github.com/elastic/kibana/issues/58417).

dontcallmesherryli commented 3 years ago

@mikecote we are beginning to get a lot of feedback from internal Elastic rule creators as well as external users that our 200+ prebuilt detection engine rules is painful to sort and find in the Rules page. As we are expecting to continue to ship more prebuilt rules in each release, I'd like to discuss the possibility of putting this issue on the 7.11 priority list of the Kibana Alerting team.

cc: @arisonl @peluja1012 @spong @MikePaquette

mikecote commented 3 years ago

@dontcallmesherryli We put it in our 7.12 tentative plan for now. We will keep the implementation discussion going until then to have an idea what approach to take.

pmuellr commented 3 years ago

This came up again when I talked to the o11y folks this week. I keep hoping that our "system tags" issue may end up solving this. @sqren noted that this would then end up being painful trying to keep params AND tags up-to-date - it's a least more boilerplate-y code that alerts clients would have to manage.

One thought on that would be to provide some kind of "helper" function instead of code, which given a params object would return a set of "field tags" or such. Here's an example for index threshold alert type, imagining you might want to eventually be able to search on the index used in the alert:

function fieldTags(params: IndexThresholdParams) {
  return {
    'index': params.index
  }
}

We would call this function (defined in the alert type, but it's optional of course), and then add these to the "system" tags with the resulting data.

Another issue with this is that my thoughts on the "system tags" is that they certainly would not be updateble via HTTP, only the in-process alert client, and it's possible we may not want these readable either. But I think we'd want these "field tags" to be readable/searchable via http, so we can make use of these in Kibana.

mikecote commented 3 years ago

cc @arisonl

mikecote commented 3 years ago

We should do an investigation to see if https://www.elastic.co/guide/en/elasticsearch/reference/7.x/flattened.html can be used as a solution. Adding the research label.

mikecote commented 3 years ago

Another option worth investigating is runtime fields https://www.elastic.co/guide/en/elasticsearch/reference/master/runtime.html.

mikecote commented 3 years ago

We should do an investigation to see if https://www.elastic.co/guide/en/elasticsearch/reference/7.x/flattened.html can be used as a solution. Adding the research label.

I did a small spike on this (see: https://github.com/elastic/kibana/compare/master...mikecote:alerting/searchable-params-poc) and it seems using the flattened type could solve most of the problems with a bit of work on the validation code. The first limitation I see is, by design, treating every value as a string which brings limitations to comparators (and sorting) on numeric values.

In the spike, I can call the alerting APIs with the following:

// KQL filter for a value
curl -XGET "https://elastic:changeme@localhost:5601/rgp/api/alerts/_find?filter=alert.attributes.params.timeWindowSize:100" -k

// Sort where it uses string comparators (meaning 99 > 100)
curl -XGET "https://elastic:changeme@localhost:5601/rgp/api/alerts/_find?sort_field=params.timeWindowSize&sort_order=asc" -k
gmmorris commented 3 years ago

Nice work @mikecote , we should make a clear list of what flattened can support and what it can't, then validate with some teams (SIEM, O11y...) that they can live without the things it can't support.

sorenlouv commented 3 years ago

Agree, this is great!

The first limitation I see is, by design, treating every value as a string which brings limitations to comparators (and sorting) on numeric values.

Having a simple string-based key-value pair lookup mechanism will solve all of our current use cases and 99% of our future use cases. So that limitation is definitely tolerable.

mikecote commented 3 years ago

we should make a clear list of what flattened can support and what it can't, then validate with some teams (SIEM, O11y...) that they can live without the things it can't support.

The two teams I recall requesting this is APM and Security Detections team.

Looks like @sqren is happy from the APM side and I'm curious what @spong on the detections team thinks of the limitations?

The limitations are documented here: https://www.elastic.co/guide/en/elasticsearch/reference/master/flattened.html#supported-operations.

Supported operations

Because of the similarities in the way values are indexed, flattened fields share much of the same mapping and search functionality as keyword fields.

Currently, flattened object fields can be used with the following query types:

  • term, terms, and terms_set
  • prefix
  • range
  • match and multi_match
  • query_string and simple_query_string
  • exists

When querying, it is not possible to refer to field keys using wildcards, as in { "term": {"labels.time*": 1541457010}}. Note that all queries, including range, treat the values as string keywords. Highlighting is not supported on flattened fields.

It is possible to sort on an flattened object field, as well as perform simple keyword-style aggregations such as terms. As with queries, there is no special support for numerics — all values in the JSON object are treated as keywords. When sorting, this implies that values are compared lexicographically.

Flattened object fields currently cannot be stored. It is not possible to specify the store parameter in the mapping.

mikecote commented 3 years ago

@spong @sqren as I'm working on this, one thing worth mentioning is the params would be treated like string keyword. This means that doing partial searches, etc won't work using the flattened approach. Is this still an acceptable solution for Security and Observability? Based on @sqren's comment, it seems like string based key-value lookup will work for Observability.

I'll be working on a PR here but wanted to double check before going further.

mikecote commented 3 years ago

I'm renaming the issue from Make alert params and action config searchable to Make alert params searchable since the actions plugin doesn't have a find API (only get all API) at this time.

sorenlouv commented 3 years ago

one thing worth mentioning is the params would be treated like string keyword

sgtm 👍

spong commented 3 years ago

@spong @sqren as I'm working on this, one thing worth mentioning is the params would be treated like string keyword. This means that doing partial searches, etc won't work using the flattened approach. Is this still an acceptable solution for Security and Observability?

So I think this is going to be a little problematic on the Security side with the fields we're looking to expose via search within Rules Management. For instance, one of the most requested searchable fields is MITRE ATT&CK Threats/Techniques, so for the below rule, users would expect to be able to search just for T1555 or Password Stores for example, and get back any rule with the below technique:

Here's our alertParams, of which I think the following will be desired to be searched against, and would be problematic with not being able to search for partials (or case insensitivity):

pmuellr commented 3 years ago

I see runtime fields referenced in a comment above, but guessing we haven't looked at it in a lot of detail. And I imagine it gets complicated, pretty quick. Like we'd actually need mapping definitions for the fields, which we don't have today. Somehow augmenting the KQL-generated query to add the runtime bits. Etc.

banderror commented 3 years ago

Hey @mikecote and alerting team! Thank you for taking over this problem.

I recently started looking at what we (Security Solution, Detections folks) can do to improve UX of the detection rules management table.

Sorry for somewhat delayed comment from our side, it took a while to collect some requirements and check the proposed solution using the flattened field type. Here's our findings and observations.

Needs from the Security Solution side

We now have 461 built-in rules in the solution, we expect this number to grow probably by a few hundred per each release, plus there are custom rules, so we'd like to provide our users the most flexible filtering, searching and sorting experience, so they could quickly find and enable/disable/bulk edit only relevant rules.

We'd like to be able to build the following improvements for the rules management table:

We're planning to start small and build simple filters first (severity etc), but eventually I think we need all of those features since it would improve UX for our users quite substantially.

Detection rules data and mappings

I created a new detection rule and specified some of the typical fields. Just to be on the same page, let's check the data that has been indexed into .kibana, and mappings for this data.

Queries ```json # Detection rules GET .kibana/_search { "query": { "bool": { "filter": [ { "match": { "type": "alert" } }, { "match": { "alert.alertTypeId": "siem.signals" } }, { "match": { "alert.consumer": "siem" } } ] } } } # Saved object mappings GET .kibana/_mapping ```
A typical detection rule ```json { "type" : "alert", "alert" : { "alertTypeId" : "siem.signals", "enabled" : false, "name" : "Test rule for flattened field type", "tags" : [ "my-rule", "flattened Field", "My tag", "__internal_rule_id:ae88ab01-5c78-43fc-8166-bf692567ccae", "__internal_immutable:false" ], "params" : { "author" : [ "Me", "Myself", "And I" ], "buildingBlockType" : null, "description" : "Testing https://www.elastic.co/guide/en/elasticsearch/reference/master/flattened.html", "ruleId" : "ae88ab01-5c78-43fc-8166-bf692567ccae", "falsePositives" : [ "First false positive example", "Second false positive example" ], "from" : "now-3720s", "immutable" : false, "license" : "Elastic License", "outputIndex" : ".siem-signals-default", "timelineId" : null, "timelineTitle" : null, "meta" : { "from" : "20m", "kibana_siem_app_url" : "http://localhost:5601/kbn/app/security" }, "maxSignals" : 100, "riskScore" : 42, "riskScoreMapping" : [ { "field" : "checkpoint.content_risk", "operator" : "equals", "value" : "" } ], "ruleNameOverride" : null, "severity" : "high", "severityMapping" : [ { "field" : "host.ip", "value" : "123.456.78.89", "operator" : "equals", "severity" : "low" }, { "field" : "host.ip", "value" : "123.456.78.90", "operator" : "equals", "severity" : "medium" }, { "field" : "host.ip", "value" : "123.456.78.91", "operator" : "equals", "severity" : "high" }, { "field" : "host.ip", "value" : "123.456.78.92", "operator" : "equals", "severity" : "critical" } ], "threat" : [ { "framework" : "MITRE ATT&CK", "tactic" : { "id" : "TA0001", "reference" : "https://attack.mitre.org/tactics/TA0001", "name" : "Initial Access" }, "technique" : [ { "id" : "T1190", "reference" : "https://attack.mitre.org/techniques/T1190", "name" : "Exploit Public-Facing Application", "subtechnique" : [ ] }, { "id" : "T1566", "reference" : "https://attack.mitre.org/techniques/T1566", "name" : "Phishing", "subtechnique" : [ { "id" : "T1566.002", "reference" : "https://attack.mitre.org/techniques/T1566/002", "name" : "Spearphishing Link" }, { "id" : "T1566.003", "reference" : "https://attack.mitre.org/techniques/T1566/003", "name" : "Spearphishing via Service" } ] } ] }, { "framework" : "MITRE ATT&CK", "tactic" : { "id" : "TA0011", "reference" : "https://attack.mitre.org/tactics/TA0011", "name" : "Command and Control" }, "technique" : [ { "id" : "T1071", "reference" : "https://attack.mitre.org/techniques/T1071", "name" : "Application Layer Protocol", "subtechnique" : [ { "id" : "T1071.004", "reference" : "https://attack.mitre.org/techniques/T1071/004", "name" : "DNS" }, { "id" : "T1071.001", "reference" : "https://attack.mitre.org/techniques/T1071/001", "name" : "Web Protocols" } ] }, { "id" : "T1043", "reference" : "https://attack.mitre.org/techniques/T1043", "name" : "Commonly Used Port", "subtechnique" : [ ] } ] } ], "timestampOverride" : null, "to" : "now", "references" : [ "https://www.elastic.co/guide/en/elasticsearch/reference/master/flattened.html", "https://github.com/elastic/kibana/pull/87835" ], "note" : "## Guide Provide helpful information for analysts that are investigating **detection alerts**. This guide will appear on the rule details page and in _timelines_ (as notes) created from detection alerts generated by this rule.", "version" : 1, "exceptionsList" : [ ], "type" : "query", "language" : "kuery", "index" : [ "apm-*-transaction*", "auditbeat-*", "endgame-*", "filebeat-*", "logs-*", "packetbeat-*", "winlogbeat-*" ], "query" : "host.name: \"gg-mac-42\" or host.name: gg-mac-166*", "filters" : [ ], "savedId" : null, "threatFilters" : null } } } ```
Mappings used for detection rules (and all "alert" saved objects in general) ![](https://puu.sh/H8o2B/ccc7f26032.png) ```json { "mappings": { "dynamic" : "strict", "properties": { "type" : { "type" : "keyword" }, "alert" : { "properties" : { "actions" : { "type" : "nested", "properties" : { "actionRef" : { "type" : "keyword" }, "actionTypeId" : { "type" : "keyword" }, "group" : { "type" : "keyword" }, "params" : { "type" : "object", "enabled" : false } } }, "alertTypeId" : { "type" : "keyword" }, "apiKey" : { "type" : "binary" }, "apiKeyOwner" : { "type" : "keyword" }, "consumer" : { "type" : "keyword" }, "createdAt" : { "type" : "date" }, "createdBy" : { "type" : "keyword" }, "enabled" : { "type" : "boolean" }, "executionStatus" : { "properties" : { "error" : { "properties" : { "message" : { "type" : "keyword" }, "reason" : { "type" : "keyword" } } }, "lastExecutionDate" : { "type" : "date" }, "status" : { "type" : "keyword" } } }, "meta" : { "properties" : { "versionApiKeyLastmodified" : { "type" : "keyword" } } }, "muteAll" : { "type" : "boolean" }, "mutedInstanceIds" : { "type" : "keyword" }, "name" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword" } } }, "notifyWhen" : { "type" : "keyword" }, "params" : { "type" : "object", "enabled" : false }, "schedule" : { "properties" : { "interval" : { "type" : "keyword" } } }, "scheduledTaskId" : { "type" : "keyword" }, "tags" : { "type" : "keyword" }, "throttle" : { "type" : "keyword" }, "updatedAt" : { "type" : "date" }, "updatedBy" : { "type" : "keyword" } } } } } } ```

Proposed implementation using flattened field type

So the proposed implementation using the flattened field type boils down to the following, right? Please correct me if I misunderstood this part.

change this:
{
  "mappings": {
    "dynamic" : "strict",
    "properties": {
      "alert" : {
        "properties" : {
          "params" : {
            "type" : "object",
            "enabled" : false
          }
        }
      }
    }
  }
}

to this:
{
  "mappings": {
    "dynamic" : "strict",
    "properties": {
      "alert" : {
        "properties" : {
          "params" : {
            "type" : "flattened"
          }
        }
      }
    }
  }
}

Testing flattened field type

I've checked various scenarious relevant to our needs, compared behaviour of flattened field with normal fields with specified mappings. Here are my findings.

TL;DR: I don't think flattened field type would provide us the ability to build the UX we're planning to build.

Although these features are supported:

See my tests below.

Filtering and sorting numbers

Paste to Kibana Dev Tools ```json # ------------------------------------------------------------------------------ # -- Filtering and sorting numbers DELETE delme-test-rule-filtering-number PUT delme-test-rule-filtering-number { "mappings": { "dynamic" : "strict", "properties": { "params": { "type" : "flattened" }, "riskScore": { "type": "short" } } } } PUT delme-test-rule-filtering-number/_doc/1 { "params": { "riskScore": 25 }, "riskScore": 25 } PUT delme-test-rule-filtering-number/_doc/2 { "params": { "riskScore": 99 }, "riskScore": 99 } PUT delme-test-rule-filtering-number/_doc/3 { "params": { "riskScore": 100 }, "riskScore": 100 } # 25 <= x <= 100. Normal mapping. Result: 100, 99, 25. GET delme-test-rule-filtering-number/_search { "query": { "bool": { "filter": [ { "range": { "riskScore": { "gte": 25, "lte": 100 } } } ] } }, "sort": [{ "riskScore": "desc" }] } # 25 <= x <= 99. Flattened field. Result: 99, 25. (lexicographical comparison) GET delme-test-rule-filtering-number/_search { "query": { "bool": { "filter": [ { "range": { "params.riskScore": { "gte": 25, "lte": 99 } } } ] } }, "sort": [{ "params.riskScore": "desc" }] } # 25 <= x <= 100. Flattened field. Result: no hits. (due to lexicographical comparison) GET delme-test-rule-filtering-number/_search { "query": { "bool": { "filter": [ { "range": { "params.riskScore": { "gte": 25, "lte": 100 } } } ] } }, "sort": [{ "params.riskScore": "desc" }] } ```

Filtering and sorting string keywords

Paste to Kibana Dev Tools ```json # ------------------------------------------------------------------------------ # -- Filtering and sorting string keywords DELETE delme-test-rule-filtering-keyword PUT delme-test-rule-filtering-keyword { "mappings": { "dynamic" : "strict", "properties": { "params": { "type" : "flattened" }, "severity": { "type": "keyword" } } } } PUT delme-test-rule-filtering-keyword/_doc/1 { "params": { "severity": "low" }, "severity": "low" } PUT delme-test-rule-filtering-keyword/_doc/2 { "params": { "severity": "medium" }, "severity": "medium" } PUT delme-test-rule-filtering-keyword/_doc/3 { "params": { "severity": "high" }, "severity": "high" } PUT delme-test-rule-filtering-keyword/_doc/4 { "params": { "severity": "critical" }, "severity": "critical" } # Normal mapping. Result: critical, high. GET delme-test-rule-filtering-keyword/_search { "query": { "bool": { "filter": [ { "terms": { "severity": ["high", "critical"] } } ] } }, "sort": [{ "severity": "asc" }] } # Flattened field. Result: critical, high. GET delme-test-rule-filtering-keyword/_search { "query": { "bool": { "filter": [ { "terms": { "params.severity": ["high", "critical"] } } ] } }, "sort": [{ "params.severity": "asc" }] } ```

Full text search

Paste to Kibana Dev Tools ```json # ------------------------------------------------------------------------------ # -- Full text search DELETE delme-test-rule-filtering-text PUT delme-test-rule-filtering-text { "mappings": { "dynamic" : "strict", "properties": { "params": { "type" : "flattened" }, "description": { "type": "text" }, "note": { "type": "text" }, "falsePositives": { "type": "text" }, "author": { "type": "text" } } } } PUT delme-test-rule-filtering-text/_doc/1 { "params": { "description" : "Testing flattened field type", "note" : "Here's helpful information: https://www.elastic.co/guide/en/elasticsearch/reference/master/flattened.html", "falsePositives" : [ "First false positive example", "Second false positive example" ], "author" : ["First author", "Second author", "Third author"] }, "description" : "Testing flattened field type", "note" : "Here's helpful information: https://www.elastic.co/guide/en/elasticsearch/reference/master/flattened.html", "falsePositives" : [ "First false positive example", "Second false positive example" ], "author" : ["First author", "Second author", "Third author"] } # Normal mapping. Result: found. GET delme-test-rule-filtering-text/_search { "query": { "bool": { "must": [ { "multi_match": { "query": "flatten", "fields": ["description", "note", "falsePositives", "author"], "type": "best_fields", "fuzziness": "auto" } } ] } } } # Flattened field. Result: failed to create query: [fuzzy] queries are not currently supported on keyed [flattened] fields. GET delme-test-rule-filtering-text/_search { "query": { "bool": { "must": [ { "multi_match": { "query": "flatten", "fields": ["params.description", "params.note", "params.falsePositives", "params.author"], "type": "best_fields", "fuzziness": "auto" } } ] } } } # Flattened field. Removed fuzziness. Result: no hits. GET delme-test-rule-filtering-text/_search { "query": { "bool": { "must": [ { "multi_match": { "query": "flattened", "fields": ["params.description", "params.note", "params.falsePositives", "params.author"], "type": "best_fields" } } ] } } } # Flattened field. Provided the whole description. Result: found. GET delme-test-rule-filtering-text/_search { "query": { "bool": { "must": [ { "multi_match": { "query": "Testing flattened field type", "fields": ["params.description", "params.note", "params.falsePositives", "params.author"], "type": "best_fields" } } ] } } } ```

Filtering nested objects

Paste to Kibana Dev Tools ```json # ------------------------------------------------------------------------------ # -- Filtering nested objects DELETE delme-test-rule-filtering-nested PUT delme-test-rule-filtering-nested { "mappings": { "dynamic" : "strict", "properties": { "params": { "type" : "flattened" }, "threat" : { "properties" : { "framework" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword" } } }, "tactic" : { "properties" : { "id" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword" } } }, "name" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword" } } }, "reference" : { "type" : "text" } } }, "technique" : { "properties" : { "id" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword" } } }, "name" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword" } } }, "reference" : { "type" : "text" }, "subtechnique" : { "properties" : { "id" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword" } } }, "name" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword" } } }, "reference" : { "type" : "text" } } } } } } } } } } PUT delme-test-rule-filtering-nested/_doc/1 { "params": { "threat" : [ { "framework" : "MITRE ATT&CK", "tactic" : { "id" : "TA0001", "reference" : "https://attack.mitre.org/tactics/TA0001", "name" : "Initial Access" }, "technique" : [ { "id" : "T1190", "reference" : "https://attack.mitre.org/techniques/T1190", "name" : "Exploit Public-Facing Application", "subtechnique" : [ ] }, { "id" : "T1566", "reference" : "https://attack.mitre.org/techniques/T1566", "name" : "Phishing", "subtechnique" : [ { "id" : "T1566.002", "reference" : "https://attack.mitre.org/techniques/T1566/002", "name" : "Spearphishing Link" }, { "id" : "T1566.003", "reference" : "https://attack.mitre.org/techniques/T1566/003", "name" : "Spearphishing via Service" } ] } ] }, { "framework" : "MITRE ATT&CK", "tactic" : { "id" : "TA0011", "reference" : "https://attack.mitre.org/tactics/TA0011", "name" : "Command and Control" }, "technique" : [ { "id" : "T1071", "reference" : "https://attack.mitre.org/techniques/T1071", "name" : "Application Layer Protocol", "subtechnique" : [ { "id" : "T1071.004", "reference" : "https://attack.mitre.org/techniques/T1071/004", "name" : "DNS" }, { "id" : "T1071.001", "reference" : "https://attack.mitre.org/techniques/T1071/001", "name" : "Web Protocols" } ] }, { "id" : "T1043", "reference" : "https://attack.mitre.org/techniques/T1043", "name" : "Commonly Used Port", "subtechnique" : [ ] } ] } ] }, "threat" : [ { "framework" : "MITRE ATT&CK", "tactic" : { "id" : "TA0001", "reference" : "https://attack.mitre.org/tactics/TA0001", "name" : "Initial Access" }, "technique" : [ { "id" : "T1190", "reference" : "https://attack.mitre.org/techniques/T1190", "name" : "Exploit Public-Facing Application", "subtechnique" : [ ] }, { "id" : "T1566", "reference" : "https://attack.mitre.org/techniques/T1566", "name" : "Phishing", "subtechnique" : [ { "id" : "T1566.002", "reference" : "https://attack.mitre.org/techniques/T1566/002", "name" : "Spearphishing Link" }, { "id" : "T1566.003", "reference" : "https://attack.mitre.org/techniques/T1566/003", "name" : "Spearphishing via Service" } ] } ] }, { "framework" : "MITRE ATT&CK", "tactic" : { "id" : "TA0011", "reference" : "https://attack.mitre.org/tactics/TA0011", "name" : "Command and Control" }, "technique" : [ { "id" : "T1071", "reference" : "https://attack.mitre.org/techniques/T1071", "name" : "Application Layer Protocol", "subtechnique" : [ { "id" : "T1071.004", "reference" : "https://attack.mitre.org/techniques/T1071/004", "name" : "DNS" }, { "id" : "T1071.001", "reference" : "https://attack.mitre.org/techniques/T1071/001", "name" : "Web Protocols" } ] }, { "id" : "T1043", "reference" : "https://attack.mitre.org/techniques/T1043", "name" : "Commonly Used Port", "subtechnique" : [ ] } ] } ] } # Normal mapping. Result: found. GET delme-test-rule-filtering-nested/_search { "query": { "bool": { "must": [ { "multi_match": { "query": "t1566 spearphishing", "fields": [ "threat.framework", "threat.*.id", "threat.*.reference", "threat.*.name" ], "type": "cross_fields" } } ] } } } # Flattened field. Note: flattened type doesn't support wildcard keys, so I had to change them to exact keys. # Result: no hits. GET delme-test-rule-filtering-nested/_search { "query": { "bool": { "must": [ { "multi_match": { "query": "t1566 spearphishing", "fields": [ "params.threat.framework", "params.threat.tactic.id", "params.threat.tactic.reference", "params.threat.tactic.name", "params.threat.technique.id", "params.threat.technique.reference", "params.threat.technique.name", "params.threat.technique.subtechnique.id", "params.threat.technique.subtechnique.reference", "params.threat.technique.subtechnique.name" ], "type": "cross_fields" } } ] } } } # Flattened field. Specified whole exact text. # Result: found. GET delme-test-rule-filtering-nested/_search { "query": { "bool": { "must": [ { "multi_match": { "query": "Spearphishing via Service", "fields": [ "params.threat.framework", "params.threat.tactic.id", "params.threat.tactic.reference", "params.threat.tactic.name", "params.threat.technique.id", "params.threat.technique.reference", "params.threat.technique.name", "params.threat.technique.subtechnique.id", "params.threat.technique.subtechnique.reference", "params.threat.technique.subtechnique.name" ], "type": "cross_fields" } } ] } } } # Flattened field. Possible to query whole "params" object w/o specifying fields. # Result: found. GET delme-test-rule-filtering-nested/_search { "query": { "bool": { "must": [ { "multi_match": { "query": "Spearphishing via Service", "fields": [ "params" ], "type": "cross_fields" } } ] } } } ```

Filter + search + sorting combined

Here I'm not using flattened field, just outlining some a la real-world rule data and a la real-world queries we'd like to run against it as an example.

Paste to Kibana Dev Tools ```json # ------------------------------------------------------------------------------ # -- Filter + search + sorting combined DELETE delme-test-rule-filtering-composite PUT delme-test-rule-filtering-composite { "mappings": { "dynamic" : "strict", "properties": { "riskScore": { "type": "short" }, "severity": { "type": "keyword" }, "description": { "type": "text" }, "author": { "type": "text" }, "threat" : { "properties": { "id" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword" } } }, "name" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword" } } }, "reference" : { "type" : "text" } } } } } } PUT delme-test-rule-filtering-composite/_doc/1 { "riskScore": 70, "severity": "high", "description" : "Testing flattened field type 1", "author" : ["First author", "Second author", "Third author"], "threat": [ { "id" : "T1071", "reference" : "https://attack.mitre.org/techniques/T1071", "name" : "Application Layer Protocol" }, { "id" : "T1566", "reference" : "https://attack.mitre.org/techniques/T1566", "name" : "Phishing" } ] } PUT delme-test-rule-filtering-composite/_doc/2 { "riskScore": 100, "severity": "critical", "description" : "Testing flattened field type 2", "author" : ["First author", "Second author", "Third author"], "threat": [ { "id" : "T1190", "reference" : "https://attack.mitre.org/techniques/T1190", "name" : "Exploit Public-Facing Application" }, { "id" : "T1566", "reference" : "https://attack.mitre.org/techniques/T1566", "name" : "Phishing" } ] } # Result: 1 hit GET delme-test-rule-filtering-composite/_search { "query": { "bool": { "filter": [ { "range": { "riskScore": { "gte": 70, "lte": 100 } } }, { "terms": { "severity": ["high", "critical"] } }, { "term": { "threat.id.keyword": "T1071" } } ], "must": [ { "multi_match": { "query": "phishing third author", "fields": ["description", "author", "threat.*"], "type": "cross_fields" } } ] } }, "sort": [ { "riskScore": "desc" }, { "severity": "asc" }, "_score" ], "aggs": { "threat-ids": { "terms": { "field": "threat.id.keyword" } } } } ```

After testing

# Don't forget to
DELETE delme-test-rule-filtering-*

Other solutions?

Are there other feasible solutions for filtering/sorting/searching?

Previously, we were considering loading all the rules in the browser app, and implementing custom filtering, searching and sorting in JS in memory. Not an ideal solution to say the least.

Would it be feasible to store rule parameters in a separate, dedicated index with mappings specific to detection rules?

We already have some objects related to detection rules that are stored in separate indices:

# Detection rule statuses
GET .kibana/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "match": {
            "type": "siem-detection-engine-rule-status"
          }
        }
      ]
    }
  }
}

# Detection rule actions
GET .kibana/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "match": {
            "type": "siem-detection-engine-rule-actions"
          }
        }
      ]
    }
  }
}
mikecote commented 3 years ago

@banderror, thank you for the very detailed analysis 💯

I've done a few POCs with the requirements of searching, sorting and filtering on any type of field (string, number, etc.) and below are my findings.

Usage of flattened type 👎

We cannot use the flattened type because it analyzes the data as keyword, which doesn't support searching nor sorting on numeric values. I've removed this as an option.

Usage of runtime fields 👎

We cannot use runtime fields because it only supports the following data types: boolean, date, double, geo_point, IP, keyword and long. It doesn't support text, which prevents searching. Runtime fields also don't work if search.allow_expensive_queries is disabled. I've removed this as an option.

What next?

I'm going to explore creating a deep object with static mapping similar to @gmmorris' Enable dynamic mapping + create a deep object option here. If a prototype works out well, alert types could opt-in to using a static mapping for some / all of their parameters.

pmuellr commented 3 years ago

A couple of other ideas:

re-post (again, sorry) of "internal tags" - https://github.com/elastic/kibana/issues/58417

The idea of this would be that we add a new field internal_tags which would be a nested array of {key: string, text: string, value: number}. We'd do the usual keyword/text thing with the text field (to support both keyword and text searches); to handle numeric values, there's also a value field.

This won't be useful in a lot of places in Kibana, since nested fields are supported everywhere (eg, various visualizations), but we'd generally just be using it internally, as part of the search that we end up building. I believe KQL does not support nested fields either, so that may be a big downside. Or complication, if we have to hack something into the existing SO find APIs.

Solutions using alerting could then add whatever internal tags they want. And presumably make use of search capabilities in the alertsClient, that would support this. But we likely wouldn't want to expose these tags to end users, at least for search (just because we'd have to document what values were being stored there), and certainly not to update. We could support adding these on create and update from the alertsClient though.

create a parallel search-possible version of the params

This is kinda similar to the "internal tags", and would use the same sort of "nested" field of key/text/number. But in this case, we "flatten" all the params into single strings - {a: {b: 1}} becomes a.b: 1 - and then create a nested element for each key/value pair.

Again, we probably wouldn't end up exposing this field as something customers could use as a search target, at least for now - and they probably can't anyway due to lack of KQL support for nested fields.

mikecote commented 3 years ago

I've been working on a POC that creates a deep object with static mappings. I made the index threshold alert opt-in to use static mappings instead of storing its parameters in the enabled: false mapping.

It seems to work well with a couple of things worth noting so far:

  1. This approach would require pre-save / post-read logic in the alerts client to nest/unnest the parameters.
  2. Using the alert type id as the sub-property name, it can't have . in them. We would have to convert . to something else (ex: siem.signals to siem_signals). Also, add protection to prevent overlapping names.
  3. We may have to bump up the field mappings limit in the .kibana index. I spoke with @rudolf, and it seems ok to bump the limit for now.
  4. A migration would be required to opt-in an alert type to have their parameters searchable.
  5. This could create migration errors at the mappings level. Alert types would have to make sure their mappings work for what they allow storing in parameters.
  6. Alert types that don't opt-in for custom mappings get their parameters stored in a generic variable mapped enabled: false.
  7. Searching, sorting and filtering these parameters would require API users to understand the underlying structure that these parameters are stored. Changing this down the line could break their requests. There may be a way to create some limitations here and handle the underlying structure ourselves.
  8. The saved objects service would need to support dynamic mappings (see: https://github.com/elastic/kibana/issues/88988). In my POC, I have an approach the core team is happy with.

@elastic/kibana-alerting-services there is a lot of go through on the above. I am thinking a synchronous demo/discussion may be necessary for this topic?

mikecote commented 3 years ago

@pmuellr yeah, I think we'll have to keep those ideas on the table as well for sure.

pmuellr commented 3 years ago

This approach in Mike's POC ^^^ seems workable. Is making this an opt-in for alert types something we want to leave in long-term? Or will we make it mandatory at some point. Thinking it would be nice to not have two paths, at least long-term.

mikecote commented 3 years ago

This approach in Mike's POC ^^^ seems workable. Is making this an opt-in for alert types something we want to leave in long-term? Or will we make it mandatory at some point. Thinking it would be nice to not have two paths, at least long-term.

I think if we go opt-in for now, making it mandatory long-term would be great. Similar to validation as it is opt-in right now. Maybe 8.0 if we go this path?

mikecote commented 3 years ago

I've also worked on a POC that splits the alert saved object type by alert type. This is the same as the "Create a Saved Object type for each AlertType" approach that @gmmorris mentioned. This approach allows having full parameter mappings for each alert type.

Here are the good things I discovered during the POC:

Here are the bad things I discovered during the POC:

I want to explore creating a separate saved object for the params next and see how that goes. There are still other options outside of these, but it is good to know each's pros/cons to make sure we're making the right investment for the long term.

mikecote commented 3 years ago

I want to explore creating a separate saved object for the params next and see how that goes.

I took a quick look at this approach. It would rely on using the join field type, and has_child queries. Since the requirements are to use the same field across both saved object types, it doesn't seem like a good approach since saved object attributes are stored in different fields in Elasticsearch by design. There may be alternatives in this approach, but they don't seem worth investigating to me at this time.

mikecote commented 3 years ago

After looking at all the options that are on the table, I want to propose going with the "deep object with static mappings" approach mentioned here.

Usage

This approach checks the boxes for filtering, searching, and sorting by doing something like the following on my POC:

Addressing the concerns

I have raised a few down points, but most of them are not an issue if I think long-term. Below is how I see each being resolved:

  1. This approach would require pre-save / post-read logic in the alerts client to nest/unnest the parameters.

This would be a good time to split the data access logic out of the alerts client and create a separate file. This is where the pre-save and post-read logic can live, and it would also keep the alerts client smaller as more code moves over.

  1. Using the alert type id as the sub-property name, it can't have . in them. We would have to convert . to something else (ex: siem.signals to siem_signals). Also, add protection to prevent overlapping names.

However we decide to map these parameters, we will have to distinguish them by alert type id. Hence, I think long term, we may want to prevent using . in the ids but in the meantime, come up with a common function that can convert an alertTypeId to something consumable by Elasticsearch.

  1. We may have to bump up the field mappings limit in the .kibana index. I spoke with @rudolf, and it seems ok to bump the limit for now.

This approach is not as bad as the saved object type per alert type as only the parameters are added to the mappings and only the alert types that opt-in for now. In the future, and maybe as we rename alert saved object type to rule, we can move them into their own .kibana_rules index.

  1. A migration would be required to opt-in an alert type to have their parameters searchable.

This problem is short-term until we would make all the alert types require mappings for their parameters. I wouldn't make a design decision based on this. If we allow alert types to define migrations (https://github.com/elastic/kibana/issues/50216), a simple no-op migration would move their params from the default enabled: false spot to their mapped version.

  1. This could create migration errors at the mappings level. Alert types would have to make sure their mappings work for what they allow storing in parameters.

This problem will exist for any approach that requires mapping. There is a high likelihood that errors here will happen, and we should work with the core team to see what we can do to ensure Kibana can still start when this migration fails. Alert types that use config-schema are already one step ahead at making sure the data equals the mappings, and we can make the validation mandatory for all alert types first.

  1. Alert types that don't opt-in for custom mappings get their parameters stored in a generic variable mapped enabled: false.

This problem is short-term as well. I think long term, we should force all alert types to define their mappings. Here as well, I wouldn't make a design decision based on this.

  1. Searching, sorting and filtering these parameters would require API users to understand the underlying structure that these parameters are stored. Changing this down the line could break their requests. There may be a way to create some limitations here and handle the underlying structure ourselves.

Here we could do something like the saved objects API does where it requires attributes in the filter but then replaces it with the saved object type. We could do something like that for params.value instead of params.{alertTypeId}.value. But this may prevent filtering across different alert types. But we could also expand it into a series of OR statements.

  1. The saved objects service would need to support dynamic mappings (see: https://github.com/elastic/kibana/issues/88988). In my POC, I have an approach the core team is happy with.

I have something worked out with the core team that would solve this problem and seems in line with what other teams ask for. This approach will also work for us to support migrations per alert type.

Next steps

I will set up a design meeting to ensure the @elastic/kibana-alerting-services stands by this proposal before starting implementation. After, I will circle back with the Security and Observability teams who requested this to confirm the approach solves what they're asking for.

mikecote commented 3 years ago

I spoke with @romseygeek from the ES Search team and came to the same conclusion of my proposal where having a type associated with each param so we can build mappings is the best way forward for us. 👍

mikecote commented 3 years ago

After a design discussion and further iterations on a POC, @ymao1 and I have something ready for approval. The underlying fundamentals have changed. If you're interested, feel free to read on. Otherwise, @spong @sqren, we'll be in touch soon to get your 👍 before making a PR to master.

@elastic/kibana-alerting-services it was hard to find time to do a follow-up session, so I'm opting for async to keep momentum on this issue. The details are below.

TL;DR

"params": {
  "dynamic": false,
  "type": "object",
  "properties": {
    // From .index-threshold
    "aggType": {
      "type": "keyword"
    },
    // From siem.rules
    "riskScore": {
      "type": "number",
      "ignore_malformed": true
    }
  }
},

Approach

The approach we have settled on is Static mapping + flattened object from @gmmorris proposal here: https://github.com/elastic/kibana/issues/50213#issuecomment-626775254. We've done many iterations to make alert parameters searchable, sortable and filterable across different alert types. We felt tying our underlying data structure to the user's KQL filters didn't seem right. It also didn't work well to sort/search/filter across alert types.

There are some decent good sides going with this approach:

Relatively, there are not as many downsides going with this approach:

Answering the earlier concerns

What do we do what two AlertTypes use the same name for a field? We will throw an error if the mappings differ. If they are the same, it would be ok. This allows us to move forward while having this limitation for now.

How do we handle migrations within a specific AlertType? And what about across all types? Migrations will behave the same as we have now since the data structure didn't change. Migration by alert type coming soon.

What happens if the shape is wrong? Do we validate ourselves? Rely on plugins? The usage of ignore_malformed: true and dynamic: false will protect us in this case. We will also limit the number of fields each alert type can map so we're not merging full schemas.

If this means dynamically merging mappings and types on the "way into the framework" and then exposing it as "portions" of this type on the "way out of the framework" back to the solution - are we introducing a lot of complexity that will be hard to maintain in the future? Since the data structure didn't change, things going in and out will remain the same. Though only a portion of the fields will be mapped and searchable.

Future thinking

In the future, it seems best to create an Elasticsearch index per alert type. This will allow cross-index queries, sorting, and filtering properly while letting Elasticsearch handle issues when mappings differ between indices at search time. The queries would be done the same way as the approach indicated above, and it feels like the approach we should do when writing alerts as data (index per alert type). This approach brings an implementation similar to where we want to be while also addressing the problem for alerts as data.

mikecote commented 3 years ago

I had a quick chat with @kobelb about the approach, one protection worth investigating is when an alert type doesn't use searchable params but ends up being searchable because another alert type with similar fields that defined the mappings. It could be worth adding some protections to the _find API.

sorenlouv commented 3 years ago

What do we do what two AlertTypes use the same name for a field? We will throw an error if the mappings differ. If they are the same, it would be ok. This allows us to move forward while having this limitation for now.

Are we catching this during a build step (CI) or is there a risk that these issues will go undiscovered until end users notify us?

In the future, it seems best to create an Elasticsearch index per alert type

+1

mikecote commented 3 years ago

@sqren it would be done during a build step to notify ASAP when something is wrong.