jonathan-buttner opened 3 years ago
@nik9000 @javanna @jimczi wanted to explain a little more about our use case for `collapse` on runtime fields. I'm happy to do a follow-up zoom or explain more if there are areas that don't make sense.
Pinging @elastic/es-search (Team:Search)
I think it's mostly an oversight that runtime fields don't support field collapsing. It might be slow, but when you use runtime fields you accept slow things.
I know this issue has been open for a while, but there doesn't seem to have been any progress. My company's use case would be solved neatly by a runtime field in conjunction with collapse. Are there any plans to implement this?
At the very least, the documentation should be updated. A quote from the docs:

> You access runtime fields from the search API like any other field, and Elasticsearch sees runtime fields no differently.
This clearly isn't the case, and in my case it caused quite a bit of wasted time trying to implement this before realizing collapse is not supported. The error you get when you try makes it worse. Using an example from the docs, with a collapse added:
```
GET my_index/_search
{
  "runtime_mappings": {
    "day_of_week": {
      "type": "keyword",
      "script": {
        "source": "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))"
      }
    }
  },
  "collapse": {
    "field": "day_of_week"
  },
  "query": {
    "match_all": {}
  }
}
```
produces an error saying `collapse is not supported for the field [day_of_week] of the type [keyword]`. This makes it seem like collapse doesn't support keyword types, which is not the case. Going from this error to realizing that it actually means "collapse is not supported for the runtime field ..." is not obvious, especially since the docs say the opposite.
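One workaround, sketched below under the assumption that you can reindex (the pipeline name `add-day-of-week` is made up for the example), is to compute the value at index time with an ingest pipeline script processor instead of a runtime field; collapse then works because the field is a regular indexed keyword:

```
PUT _ingest/pipeline/add-day-of-week
{
  "processors": [
    {
      "script": {
        "description": "Materialize day_of_week at index time so collapse can use it",
        "source": "ctx.day_of_week = ZonedDateTime.parse(ctx['@timestamp']).getDayOfWeek().getDisplayName(TextStyle.FULL, Locale.ROOT);"
      }
    }
  ]
}
```

You'd also want `day_of_week` mapped as a `keyword` in the index so collapse can use its doc values. The obvious downside is that this only helps for new (or reindexed) documents, which is exactly what runtime fields were supposed to let us avoid.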
Just want to let you know that in our use case we too miss `collapse` on runtime fields.
It is strange (and one would not expect it) that it works well with the `cardinality` aggregation but not with the `collapse` feature.
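For illustration, a minimal sketch reusing the `day_of_week` runtime field from the docs example above (the agg name `distinct_days` is just illustrative): an aggregation like this runs fine on the same runtime field that collapse rejects.

```
GET my_index/_search
{
  "runtime_mappings": {
    "day_of_week": {
      "type": "keyword",
      "script": {
        "source": "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))"
      }
    }
  },
  "size": 0,
  "aggs": {
    "distinct_days": {
      "cardinality": {
        "field": "day_of_week"
      }
    }
  }
}
```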
It seems this is still an issue, and I can relate to what @bsamseth commented: I had the same problem trying to use it and hit a dead end with that error, only to realize it is a runtime field problem.
Pinging @elastic/es-search-foundations (Team:Search Foundations)
Summary
The Security Threat Hunting team has a use case for collapsing on runtime fields. In this issue I'll try to describe this use case and why using a terms aggregation will not provide a solution.
Throughout this issue I'll reference fields like `process.entity_id` (keyword) and `process.Ext.ancestry` (an array of keywords in a specific order). These fields come from the Elastic Endpoint data source. Our goal is to leverage runtime fields so that users can use our tool with custom data sources.

TLDR

Support the `collapse` functionality in conjunction with runtime fields in search requests.

Background
Details
Our team is building a tool to allow analysts to visualize relationships between events from a data source. Our first use case was to allow a process tree to be visualized. Below is an example of the visualization.

Analyze Event Tool

![image](https://user-images.githubusercontent.com/56361221/102378716-1847cb00-3f94-11eb-8415-28b0a99eb5ee.png)

Example Simple Graph

![resolver_tree_children_simple](https://user-images.githubusercontent.com/56361221/102380007-6f9a6b00-3f95-11eb-8467-241a11a29cf1.png)

The issue with using a terms aggregation
The Elastic Endpoint creates specific events to describe the different stages of a process (started, stopped, already running, exec'ed). Because of this we'd like to collapse on the `process.entity_id` to avoid retrieving multiple documents per `process.entity_id`. A terms aggregation can be used for this, but we'd also like to sort the results in breadth-first order. This can be accomplished by using a query like the following.

Terms agg BFS
```
POST logs-*/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "term": { "process.Ext.ancestry": "9tw2j9fryf" } },
        { "term": { "event.category": "process" } },
        { "term": { "event.kind": "event" } }
      ]
    }
  },
  "aggs": {
    "by_entity_id": {
      "terms": {
        "field": "process.entity_id",
        "size": 100,
        "order": { "bfs_sort": "asc" }
      },
      "aggs": {
        "top_children": {
          "top_hits": {
            "_source": ["process.Ext.ancestry", "process.entity_id", "process.parent.entity_id"],
            "size": 1,
            "sort": [{ "@timestamp": { "order": "asc" } }]
          }
        },
        "bfs_sort": {
          "max": {
            "script": {
              "source": """
                Map ancestry = [:];
                int length = params._source.process.Ext.ancestry.length;
                List sourceAncestryArray = params._source.process.Ext.ancestry;
                for (int i = 0; i < length; i++) {
                  ancestry[sourceAncestryArray[i]] = i;
                }
                for (String id : params.ids) {
                  def index = ancestry[id];
                  if (index != null) {
                    return index;
                  }
                }
                return -1;
              """,
              "params": { "ids": ["yo", "9tw2j9fryf"] }
            }
          }
        }
      }
    }
  }
}
```

The script used in the `bfs_sort` aggregation calculates how far removed each descendant is from the requested node, which effectively groups the documents by level.

In our testing, we found that if the size field for the terms aggregation was less than the total number of documents, the terms aggregation would fail to return certain nodes, or entire levels, in the response. I believe this issue is described in the docs: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#search-aggregations-bucket-terms-aggregation-size
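For completeness: a composite aggregation can page through every `process.entity_id` bucket without the terms size cutoff (rough sketch below; the agg names are made up), but as far as I know composite sources can only be ordered by their own values, not by a sub-aggregation like `bfs_sort`, so it doesn't give us the breadth-first ordering we need:

```
POST logs-*/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "term": { "process.Ext.ancestry": "9tw2j9fryf" } },
        { "term": { "event.category": "process" } },
        { "term": { "event.kind": "event" } }
      ]
    }
  },
  "aggs": {
    "by_entity_id": {
      "composite": {
        "size": 100,
        "sources": [
          { "entity_id": { "terms": { "field": "process.entity_id" } } }
        ]
      },
      "aggs": {
        "top_children": {
          "top_hits": {
            "_source": ["process.Ext.ancestry", "process.entity_id", "process.parent.entity_id"],
            "size": 1,
            "sort": [{ "@timestamp": { "order": "asc" } }]
          }
        }
      }
    }
  }
}
```

Subsequent pages are fetched by passing the `after` key from the previous response, so no buckets are dropped, but the BFS sort would then have to happen client-side.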
Collapsing on runtime fields
Currently, to get around the terms aggregation size issue, we `collapse` in our Elasticsearch requests on the `process.entity_id` field (or a runtime field specified by the user). We then use a script to sort the results in BFS order. Something like the following.

Collapse query example
```typescript
{
  _source: false,
  docvalue_fields: this.docValueFields,
  size,
  collapse: {
    // this.schema.id is process.entity_id or a field that the user chooses
    field: this.schema.id,
  },
  sort: [
    {
      _script: {
        type: 'number',
        script: {
          /**
           * This script is used to sort the returned documents in a breadth first order so that we return all of
           * a single level of nodes before returning the next level of nodes. This is needed because using the
           * ancestry array could result in the search going deep before going wide depending on when the nodes
           * spawned their children. If a node spawns a child before its sibling is spawned then the child would
           * be found before the sibling because by default the sort was on timestamp ascending.
           */
          source: `
            Map ancestryToIndex = [:];
            List sourceAncestryArray = params._source.${ancestryField};
            int length = sourceAncestryArray.length;
            for (int i = 0; i < length; i++) {
              ancestryToIndex[sourceAncestryArray[i]] = i;
            }
            for (String id : params.ids) {
              def index = ancestryToIndex[id];
              if (index != null) {
                return index;
              }
            }
            return -1;
          `,
          params: {
            // nodes are the requested nodes of interest to find descendants for
            ids: nodes,
          },
        },
      },
    },
    { '@timestamp': 'asc' },
  ],
  ...
}
```

https://github.com/elastic/kibana/blob/master/x-pack/plugins/security_solution/server/endpoint/routes/resolver/tree/queries/descendants.ts#L87
Not sure if it makes a difference, but we are not using (and don't have plans to use) the `inner_hits` functionality of `collapse`.
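For reference, this is roughly the shape of request we'd like to be able to make (a sketch; the runtime field name `user_entity_id` and the source field `custom.source.id` are hypothetical stand-ins for whatever a user's custom data source provides):

```
POST logs-*/_search
{
  "runtime_mappings": {
    "user_entity_id": {
      "type": "keyword",
      "script": {
        "source": "emit(doc['custom.source.id'].value)"
      }
    }
  },
  "collapse": {
    "field": "user_entity_id"
  },
  "sort": [
    { "@timestamp": "asc" }
  ]
}
```

Today this fails with the same `collapse is not supported for the field` error described above.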