Closed rudolf closed 2 years ago
Pinging @elastic/kibana-core (Team:Core)
Would it be possible to refer to the `search.max_buckets` constant instead? If tomorrow we have 201 saved object types, we might inadvertently miss one of them. Since the number of SO types should not grow like crazy, I imagine we should be fine specifying a large size. IDK if there's a performance penalty, though.
Yeah, I agree that we would want some type of assurance that we're not missing data. We could use `search.max_buckets`, which defaults to 10k, but like you say there is a performance risk. We could either test the performance with 10k types (and/or speak to the Elasticsearch team) to mitigate the risk, or defer this to the future and artificially limit it to a smaller number.
Another variation on afharo's suggestion of a CI test would be to have a development-only assertion that crashes the server if there are > 0 results in the `other` bucket. In production, 3rd-party plugins could register more types, so we might see `other` showing up in our telemetry.
Not sure what the best approach is, but for telemetry we should be careful about anything that could have a big performance impact, so I'd lean towards a lower value with some way of detecting that we're approaching the limit.
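As a sketch of that detection idea: a `terms` aggregation response reports the documents that didn't fit into the returned buckets as `sum_other_doc_count`, so a development-only guard could check it. The helper below is hypothetical, not the actual Kibana code:

```typescript
// Hypothetical development-only guard for the SO-types telemetry aggregation.
// A `terms` aggregation response exposes the count of documents that did not
// fit into the returned buckets as `sum_other_doc_count`.
interface TermsAggResult {
  buckets: Array<{ key: string; doc_count: number }>;
  sum_other_doc_count: number;
}

function assertNoMissedTypes(agg: TermsAggResult, isDevMode: boolean): void {
  if (isDevMode && agg.sum_other_doc_count > 0) {
    // Crash early in development so the aggregation size gets bumped
    // before the limit is hit in production.
    throw new Error(
      `saved object telemetry missed ${agg.sum_other_doc_count} documents; increase the aggregation size`
    );
  }
}
```

In production the guard is a no-op, so 3rd-party types overflowing into `other` would only show up in telemetry rather than crash the server.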
After some internal discussion (please @rudolf keep me honest here), our ideal report would include:

- an `others` bucket -> this would show if any third-party plugins are registering their own SOs
- a `total` with the total count of SOs

The problem with the current `terms` aggregation is that it works in a Top-N fashion. This means that if a third-party plugin registers a new type of SO and it's very popular, it will show up in the `per_type` list and push other built-in types into the `others` bucket.

IMO, if we want the built-in types to always show in the detailed `per_type` view, and the third-party ones to always fall into the `others` bucket, we may need to change the aggregation.
I can think of 2 ways of doing this:
1. `filters` aggregation: we control each bucket by explicitly defining all the buckets for the known built-in SO types.
2. `terms` aggregation, but sorting the values using the `_score`: we need to list in the query all the built-in SO types so they score higher than non-built-in ones and show up on top of the `terms` aggregation.
```
GET .kibana/_search
{
  "size": 0,
  "track_total_hits": true,
  "query": {
    "bool": {
      "must": [{ "match_all": {} }],
      // When there's a `must`, `should` is used to score higher (not as an additional filter). We need to list here all the built-in types.
      "should": [{ "terms": { "type": ["map", "VALUE2"] } }]
    }
  },
  "aggs": {
    "types": {
      "terms": {
        "field": "type",
        "order": { "max_score": "desc" },
        "size": 200 // The size should match the length of the array in `should`
      },
      "aggs": {
        // We cannot reach the _score inside the terms aggregation, so we need this extra sub-aggregation to surface it.
        "max_score": { "max": { "script": "_score" } }
      }
    }
  }
}
```
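For comparison, the first option could look something like the sketch below, using a `filters` aggregation together with Elasticsearch's built-in `other_bucket` support. The type list here is illustrative, not the real set of built-in types:

```
GET .kibana/_search
{
  "size": 0,
  "track_total_hits": true,
  "aggs": {
    "types": {
      "filters": {
        // One explicit, named bucket per known built-in SO type.
        "filters": {
          "map": { "term": { "type": "map" } },
          "dashboard": { "term": { "type": "dashboard" } }
        },
        // Everything that matches none of the filters above lands here,
        // so third-party types can never displace a built-in bucket.
        "other_bucket": true,
        "other_bucket_key": "others"
      }
    }
  }
}
```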
The problem with that approach is that we need to keep a list of all the built-in saved objects and add tests to make sure that folks keep the list updated the moment they register new types.
What do you think?
I created #135837 based on the 2nd approach (`terms` aggregation sorted by `_score`) + an `others` bucket.
It adds a Jest integration test that compares the hardcoded list of SO types against the currently registered ones. If they don't match, it'll fail. Hopefully, it doesn't add too much burden to folks adding/removing SOs.
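The core of such a test boils down to a set comparison. The helper below is a hedged sketch of the idea, not the code from the PR, and the function name is made up:

```typescript
// Hypothetical sketch: fail when the hardcoded list of SO types drifts
// from what is actually registered at runtime.
function diffTypeLists(
  hardcoded: string[],
  registered: string[]
): { missing: string[]; stale: string[] } {
  const hard = new Set(hardcoded);
  const reg = new Set(registered);
  return {
    // Registered but not in the hardcoded list: someone added a type.
    missing: Array.from(reg).filter((t) => !hard.has(t)),
    // Hardcoded but no longer registered: someone removed a type.
    stale: Array.from(hard).filter((t) => !reg.has(t)),
  };
}
```

A test would then assert that both `missing` and `stale` are empty, with a failure message telling the author which list to update.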
The usage collector for saved objects per type uses a `terms` aggregation without specifying the size, which causes it to default to 10 buckets. That means we're only getting the top 10 buckets on each cluster. This is probably sufficient for Core to track which saved object types are most likely to cause scalability problems for users. But it's not enough for teams wanting to use this data for telemetry, so it probably also blocks us from removing the kibana collector: https://github.com/elastic/kibana/issues/99464
https://github.com/elastic/kibana/blob/main/src/plugins/kibana_usage_collection/server/collectors/saved_objects_counts/get_saved_object_counts.ts#L11-L31
Instead, I think we should aim to collect all saved object types by using a large size such as 200.
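A minimal version of that change could look like the sketch below (the index and the size of 200 are illustrative):

```
GET .kibana/_search
{
  "size": 0,
  "aggs": {
    "types": {
      // Without an explicit size, `terms` returns only the top 10 buckets.
      "terms": { "field": "type", "size": 200 }
    }
  }
}
```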
Related: https://github.com/elastic/kibana/pull/99344