Open stfaun opened 4 years ago
Considering the above problem, implementing a top_one metrics aggregation as a SingleValue metrics aggregation may be an alternative solution.
Like the top_hits and top_metrics metrics aggregations, the new top_one metrics aggregation needs a sort parameter to determine how to sort the docs in a bucket.
Unlike the top_hits and top_metrics metrics aggregations, the new top_one metrics aggregation should be a SingleValue metrics aggregation. Therefore, it can only extract one field, taken from the first doc in the order given by the specified sort fields.
The new top_one metrics aggregation also provides an ignore_null parameter to determine whether null values of the target field should be ignored.
The new top_one metrics aggregation may be used as follows:
GET /exams/_search
{
  "size": 0,
  "aggs": {
    "first_grade": {
      "top_one": {
        "value": {
          "field": "grade"
        },
        "sort": {
          "timestamp": "asc"
        },
        "ignore_null": true
      }
    }
  }
}
Which yields a response like:
{
  ...
  "aggregations": {
    "first_grade": {
      "value": 70.0
    }
  }
}
Because the new top_one metrics aggregation is a SingleValue metrics aggregation, its result can be used in the buckets_path of bucket_script/bucket_selector/bucket_sort.
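As a rough sketch of how that could look, the request below groups exams by student and feeds two top_one results into a bucket_script. The top_one aggregation and its value, sort, and ignore_null parameters are only the proposal from this issue, not an existing Elasticsearch API; the student field and the aggregation names by_student, last_grade, and improvement are assumptions made for illustration.

```
# Sketch only: top_one is the aggregation proposed in this issue and does not exist in Elasticsearch.
# The "student" field and the aggregation names are hypothetical.
GET /exams/_search
{
  "size": 0,
  "aggs": {
    "by_student": {
      "terms": { "field": "student" },
      "aggs": {
        "first_grade": {
          "top_one": {
            "value": { "field": "grade" },
            "sort": { "timestamp": "asc" },
            "ignore_null": true
          }
        },
        "last_grade": {
          "top_one": {
            "value": { "field": "grade" },
            "sort": { "timestamp": "desc" },
            "ignore_null": true
          }
        },
        "improvement": {
          "bucket_script": {
            "buckets_path": {
              "first": "first_grade",
              "last": "last_grade"
            },
            "script": "params.last - params.first"
          }
        }
      }
    }
  }
}
```

Each buckets_path entry here points at an aggregation that would produce exactly one number per bucket, which is the reason for proposing a SingleValue aggregation.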
I would like to implement this feature, but I'm not sure whether the top_one metrics aggregation is a good design. Or would the previous first/last metrics aggregations be better?
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)
Hiya @stfaun, thanks for opening this issue! I'm going to mark this as team-discuss, so that the analytics team can chat about it. I think it's an interesting use-case, but I'm personally not sure if it'd be better served as a modification to top-metrics (some kind of flag or mode when only one field is needed?) or as a whole new agg as you suggest. Both approaches have pros/cons.
Will write back once we've discussed!
Hiya @stfaun, we chatted about this and were curious if a filter aggregation + exists query would solve your needs?
The filter aggregation will ensure that all documents inside the bucket match the provided query/filter, and the exists query can be used to ensure that all documents have the desired field (so that the "top one" document doesn't have a null value). You can then use the top-metrics agg specifying a single field, and you should get the "top" doc that has a non-null value.
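A minimal sketch of that suggestion, reusing the exams index and the grade and timestamp fields from the example request in this issue. The aggregation names with_grade and first_grade are placeholders chosen here, not part of the original suggestion.

```
# Sketch only: the aggregation names with_grade and first_grade are placeholders.
GET /exams/_search
{
  "size": 0,
  "aggs": {
    "with_grade": {
      "filter": {
        "exists": { "field": "grade" }
      },
      "aggs": {
        "first_grade": {
          "top_metrics": {
            "metrics": { "field": "grade" },
            "sort": { "timestamp": "asc" }
          }
        }
      }
    }
  }
}
```

The exists filter removes documents without a grade before top_metrics sorts by timestamp, so the top document should always carry a non-null grade.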
Hi @polyfractal, the solution you suggest does work for me. I have confirmed that it can be used in the buckets_path of bucket_script/bucket_selector/bucket_sort.
Thanks for your replies.
Relates to #35639
At the beginning, I thought new first/last metrics aggregations should be implemented as SingleValue metrics aggregations to support the feature.
But separate first/last metrics aggregations seem to be redundant. They can be combined into a single top metrics aggregation.
I have found that the top_metrics metrics aggregation may be appropriate for my requirement. But when the first sorted document has no value for the target field, the top_metrics metrics aggregation returns a null value as the result rather than skipping that document.
I understand why the top_metrics metrics aggregation behaves this way. A top_metrics metrics aggregation may return several fields at the same time, so it should not ignore any doc in the bucket.