Closed. max-moser closed this 1 year ago
:warning: As per the discussion with @slint (Discord chat here), this feature needs more discussion about which use cases we expect in the future and how we want to design the data structure so that we don't box ourselves in.
As discussed, we couldn't come up with any immediate use cases that actually depend on this feature being merged. We only found a few nice-to-have future use cases (e.g. showing from which countries the records/files were downloaded), but they would need to be fleshed out some more.
The tricky part here is that aggregations are only an "intermediate" result, because the queries still need some way of aggregating the aggregations (`events` -> `aggregations` -> `queries`).
As such, the aggregations need to be aggregatable themselves.
:woozy_face:
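To make the "aggregatable aggregations" requirement concrete, here is a minimal sketch in plain Python. It assumes each per-interval aggregation document carries a metric `count` plus a flat keyword count under `via_api`; the function name `merge_aggregations` and the document shape are hypothetical and are not taken from this PR's code.

```python
from collections import Counter


def merge_aggregations(aggregations, bucket_field="via_api"):
    """Combine per-interval aggregation documents into one result.

    Hypothetical document shape (an assumption, not the PR's schema):
    {"count": <int>, "<bucket_field>": {"<keyword>": <doc count>, ...}}
    """
    total = 0
    buckets = Counter()
    for agg in aggregations:
        total += agg["count"]
        # Flat keyword counts are aggregatable: summing them per key
        # gives the keyword counts for the combined interval.
        buckets.update(agg.get(bucket_field, {}))
    return {"count": total, bucket_field: dict(buckets)}


# Two made-up daily aggregations for illustration only.
daily = [
    {"count": 5, "via_api": {"true": 2, "false": 3}},
    {"count": 4, "via_api": {"true": 1, "false": 3}},
]
result = merge_aggregations(daily)
```

Flat keyword counts can be summed key by key at query time, which is what keeps them mergeable; nested buckets would need a recursive merge.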
This PR adds support for bucket aggregations as cousins to the already supported metric aggregations. Its main purpose is to enable simple document counts based on keywords ("how often does each value for a key occur in all events"), e.g. for the boolean `via_api` flag for `record-view` events:
Reference payload:
Limitations:
1) The bucket aggregations here just add "flat" (keyword count) information to the resulting aggregation, but can't be used to do nested bucketing. That would probably increase complexity a lot and isn't required right now.
2) For now, only `terms`-type bucket aggregations are supported.
Discussion:
v1:
v2:
Use cases:
1) Requests on Zenodo support: from which countries was the record viewed/downloaded
2) Charts: https://github.com/inveniosoftware/invenio-stats/issues/120
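For the country use case, a `terms`-type bucket aggregation in the Elasticsearch/OpenSearch query DSL might look roughly like the sketch below. The helper name `terms_aggregation_body` and the field name `country` are assumptions for illustration; this is a raw search body, not the configuration format introduced by this PR.

```python
def terms_aggregation_body(field, size=10):
    """Build a search body that counts documents per value of `field`.

    `field` is assumed to be a keyword (or boolean) field in the
    events index; the name passed in below is hypothetical.
    """
    return {
        # No hits are needed, only the aggregation result.
        "size": 0,
        "aggs": {
            f"{field}_counts": {
                # "terms" creates one bucket per distinct value.
                "terms": {"field": field, "size": size},
            }
        },
    }


body = terms_aggregation_body("country", size=5)
```

The response would contain one bucket per country with a `doc_count`, which is the flat keyword count described in the limitations.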