quickwit-oss / tantivy-py

Python bindings for Tantivy
MIT License
274 stars 63 forks source link

feat: Aggregations API #288

Closed ditsuke closed 3 months ago

ditsuke commented 4 months ago

What

Adds support for aggregations with a Searcher.aggregate method.

Why

At the moment tantivy-py does not support aggregations in any capacity. This PR adds support for all aggregations with an aggregate method on Searcher which should closely match the existing search method. I also started a discussion (ref) on the Quickwit Community Discord earlier this month to get feedback on the API. While there was not much feedback, the general consensus I think is that we should also eventually expose Collector in tantivy-py to allow for more advanced usage and consistency with the Rust API. While I agree with this strongly (part of the reason I started the discussion), I think it is a good idea to start with this simpler API to get aggregation support going like we have with search.

Note that at the moment the aggregations are returned as a dictionary. I'm still considering whether its worth the effort to port all the types over to pyo3 or if there's another better and simpler way. On the other hand, a dict alone might also work just fine. Please feel free to provide feedback on this.

ditsuke commented 4 months ago

Open for review

cjrh commented 4 months ago

@ditsuke There are many formatting changes in this PR. Could you separate the formatting changes into a different PR? Also, how did you do the formatting, is it just with black, or ruff format or something else?

cjrh commented 4 months ago

Also, while not essential, if you could add a small example in the tutorial section of the docs that would be greatly appreciated. It's up to you if you want, or if you prefer to do that in a separate PR, or not at all :)

https://tantivy-py.readthedocs.io/en/latest/tutorials/

ditsuke commented 4 months ago

@cjrh thanks for taking a look. The formatting changes were made by ruff, and while we're on that topic may I suggest deciding on a formatter for the project for consistency? As for now I've reverted those changes (definitely did not belong in this PR). I would like to add the tutorial in a follow up as well.

ditsuke commented 3 months ago

Thanks for merging!