Open willkg opened 12 years ago
On IRC we discussed some alternatives:
First, add a type
argument to .facet()
that changes the facet type it's creating. Then we'd end up with code like this:
s.facet('fielda', 'fieldb') # terms facet
s.facet('fielda', type='date_histogram', interval='days') # date histogram facet
This is harder to document and we know at least "interval" is an argument that affects multiple facet types, but has different shapes of values. Ew.
Second, we could have a Facet
class with subclasses like DateHistogram
where you provide the relevant bits:
s.facet(DateHistogram('fielda', interval='days'))
This is easy to document. We have F
, so there's some precedence for this shape of things.
Third, we could have different methods:
s.facet_date_histogram('fielda', interval='days')
This is easy to document. We do different filter types as different methods, so there's precedence for this, too.
Oops--the bit about having different filter types as different methods is bogus. We only have one .filter
.
If we look at doing multiple facet types, then the second approach looks better. Then you could do something like this:
s.facet(TermsFacet('fielda', 'fieldb'), Histogram('fielda', interval=100), DateHistogram('fielda', interval='days'))
I don't know how often that comes up in practice, but it nicely mirrors the resulting ES output:
{
"query" : {
"match_all" : { }
},
"facets" : {
"tag" : {
"terms" : {
"field" : "tag",
"size" : 10
}
},
"histo1" : {
"histogram" : {
"field" : "field_name",
"interval" : 100
}
},
"histo2" : {
"date_histogram" : {
"field" : "field_name",
"interval" : "day"
}
}
}
}
Also, it's interesting to point out that the value of facets
is a dict of facet names to facets.
Maybe we should make it like this instead?:
s.facet(facetname1=DateHistogram('fielda', interval='days'))
I like that last idea best so far. It's a little wordy, but it's closer to the ES api and it's explicit in important ways.
I'll try implementing that tomorrow morning (or late tonight--whichever has more free time) and see whether it tastes yucky or not.
The last comment is from 8 months ago. Since then, we reimplemented filters and queries in an extensible way. Facets should follow suit.
Continuing from #146
I agree it's not a great idea, if for no other reason than it gets away practices set forth in django.
I'm not sure where you guys are at in this discussion, but I could see this going a could of different ways:
1: Using something like Django's aggregation framework, so something like from elasticutils.facets import Term, Histogram, Statistical, etc
And then from there do something like, S().query(lol='cats').facets(Statistical("field", options))
Or, have an options meta class that is passed in when declaring a facet, so something like this:
options = Options(options)
`S().query(lol='cats').facets(field__statistical=options)
I probably prefer the former to the later, keeps it closer to django and easier for new devs to pick up I think.
The thing is, different facet types have different options available. Those options should get exposed via the API and enforced as well.
Thoughts?
Currently
.facet()
only does terms facet. ElasticSearch supports other facet types:We want to use date histogram in Input, so we have a current need for implementing that. @rlr expressed a deep-seated yearning for histogram as well.
It'd be nice if ElasticUtils supported more facet types.