This PR makes a number of changes to how aggregation works in Rel8.
The biggest change is that we drop the Aggregate context and we return to the Profunctor-based Aggregator that Opaleye uses (as in #37). While working with Profunctors is more awkward for many common use-cases, it's ultimately more powerful. The big thing it gives you that we don't currently have is the ability to "post-map" on the result of an aggregation function. Pretend for a moment that Postgres does not have the avg function built-in. With the previous Rel8, there is no way to directly write sum(x) / count(x). The best you could do would be something like:
fmap (\(total, size) -> total / fromIntegral size) $ aggregate $ do
  foo <- each fooSchema
  pure (sum foo.x, count foo.x)
The key thing is that the mapping can only happen after aggregate is called, whereas with the Profunctor-based Aggregator this is just (/) <$> sum <*> fmap fromIntegral count. This isn't too bad if the only thing you want to do is compute the average, but if you're doing a complicated aggregation with several things happening at once then you might need to do several unrelated post-processing steps after the aggregate. We really want a way to bundle up the post-mapping with the aggregation itself and have that as a single composable unit. Another example is the listAggExpr function. The only reason Rel8 exports this is that it can't be directly expressed in terms of listAgg. With the Profunctor-based Aggregator it can be: it's just (id $*) <$> listAgg, so it no longer needs to be a special case.
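To make the "post-map" point concrete, here is a minimal sketch using a toy model in which an aggregator is just a function over a list of rows. This is an assumption made purely for illustration: it is not how Rel8 or Opaleye represent aggregators, and the toy sum and count only share their names with the real functions.

```haskell
-- Toy model only: an aggregator is a function from all input rows to a result.
-- This is NOT the representation Rel8 or Opaleye use.
import Prelude hiding (sum)
import qualified Prelude

newtype Aggregator i o = Aggregator { runAggregator :: [i] -> o }

-- Post-mapping: transform the result of an aggregation.
instance Functor (Aggregator i) where
  fmap f (Aggregator g) = Aggregator (f . g)

-- Run two aggregations over the same rows and combine their results.
instance Applicative (Aggregator i) where
  pure x = Aggregator (const x)
  Aggregator f <*> Aggregator g = Aggregator (\rows -> f rows (g rows))

sum :: Num a => Aggregator a a
sum = Aggregator Prelude.sum

count :: Aggregator a Int
count = Aggregator length

-- The division is bundled with the aggregation as one composable unit.
average :: Aggregator Double Double
average = (/) <$> sum <*> fmap fromIntegral count

main :: IO ()
main = print (runAggregator average [1, 2, 3, 4])  -- prints 2.5
```

Because average is itself an Aggregator, it can be combined with further aggregators in the same way, with no fmap needed outside the aggregation.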
The original attempt in #37 recognised that it can be awkward to have to write lmap (.x) sum. So instead of sum having the type signature Aggregator (Expr a) (Expr a), it had the type signature (i -> Expr a) -> Aggregator i (Expr a), which meant you could just type sum (.x) without using lmap. However, there are many ways to compose Aggregators. For example, if you wanted to use combinators from product-profunctor to combine aggregators, then you'd rather type sum ***! count than sum id ***! count id. So in this PR we keep the type of sum as Aggregator (Expr a) (Expr a), but we also export sumOn, which has the bundled lmap.
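The trade-off between sum and sumOn can be sketched with the same kind of toy list model. Everything here is hypothetical and for illustration only: lmapAgg stands in for Profunctor's lmap, pairAgg stands in for a product-profunctor combinator such as ***!, and Foo is an invented row type.

```haskell
-- Toy model only; not Rel8's or Opaleye's real types.
import Prelude hiding (sum)
import qualified Prelude

newtype Aggregator i o = Aggregator { runAggregator :: [i] -> o }

-- Stand-in for Profunctor's lmap: pre-process every input row.
lmapAgg :: (i' -> i) -> Aggregator i o -> Aggregator i' o
lmapAgg f (Aggregator g) = Aggregator (g . map f)

sum :: Num a => Aggregator a a
sum = Aggregator Prelude.sum

count :: Aggregator a Int
count = Aggregator length

-- sumOn is just sum with the lmap bundled in.
sumOn :: Num a => (i -> a) -> Aggregator i a
sumOn f = lmapAgg f sum

-- Stand-in for a product combinator: aggregate each half of a pair separately.
pairAgg :: Aggregator a b -> Aggregator c d -> Aggregator (a, c) (b, d)
pairAgg (Aggregator f) (Aggregator g) =
  Aggregator (\rows -> (f (map fst rows), g (map snd rows)))

-- Hypothetical row type.
data Foo = Foo { x :: Int, label :: String }

main :: IO ()
main = do
  -- Field selection is bundled into the aggregator.
  print (runAggregator (sumOn x) [Foo 1 "a", Foo 2 "b"])                  -- prints 3
  -- The plain forms compose without any extra "id" arguments.
  print (runAggregator (pairAgg sum count) [(1 :: Int, 'a'), (2, 'b')])   -- prints (3,2)
```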
The other major change is that this PR introduces two forms of aggregation — "semi"-aggregation and "full"-aggregation. Up until now, all aggregation in Rel8 was "semi"-aggregation, but "full"-aggregation feels a bit more natural and Haskelly.
Up until now, the aggregate combinator in Rel8 would return zero rows if given a query that itself returned zero rows, even if the aggregation functions that comprised it had identity values. So it was very common to see code like fmap (fromMaybeTable 0) $ optional $ aggregate $ sum <$> _. Again, we "know" that 0 is the identity value for sum and we really want some way to bundle those together and to say "return the identity value if there are zero rows". Rel8 now has this ability: it has both Aggregator and Aggregator1, with the former having identity values and the latter not. The aggregate function now takes an Aggregator and returns the identity value when encountering zero rows, whereas the aggregate1 function takes an Aggregator1 and behaves as before. count, sum, and, or and listAgg are Aggregators (with the identity values 0, 0, true, false and listTable [] respectively), while groupBy, max and min are Aggregator1s.
This also means that many is now just aggregate listAgg instead of fmap (fromMaybeTable (listTable [])) . optional . aggregate . fmap listAgg.
It should also be noted that these functions are actually polymorphic: sum will actually give you an Aggregator' that can be used as either an Aggregator or an Aggregator1 without needing to explicitly convert between them. Similarly, aggregate1 can take either an Aggregator or an Aggregator1 (though it won't use the identity value of the former).
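The difference between "full" and "semi" aggregation can be sketched in a toy model over lists, ignoring grouping entirely. The types and the names sumAgg, maxAgg and toAggregator1 below are assumptions for illustration; in particular, Rel8 uses a single polymorphic Aggregator' rather than the explicit conversion shown here.

```haskell
-- Toy model only; grouping is ignored and these are not Rel8's real types.
import Data.List.NonEmpty (NonEmpty, nonEmpty)

-- An Aggregator1 only knows how to fold one or more rows.
newtype Aggregator1 i o = Aggregator1 (NonEmpty i -> o)

-- An Aggregator also carries the value to return when there are zero rows.
data Aggregator i o = Aggregator o (NonEmpty i -> o)

-- "Semi"-aggregation: zero input rows give zero output rows.
aggregate1 :: Aggregator1 i o -> [i] -> [o]
aggregate1 (Aggregator1 f) rows = maybe [] (\ne -> [f ne]) (nonEmpty rows)

-- "Full"-aggregation: always exactly one output row.
aggregate :: Aggregator i o -> [i] -> [o]
aggregate (Aggregator z f) rows = [maybe z f (nonEmpty rows)]

-- Forget the identity value (explicit here; implicit via polymorphism in the PR).
toAggregator1 :: Aggregator i o -> Aggregator1 i o
toAggregator1 (Aggregator _ f) = Aggregator1 f

sumAgg :: Num a => Aggregator a a
sumAgg = Aggregator 0 sum

maxAgg :: Ord a => Aggregator1 a a
maxAgg = Aggregator1 maximum

main :: IO ()
main = do
  print (aggregate sumAgg ([] :: [Int]))                   -- prints [0]
  print (aggregate1 (toAggregator1 sumAgg) ([] :: [Int]))  -- prints []
  print (aggregate1 maxAgg [3, 1, 2 :: Int])               -- prints [3]
```

In this model max has no sensible value for zero rows, which is why it can only be an Aggregator1.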
Aggregation in Rel8 now supports more of the features that PostgreSQL supports. Three new combinators are introduced: distinctAggregate, filterWhere and orderAggregateBy.
Opaleye itself already supported distinctAggregate and indeed we used this to implement countDistinct as a special case, but we now support using DISTINCT on arbitrary aggregation functions.
filterWhere is new to both Rel8 and Opaleye. It corresponds to PostgreSQL's FILTER (WHERE ...) syntax in aggregations. It also uses the identity value of an Aggregator in the case where the given predicate matches zero rows. There is also filterWhereOptional, which can be used with Aggregator1s.
orderAggregateBy allows the values within an aggregation to be ordered using a given ordering, which mainly matters for non-commutative aggregation functions like listAgg.
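As a rough intuition for the three combinators, here is a sketch in a toy list model. The signatures below are simplified assumptions and do not match Rel8's real ones, which work on Expr values and Rel8's own ordering type; the identity-value fallback is modelled simply by list functions that already handle the empty list.

```haskell
-- Toy model only; signatures are simplified stand-ins, not Rel8's API.
import Data.List (nub, sortBy)

newtype Aggregator i o = Aggregator { runAggregator :: [i] -> o }

sumAgg :: Num a => Aggregator a a
sumAgg = Aggregator sum   -- sum [] is 0, playing the role of the identity value

listAgg :: Aggregator a [a]
listAgg = Aggregator id

-- Like FILTER (WHERE ...): only rows matching the predicate are aggregated.
filterWhere :: (i -> Bool) -> Aggregator i o -> Aggregator i o
filterWhere p (Aggregator f) = Aggregator (f . filter p)

-- Like DISTINCT inside an aggregate: duplicate inputs are dropped first.
distinctAggregate :: Eq i => Aggregator i o -> Aggregator i o
distinctAggregate (Aggregator f) = Aggregator (f . nub)

-- Like ORDER BY inside an aggregate: inputs are sorted before aggregating.
orderAggregateBy :: (i -> i -> Ordering) -> Aggregator i o -> Aggregator i o
orderAggregateBy cmp (Aggregator f) = Aggregator (f . sortBy cmp)

main :: IO ()
main = do
  let rows = [3, 1, 3, 2] :: [Int]
  print (runAggregator (filterWhere (> 5) sumAgg) rows)          -- prints 0
  print (runAggregator (distinctAggregate sumAgg) rows)          -- prints 6
  print (runAggregator (orderAggregateBy compare listAgg) rows)  -- prints [1,2,3,3]
```

Ordering makes no difference to sumAgg here, but it changes the result of listAgg, which is the non-commutative case mentioned above.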