tomjaguarpaw / haskell-opaleye

Other
602 stars 115 forks source link

Add support for Postgres' `FILTER (WHERE ...)` syntax in aggregations #564

Closed shane-circuithub closed 1 year ago

tomjaguarpaw commented 1 year ago

Looks great, thanks! And I appreciate you breaking off the commit that adds the new data type. This is marked as "draft" but is it ready to be merged?

I left a couple of suggestions, but I'm happy to do them myself.

shane-circuithub commented 1 year ago

Hey! I marked it as a draft because I'm still testing this with Rel8. In particular, the case that's tricky to deal with (and I think this will be a problem for Opaleye too) is the case where the filter returns false for all rows. That reintroduces the problem where Postgres returns null instead of the identity of the aggregation function (e.g., {} for array_agg or 0 for sum). I believe you sidestep this issue for aggregations in general with the GROUP BY COALESCE(0) trick but I don't see an obvious way to apply something like that here. So I'm exploring possible solutions to that problem.

tomjaguarpaw commented 1 year ago

Ah, OK. Well, for non-nullable types we could just filter out the nulls afterwards. A technique that would work for nullable types would be to do another aggregation of sum(0) (or similar) and filter out the null on that.

shane-circuithub commented 1 year ago

I've changed filterWhere to return a MaybeFields now. I think that's the best we can do right now. I don't think we want to filter out rows that would return nulls entirely, because you might have different filters conditions on different columns in the same aggregation and you might want to be able to observe the different results — but if one of those filters returns false for everything then you get a null and then if we filter out nulls then all the information is wiped out.

tomjaguarpaw commented 1 year ago

Oh, good point! OK, let me think about it a little more.

tomjaguarpaw commented 1 year ago

I believe we can still do my sum(0) trick but instead of using it to filter the result table (which as you say, doesn't make much sense), use it to determine whether to return a nothingFields or justFields. That is, return nothingFields if the filter results in zero rows and justFields if it results in a strictly positive number of rows. What do you think?

shane-circuithub commented 1 year ago

That's pretty much what I'm doing in this PR, except instead of sum(0) I'm using bool_and(true). I then directly use the result of this as the tag for the MaybeFields.

tomjaguarpaw commented 1 year ago

Ah great! I didn't look closely enough at what you were doing then.

tomjaguarpaw commented 1 year ago

[FYI the CI failure here is just a flake]

tomjaguarpaw commented 1 year ago

Sorry, still haven't got around to looking at this, but it is quite high up my TODO list.

tomjaguarpaw commented 1 year ago

I've split this into smaller commits at https://github.com/tomjaguarpaw/haskell-opaleye/pull/565. It looks good! It still needs at least a basic test, but other than that I think it's good to go. I've lost energy for the day but will pick it back up another time.

tomjaguarpaw commented 1 year ago

Completed by https://github.com/tomjaguarpaw/haskell-opaleye/pull/565