Open jordanlewis opened 5 years ago
@jordanlewis whats your status on the case operator? I feel like the strategy for repeating the batch as done there to keep adding to a selection vector could be good here.
Status is that it's kinda on hold til the FK stuff gets finished, and hoping to get some more reviews on it too. I saw your review (thanks!) but it's kinda complicated and I'd appreciate some more scrutiny :)
I agree though that your proposed strategy could work. It actually might be as simple as planning a selection right before the aggregation step actually runs.
yeah, thats what I was thinking. if we just select out the rows that don't pass the filter then we might be good. No worries about my review -- i don't have much authority over the codebase yet.
One edge case is what happens when the whole group is filtered out? We still need to let aggregate function populate the output for such group, otherwise there will be a misalignment between any_not_null
and all other functions. I think if we don't have any FILTER clauses, handling DISTINCT is relatively easy, but FILTER clause makes it extremely hard to not plumb the logic into each aggregate function.
It's been a while since I struggled with that (but I did spend like a day trying to get it working in all cases), and here is a snippet of what I had in the end. I can't easily describe why I came to realization that there is no way around plumbing the FILTERing logic into the aggregate function because I forgot and lost all the context, but at the time I was certain of it.
In the code snippet I was trying to modify the groups
vector so that FILTERing would work in all cases (another case that comes to mind: how do you handle the scenario when the "head" of the group is filtered out? (the "head" is the first tuple in the group for which groups[i]==true
)).
We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!
Add filtering aggregator support to the vectorized engine. This is an extra filter step that only includes rows in the aggregation if they pass a filter.
Jira issue: CRDB-5586