prestodb / presto

The official home of the Presto distributed SQL query engine for big data
http://prestodb.io
Apache License 2.0
15.97k stars 5.35k forks source link

Filter null rows before aggregation #23052

Open jackychen718 opened 3 months ago

jackychen718 commented 3 months ago

Description

Filter rows with all fileds null before aggregation.

Motivation and Context

The issue comes from #17486 Add optimiztion to drop groups/rows with all fields null

Impact

Add another opitimization rule: RemoveNullRowInAggregation.java

Test Plan

Test if any error made after applying the new optimization rule.

Contributor checklist

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* ... :pr:`12345`
* ... :pr:`12345`

Hive Connector Changes
* ... :pr:`12345`
* ... :pr:`12345`

== NO RELEASE NOTE ==

jaystarshot commented 3 months ago

Can you add unit tests? Also there must be a feature flag controlling this since I am not sure if this optimization will be always effective. What happens in count(*) case?

jackychen718 commented 3 months ago

Can you add unit tests? Also there must be a feature flag controlling this since I am not sure if this optimization will be always effective. What happens in count(*) case?

I will add unit tests. Currently, I did not consider count(*) as a special way. From your reminding, I do not think I can still add filter before aggregation in this scenario. I will exclude this scenario in the pattern match process. Is it ok? @jaystarshot

tdcmeehan commented 1 month ago

@jackychen718 are you still working on this PR?

jackychen718 commented 2 weeks ago

@tdcmeehan I keep in mind this task is not done yet. After I merge other PRs, I will come back to this one.