apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0

Introduce `sum_distinct()` function to dataframe #2407

Open WinkerDu opened 2 years ago

WinkerDu commented 2 years ago

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

See #2405: a DataFrame interface for `sum(distinct)` usage.


alamb commented 2 years ago

Perhaps a function like `sum_distinct` (following the pattern of `count_distinct`) would be good:

https://github.com/apache/arrow-datafusion/blob/c3c02cf5e881c9c5020ad87417716dd932e69a69/datafusion/core/src/logical_plan/mod.rs#L43
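
To make the suggestion concrete, here is a minimal sketch of the pattern, assuming `count_distinct` is a thin helper that builds an aggregate expression with a distinct flag set. The `Expr` and `AggregateFunction` types below are simplified, hypothetical stand-ins written for illustration; they are not DataFusion's actual definitions, and the real signatures may differ.

```rust
// Simplified stand-ins for illustration only; NOT DataFusion's real types.
#[derive(Debug, Clone, PartialEq)]
pub enum AggregateFunction {
    Count,
    Sum,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    /// Reference to a named column.
    Column(String),
    /// Aggregate call; `distinct: true` models `AGG(DISTINCT ...)`.
    AggregateFunction {
        fun: AggregateFunction,
        distinct: bool,
        args: Vec<Expr>,
    },
}

/// Build a column reference (hypothetical helper for this sketch).
pub fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

/// Assumed shape of the existing helper: `COUNT(DISTINCT expr)`.
pub fn count_distinct(expr: Expr) -> Expr {
    Expr::AggregateFunction {
        fun: AggregateFunction::Count,
        distinct: true,
        args: vec![expr],
    }
}

/// Proposed helper following the same pattern: `SUM(DISTINCT expr)`.
pub fn sum_distinct(expr: Expr) -> Expr {
    Expr::AggregateFunction {
        fun: AggregateFunction::Sum,
        distinct: true,
        args: vec![expr],
    }
}
```

Under these assumptions, `sum_distinct(col("a"))` differs from `count_distinct(col("a"))` only in the aggregate function it selects, so the new helper would add no new expression variant.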