datafusion-contrib / datafusion-python

Python binding for DataFusion
https://arrow.apache.org/datafusion/python/index.html
Apache License 2.0

Exception: DataFusion error: Plan("Invalid function 'mean'") #56

Status: Closed (MrPowers closed this issue 2 years ago)

MrPowers commented 2 years ago

Here's the code I'm trying to run:

import datafusion

ctx = datafusion.SessionContext()
ctx.register_csv("x", path)
ctx.sql("select id3, sum(v1) as v1, mean(v3) as v3 from x group by id3").collect()

It's giving me the following error:

return ctx.sql("select id3, sum(v1) as v1, mean(v3) as v3 from x group by id3").collect()
Exception: DataFusion error: Plan("Invalid function 'mean'")

Other queries like this one are running without issue:

ctx.sql("select id1, sum(v1) as v1 from x group by id1").collect()

I am using DataFusion v0.6.0.

Am I missing an import, or are these functions not yet defined in the Python bindings? The query appears to run fine with the Rust version; see this PR by @andygrove. Thanks for the help!

andygrove commented 2 years ago

We have avg but not mean, so use avg(v3) instead of mean(v3).

MrPowers commented 2 years ago

Yep, that works, thanks 😄
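For anyone landing here with the same error, a minimal sketch of the corrected query follows. It reuses only the calls already shown in this thread (SessionContext, register_csv, sql, collect) and swaps mean(v3) for avg(v3). The helper names write_sample_csv and run_avg_query, and the sample rows, are made up for illustration; the import of datafusion is deferred into the function so the CSV helper works even where the package is not installed.

```python
import csv
import os
import tempfile

# Same shape as the failing query, with mean(v3) replaced by avg(v3).
QUERY = "select id3, sum(v1) as v1, avg(v3) as v3 from x group by id3"


def write_sample_csv(directory):
    """Write a tiny CSV with made-up rows (illustrative data only)."""
    path = os.path.join(directory, "x.csv")
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id3", "v1", "v3"])
        writer.writerows([["a", 1, 10.0], ["a", 2, 20.0], ["b", 3, 30.0]])
    return path


def run_avg_query(path):
    """Register the CSV as table x and run QUERY (needs the datafusion package)."""
    import datafusion  # imported lazily; hypothetical helper, not part of the library

    ctx = datafusion.SessionContext()
    ctx.register_csv("x", path)
    return ctx.sql(QUERY).collect()
```

Calling run_avg_query(write_sample_csv(tempfile.mkdtemp())) with the datafusion package installed should run the grouped query without the "Invalid function" planner error, since avg is the aggregate name the SQL planner recognizes.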

andygrove commented 2 years ago

Here's a PR to upgrade the Python bindings to use DataFusion 10.0.0

https://github.com/datafusion-contrib/datafusion-python/pull/57
