apache / datafusion-ballista

Apache DataFusion Ballista Distributed Query Engine
https://datafusion.apache.org/ballista
Apache License 2.0

Invalid argument error: lz4 IPC decompression requires the lz4 feature #929

Open andygrove opened 11 months ago

andygrove commented 11 months ago

Describe the bug

I am trying to run queries with Ballista but I get this error on the client.

Fail: Arrow error: Invalid argument error: lz4 IPC decompression requires the lz4 feature

I am not yet sure where this is coming from.

I do not see any lz4 feature in the Ballista Cargo.toml files.

andygrove commented 11 months ago

@Dandandan Any suggestions on tracking this down?

Dandandan commented 11 months ago

You can enable lz4 compression support via the ipc_compression feature flag of the arrow crate.

Dandandan commented 11 months ago

Hm, but this seems to be enabled already. Are you somehow running a version without ipc_compression set?

andygrove commented 11 months ago

I built Ballista from this PR with cargo build --release.

My client is at https://github.com/sql-benchmarks/sqlbench-runners/pull/32 and has these dependencies:

[dependencies]
ballista = { git = "https://github.com/andygrove/arrow-ballista", branch="df-34" }
datafusion = { git = "https://github.com/apache/arrow-datafusion", rev = "d091b55be6a4ce552023ef162b5d081136d3ff6d" }

It has been a long time since I worked on this project so maybe I am just missing something obvious.

Dandandan commented 11 months ago

Is the branch compiling already? It seems it might not be, so you might be running an older (cached) version: https://github.com/andygrove/arrow-ballista/actions/runs/7169227281/job/19519719152#step:8:443

andygrove commented 11 months ago

> Is the branch compiling already?

Yes, it was just the tests that weren't compiling in CI. I pushed a fix. I ran a cargo clean locally on ballista and my client and still see the same issue.

Dandandan commented 11 months ago

> > Is the branch compiling already?
>
> Yes, it was just the tests that weren't compiling in CI. I pushed a fix. I ran a cargo clean locally on ballista and my client and still see the same issue.

Does enabling it in the sqlbench runner as well, by explicitly adding an arrow dependency with the ipc_compression feature, work?

andygrove commented 11 months ago

> Does enabling it in the sqlbench runner as well, by explicitly adding an arrow dependency with the ipc_compression feature, work?

Yes, that fixed it, thanks. I can create a PR to add this to the documentation, but it would be nice if this weren't needed. Shouldn't Ballista enable this feature in the arrow dependency by default?
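
For anyone hitting the same error, the workaround amounts to declaring arrow directly in the client's Cargo.toml with the ipc_compression feature turned on. A minimal sketch, reusing the dependencies listed earlier in this thread, might look like the following. The arrow version shown is an assumption: it has to be semver-compatible with the arrow version that the pinned datafusion revision resolves to (check Cargo.lock), otherwise Cargo builds a second copy of arrow and the feature never reaches the copy that Ballista uses.

```toml
[dependencies]
ballista = { git = "https://github.com/andygrove/arrow-ballista", branch="df-34" }
datafusion = { git = "https://github.com/apache/arrow-datafusion", rev = "d091b55be6a4ce552023ef162b5d081136d3ff6d" }
# Explicit arrow dependency whose only purpose is to enable the
# ipc_compression feature (lz4 and zstd codecs for Arrow IPC).
# ASSUMPTION: "49" is illustrative; use the arrow version that
# Cargo.lock shows for the datafusion revision above.
arrow = { version = "49", features = ["ipc_compression"] }
```

This works because Cargo unifies features across a single resolved copy of a crate, so enabling the feature in the client also enables it for the arrow code that Ballista and DataFusion call into.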