apache / datafusion-ballista

Apache DataFusion Ballista Distributed Query Engine
https://datafusion.apache.org/ballista
Apache License 2.0

Invalid argument error: lz4 IPC decompression requires the lz4 feature #929

Open andygrove opened 6 months ago

andygrove commented 6 months ago

Describe the bug

I am trying to run queries with Ballista but I get this error on the client.

Fail: Arrow error: Invalid argument error: lz4 IPC decompression requires the lz4 feature

I am not yet sure where this is coming from.

I do not see any lz4 feature in the Ballista Cargo.toml files.

andygrove commented 6 months ago

@Dandandan Any suggestions on tracking this down?

Dandandan commented 6 months ago

You can enable lz4 compression support via the ipc_compression feature flag of arrow.

Dandandan commented 6 months ago

Hm, but this seems to be enabled currently. Are you somehow running a version without ipc_compression set?

andygrove commented 6 months ago

I built Ballista from this PR with cargo build --release.

My client is at https://github.com/sql-benchmarks/sqlbench-runners/pull/32 and has these dependencies:

[dependencies]
ballista = { git = "https://github.com/andygrove/arrow-ballista", branch="df-34" }
datafusion = { git = "https://github.com/apache/arrow-datafusion", rev = "d091b55be6a4ce552023ef162b5d081136d3ff6d" }

It has been a long time since I worked on this project so maybe I am just missing something obvious.

Dandandan commented 6 months ago

Is the branch compiling already? It seems it might not be, in which case you might be running an older (cached) version: https://github.com/andygrove/arrow-ballista/actions/runs/7169227281/job/19519719152#step:8:443

andygrove commented 6 months ago

> Is the branch compiling already?

Yes, it was just the tests that weren't compiling in CI. I pushed a fix. I ran a cargo clean locally on ballista and my client and still see the same issue.

Dandandan commented 6 months ago

>> Is the branch compiling already?
>
> Yes, it was just the tests that weren't compiling in CI. I pushed a fix. I ran a cargo clean locally on ballista and my client and still see the same issue.

Does enabling it in the sqlbenchrunner as well by explicitly adding arrow dependency with ipc_compression work?

andygrove commented 6 months ago

> Does enabling it in the sqlbenchrunner as well by explicitly adding arrow dependency with ipc_compression work?

Yes, that fixed it, thanks. I can create a PR to add this to the documentation, but it would be nice if this weren't needed. Shouldn't Ballista enable this feature in the arrow dependency by default?
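
For readers hitting the same error, the workaround confirmed above might look roughly like the following in the client's Cargo.toml. This is only a sketch: the thread does not show the exact line that was added, and the arrow version here is an assumption. It needs to match the arrow version that ballista and datafusion already resolve to (running `cargo tree -i arrow` in the client project should show it); otherwise Cargo may build a second copy of arrow and the feature would not apply to the one doing the IPC decoding.

```toml
[dependencies]
ballista = { git = "https://github.com/andygrove/arrow-ballista", branch = "df-34" }
datafusion = { git = "https://github.com/apache/arrow-datafusion", rev = "d091b55be6a4ce552023ef162b5d081136d3ff6d" }
# Assumption: "49" is a placeholder; use whatever arrow version the two
# dependencies above actually pull in.
# ipc_compression turns on lz4 (and zstd) support for Arrow IPC.
arrow = { version = "49", features = ["ipc_compression"] }
```

Cargo unions the features requested by every dependent of a given crate version, so requesting the feature from the client crate is enough to have it compiled into the shared arrow build.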