apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0

Documentation for running benchmarks with simd support does not work for me #1577

Closed: andygrove closed this issue 2 weeks ago

andygrove commented 2 years ago

Describe the bug: I tried running this command from https://github.com/apache/arrow-datafusion/tree/master/benchmarks:

cargo run --release --features "simd mimalloc" --bin tpch -- benchmark datafusion --iterations 1 --path /mnt/bigdata/tpch/sf100-24part-parquet --format parquet --query 1 --batch-size 4096

It failed with:

$ cargo run --release --features "simd mimalloc" --bin tpch -- benchmark datafusion --iterations 1 --path /mnt/bigdata/tpch/sf100-24part-parquet --format parquet --query 1 --batch-size 4096
   Compiling libm v0.1.4
   Compiling libmimalloc-sys v0.1.22
   Compiling packed_simd_2 v0.3.5
error[E0557]: feature has been removed
   --> /home/andy/.cargo/registry/src/github.com-1ecc6299db9ec823/packed_simd_2-0.3.5/src/lib.rs:215:5
    |
215 |     const_generics,
    |     ^^^^^^^^^^^^^^ feature has been removed
    |
    = note: removed in favor of `#![feature(adt_const_params)]` and `#![feature(generic_const_exprs)]`

It works fine if I do not specify features.

To Reproduce: See above.

Expected behavior: The benchmark should not fail to compile when these features are enabled.

Additional context: I am using rustc 1.58.0 (02072b482 2022-01-11).

Jefffrey commented 2 years ago

This looks to be fixed by https://github.com/rust-lang/packed_simd/commit/45d5347a0d2187c046a546a477d2a53111cd7713, which was released in version 0.3.6. The version currently resolved is 0.3.8:

jeffrey:~/Code/arrow-datafusion/benchmarks$ cargo tree --features "simd mimalloc" | grep simd
│   ├── packed_simd_2 v0.3.8
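
For anyone who wants to script the same check, here is a minimal sketch. The helper names `packed_simd_version` and `version_at_least` are hypothetical (they are not part of cargo or DataFusion), the 0.3.6 threshold is taken from the comment above, and the comparison assumes a `sort` that supports `-V` (as GNU coreutils does).

```shell
#!/usr/bin/env bash
# Sketch: check whether the resolved packed_simd_2 version includes the fix.
# MIN_FIXED is the release mentioned in the comment above.
MIN_FIXED="0.3.6"

# Hypothetical helper: read `cargo tree` output on stdin and print the first
# packed_simd_2 version found, without the leading "v".
packed_simd_version() {
    grep -o 'packed_simd_2 v[0-9][0-9.]*' | head -n 1 | sed 's/.* v//'
}

# Hypothetical helper: succeed if version $1 is greater than or equal to $2.
# Assumes `sort -V` (version sort) is available.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example usage, run from the benchmarks directory:
#   resolved=$(cargo tree --features "simd mimalloc" | packed_simd_version)
#   version_at_least "$resolved" "$MIN_FIXED" && echo "fix included"
```

If the resolved version turns out to be older than 0.3.6, `cargo update -p packed_simd_2` may pull in a newer release, provided the version requirements in the dependency graph allow it.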

And the error is not reproducible:

jeffrey:~/Code/arrow-datafusion/benchmarks$ cargo run --release --features "simd mimalloc" --bin tpch -- benchmark datafusion --iterations 1 --path /home/jeffrey/tmpdata --format parquet --query 1 --batch-size 4096
    Finished release [optimized] target(s) in 0.09s
     Running `/home/jeffrey/Code/arrow-datafusion/target/release/tpch benchmark datafusion --iterations 1 --path /home/jeffrey/tmpdata --format parquet --query 1 --batch-size 4096`
Running benchmarks with the following options: DataFusionBenchmarkOpt { query: 1, debug: false, iterations: 1, partitions: 2, batch_size: 4096, path: "/home/jeffrey/tmpdata", file_format: "parquet", mem_table: false, output_path: None, disable_statistics: false }
Query 1 iteration 0 took 589.6 ms and returned 4 rows
Query 1 avg time: 589.63 ms

(rustc 1.66.0-nightly (1898c34e9 2022-10-26))

drauschenbach commented 2 weeks ago

Obsolete now that simd is no longer a feature.

alamb commented 2 weeks ago

Thanks @drauschenbach