apache / datafusion-ballista

Apache DataFusion Ballista Distributed Query Engine
https://datafusion.apache.org/ballista
Apache License 2.0

Enable benchmark data validation for distributed execution #454

Open andygrove opened 3 years ago

andygrove commented 3 years ago

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

The TPC-H benchmark suite already has a feature for verifying that results are correct when executing in-memory with DataFusion. It would be good to extend this support to distributed execution with Ballista.

Describe the solution you'd like

I would like an option to run the benchmark in data validation mode when executing against a Ballista cluster.
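As a rough illustration of what a validation mode could check, here is a minimal sketch that compares a query's actual rows against expected rows, with a relative tolerance for floating-point columns. This is an assumption about the shape of the check, not the existing benchmark code: `rows_match`, `values_match`, and the tuple-based row format are hypothetical names chosen for the example, and it assumes both result sets are already in the same order.

```python
import math


def values_match(expected, actual, rel_tol):
    """Compare two cell values, allowing a small relative error for floats."""
    if isinstance(expected, float) and isinstance(actual, float):
        return math.isclose(expected, actual, rel_tol=rel_tol)
    return expected == actual


def rows_match(expected_rows, actual_rows, rel_tol=1e-9):
    """Return True if both result sets hold equivalent rows in the same order.

    Hypothetical helper: each row is a tuple of column values.
    """
    if len(expected_rows) != len(actual_rows):
        return False
    for expected_row, actual_row in zip(expected_rows, actual_rows):
        if len(expected_row) != len(actual_row):
            return False
        for expected, actual in zip(expected_row, actual_row):
            if not values_match(expected, actual, rel_tol):
                return False
    return True


# Example: the same expected answer checked against two candidate results.
expected = [("A", 10, 100.5)]
print(rows_match(expected, [("A", 10, 100.5)]))
print(rows_match(expected, [("A", 10, 101.5)]))
```

The same comparison would apply whether the actual rows come from in-memory execution or from a cluster, which is why a single shared check seems like a reasonable starting point.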

Describe alternatives you've considered

None

Additional context

None

msathis commented 3 years ago

I can take this up ✌️