-
I cannot get benchmarks running in k8s. I suspect that too many tasks are being scheduled in parallel.
I added resource constraints in the code:
```python
@ray.remote(num_cpus=1)
def execute_q…
-
After addressing #28, we are ready to run the datafusion-ray tests on a Kubernetes cluster in CI/CD.
-
**Summary**
```
CREATE TABLE IF NOT EXISTS nation (
    n_nationkey INTEGER NOT NULL,
    n_name STRING NOT NULL,
    n_regionkey INTEGER NOT NULL,
    n_comment STRING
);
COPY INTO nat…
-
In order to be able to run the TPC-H queries ([R versions here](https://github.com/voltrondata-labs/arrowbench/blob/main/R/tpch-queries.R)), the following tasks need to be completed:
**Function bin…
-
We're still using [SchemaReader](https://github.com/cmu-db/terrier/blob/master/util/include/execution/table_generator/schema_reader.h) for TPC-H benchmarks. Apart from the format being custom and the …
-
After encountering a suboptimal plan for query 6 (as described in #46677), I decided to run all TPC-H queries *without* automatic stats present and *with* automatic stats present. All queries ran on a…
-
### Problem description
This is in reference to a discussion @ritchie46 and I were having on Discord regarding the TPC-H benchmarks.
The idea is the following:
- Implement all 22 TPC-H queries …
-
Our Parquet performance is bad. I get 20 MB/s in real-world use cases on the cloud, where I would expect 500 MB/s. This accounts for ~80% of our runtime in complex dataframe queries in TPC-H. Systems…
-
I am using macOS Catalina.
/tpc-ds-datagen-to-aws-s3/tpc-ds/v2.11.0rc2/tools$ make …
-
### What is the problem the feature request solves?
We do not currently support RangePartitioning with native shuffle.
Adding this support would allow us to use native shuffle for more queries, in…
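
For readers unfamiliar with the term, here is a minimal sketch of what range partitioning does in general: rows are routed to partitions by comparing a key against sorted boundary values, so that partition `i` only holds keys smaller than those in partition `i + 1`. This is an illustration in plain Python, not the project's native shuffle code; the helper names `compute_bounds` and `range_partition`, and the way boundaries are picked from a sample, are assumptions made for the example.

```python
from bisect import bisect_right


def compute_bounds(sample, num_partitions):
    """Pick num_partitions - 1 split points from a sample of keys (hypothetical helper)."""
    ordered = sorted(sample)
    if not ordered or num_partitions <= 1:
        return []
    step = len(ordered) / num_partitions
    return [
        ordered[min(int(step * i), len(ordered) - 1)]
        for i in range(1, num_partitions)
    ]


def range_partition(rows, key, bounds):
    """Route each row to the partition whose key range contains key(row)."""
    partitions = [[] for _ in range(len(bounds) + 1)]
    for row in rows:
        # bisect_right returns how many boundaries are <= the key,
        # which is the index of the target partition.
        partitions[bisect_right(bounds, key(row))].append(row)
    return partitions


rows = [{"id": v} for v in [7, 1, 9, 3, 5, 8, 2, 6, 4, 0]]
bounds = compute_bounds([r["id"] for r in rows], 3)
parts = range_partition(rows, lambda r: r["id"], bounds)
```

Because the partitions are ordered relative to each other, sorting each one locally yields a globally sorted result, which is why this scheme typically backs distributed sorts.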