-
This epic is for improving shuffle / ScanExec performance.
## Issues
- https://github.com/apache/datafusion-comet/issues/1125
- https://github.com/apache/datafusion-comet/issues/1115
## Cont…
-
See performance numbers at https://www.notion.so/risingwave-labs/TPCH-Performance-Numbers-Table-e098ef82884546949333409f0513ada7?pvs=4#8de0bf4bda51444c8381f3b0c10ddfe1
- [ ] #15034
- [x] #14811
…
lmatz updated
5 months ago
-
## Describe the bug
This should be investigated to make sure the results are correct/expected. ODBC returns less precise results for two queries compared to the Spark delta_lake connection.
https://github.…
-
Hi, I noticed zksql while looking through the paper recently and reviewed TPC-H based on the code in the paper. I found that performing a join operation, which was not mentioned in the …
-
### Describe the bug
I am running a modified version of TPC-H query 10. I've removed the filters to stress Comet and see how it behaves when processing large amounts of data:
```sql
-- SQLBench-H …
```
-
**Is your feature request related to a problem?**
We need a comprehensive testing framework to validate **PPL** commands in the **Spark** environment, ensuring that each new PPL (Spark) release me…
-
Create a new workload that is based on TPC-H
* I need to list the changes required to adapt it to NoSE.
-
In the TPC-H benchmark, Query 1 is as follows.
```sql
select
    l_returnflag,
    l_linestatus,
    sum(l_quantity) as sum_qty,
    sum(l_extendedprice) as sum_base_price,
    sum(l_extendedprice*(1-l_discount)) as sum…
```
-
Reproduce the TPC-H work that @rjzamora has been looking into with Dask+cuDF and the TPC-H work Coiled has been doing with Dask but on Databricks.
https://tpch.coiled.io/
-
In the TPC-H benchmark, when creating 'lineitem' and 'orders' as distributed tables and the remaining tables as reference tables, the following queries are unsupported:
- Q13 outer join with re-par…