Closed universalmind303 closed 4 weeks ago
Some benchmarks using the TPC-H scale factor 5 "customer" table, with polars included to give a point of reference.
import daft
import polars as pl

# polars with projection
pl.scan_ndjson('./customer.json').select("c_mktsegment").collect()
# daft with projection
daft.read_json('./customer.json').select("c_mktsegment").collect()
# polars without projection
pl.scan_ndjson('./customer.json').collect()
# daft without projection
daft.read_json('./customer.json').collect()
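For anyone wanting to reproduce timings of this shape outside a notebook, here is a minimal sketch of a harness that reports mean ± std. dev. per loop over 7 runs of 10 loops each. The `bench` helper and `stand_in_workload` are hypothetical names, not part of daft or polars, and the use of a population standard deviation is an assumption about how to summarize the runs.

```python
import statistics
import timeit


def bench(fn, repeat=7, number=10):
    """Time fn and return (mean_ms, std_ms) per call.

    Hypothetical helper: `repeat` runs of `number` loops each,
    summarized as a per-call mean and population std. dev. in ms.
    """
    totals = timeit.repeat(fn, repeat=repeat, number=number)
    per_call_ms = [total / number * 1000 for total in totals]
    return statistics.mean(per_call_ms), statistics.pstdev(per_call_ms)


def stand_in_workload():
    # Placeholder so the sketch runs without the dataset; swap in one of
    # the queries above, e.g.
    #   lambda: daft.read_json('./customer.json').collect()
    return sum(range(10_000))


mean_ms, std_ms = bench(stand_in_workload)
print(f"{mean_ms:.3g} ms ± {std_ms:.3g} ms per loop")
```

Passing each of the four queries as a zero-argument callable keeps the timed region limited to the read and collect, so the measurements stay comparable across libraries.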
# polars (with projection)
# 76.8 ms ± 1.88 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
# daft (with projection)
# 116 ms ± 1.39 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
# daft main (with projection)
# 181 ms ± 5.84 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
# polars (without projection)
# 89 ms ± 2.03 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
# daft (without projection)
# 169 ms ± 2.38 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
# daft main (without projection)
# 247 ms ± 6.92 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
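To put the numbers above in relative terms, the following sketch computes ratios between the reported means. The figures are copied from the results above; the dictionary layout and the `ratio` helper are illustrative, and reading "daft" as this branch and "daft main" as the main branch is an assumption.

```python
# Mean time per loop in milliseconds, copied from the results above.
results_ms = {
    "with projection": {"polars": 76.8, "daft": 116.0, "daft main": 181.0},
    "without projection": {"polars": 89.0, "daft": 169.0, "daft main": 247.0},
}


def ratio(slower_ms, faster_ms):
    """How many times longer the slower run takes than the faster one."""
    return slower_ms / faster_ms


for case, timings in results_ms.items():
    vs_main = ratio(timings["daft main"], timings["daft"])
    vs_polars = ratio(timings["daft"], timings["polars"])
    print(f"{case}: daft main / daft = {vs_main:.2f}x, "
          f"daft / polars = {vs_polars:.2f}x")
```

These ratios compare means only and ignore the reported standard deviations, so they are a rough guide rather than a significance test.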
assigning @clarkzinzow to take a look!
closes https://github.com/Eventual-Inc/Daft/issues/2196