Open mrocklin opened 1 year ago
Queries 1-7 were available so I could just port them over without re-implementing anything.
It's on my todo list to look through the remaining queries to figure out whether any of them stresses something that we don't have covered yet.
I can imagine this being helpful to see if we're over-tuning or not. If we find that we do really well on 1-7, but do really poorly on the rest, then that's a sign that our results aren't representative and that people shouldn't trust us.
On the other hand, if we implement the new queries and find that results are similar to what we've seen, then that's good evidence that the benchmarks we have are representative, at least of the class of queries covered by TPC-H.
I agree
@milesgranger if you have time tomorrow can I ask you to bring query 8 over from Polars into the system we have here?
I think @mrocklin also wanted a Dask implementation? That's what I understood when we chatted yesterday
Yeah, ideally we'd have coverage for all of the projects supported here: Dask, DuckDB, Polars, and Spark.
> bring query 8 over from Polars
I took that too literally. :sweat_smile:
This is a genuine question, not a suggestion or a request for work. Why did we choose to focus on these seven queries? Are they special in some way?
@phofl I suspect that this question is mostly for you.