Closed: tobymao closed this 2 years ago
i can add that in the future. execution is still in alpha
@mdboom do you need anything from me in order to land these changes?
This looks fine to me, but I don't have merge rights. Maybe @ericsnowcurrently can have a look.
FWIW, there are some extra things to consider as we work on building good benchmark suites:
Relative to this benchmark specifically:
(I'm sure we'll merge it in regardless of the answers.)
Another thing to consider is that the sqlglot project should probably have this benchmark as part of its own suite (in its own repo), regardless of its inclusion in the pyperformance suite.
@ericsnowcurrently ready for another look.
and sure, i can add these benchmarks to sqlglot's own suite
Relative to this benchmark specifically:
- it feels like an in-between one (not quite a macro-benchmark but more complex than a micro-benchmark)
- could it be made to represent a full Python workload more closely (or be integrated into such a benchmark)?
- what workloads would it represent or be a part of?
- how much coverage of those workloads is already in the pyperformance suite?
- how should this benchmark be categorized/tagged?
(I'm sure we'll merge it in regardless of the answers.)
in terms of workflows, it represents a good chunk, in that people want to parse many sql queries (data engineering / analytics). the normalizer also represents mutation of queries, which is another kind of macro workflow. some companies use sqlglot to parse tens of thousands of sql queries to extract metadata.
sqlglot has a prototype engine which could represent more macro workflows, but it's not quite ready yet and not something i want to expose at this point.
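For reference, the "parse many sql queries" workload described above could be timed roughly like this. This is a minimal sketch, not the benchmark in this PR: `bench_parse_many` and `QUERIES` are hypothetical names, the queries are made up, and the fallback `parse` is only a stand-in so the snippet runs when sqlglot is not installed. The `(loops, *args) -> elapsed seconds` shape is chosen to match what pyperf's `Runner.bench_time_func` expects from a timing function.

```python
import time


def bench_parse_many(loops, parse, queries):
    """Run `loops` passes over `queries`, calling `parse` on each one.

    Returns the total elapsed wall-clock time in seconds.
    """
    start = time.perf_counter()
    for _ in range(loops):
        for sql in queries:
            parse(sql)
    return time.perf_counter() - start


try:
    # Real parser, if sqlglot is available in the environment.
    from sqlglot import parse_one as parse
except ImportError:
    # Stand-in (assumption): NOT a SQL parser, just keeps the sketch runnable.
    def parse(sql):
        return sql.split()


# Hypothetical sample queries; a real suite would use a larger, fixed corpus.
QUERIES = [
    "SELECT a, b FROM t WHERE a > 1",
    "SELECT x.id, COUNT(*) AS n FROM x JOIN y ON x.id = y.id GROUP BY x.id",
]

elapsed = bench_parse_many(10, parse, QUERIES)
print(f"10 passes over {len(QUERIES)} queries: {elapsed:.6f}s")
```

With pyperf installed, the same function could be registered as `runner.bench_time_func("sqlglot_parse", bench_parse_many, parse, QUERIES)` on a `pyperf.Runner()`, which supplies `loops` itself.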
Thanks for the benchmark!
sqlglot is a pure python sql parser, transpiler, and optimizer